Data Analysis and Manipulation

Notes:

This page covers much of the same content as the DataFrames R page, but uses the tidyverse's dplyr and tidyr packages rather than base R. You may notice that pipes (%>%) are used more often here. A pipe passes the result of the expression on its left as the first argument to the function on its right, so df %>% summary() is equivalent to summary(df). Pipes are the predominant syntax for more advanced uses of R, particularly in the tidyverse, because they let you chain multiple operations into a single statement that reads from left to right.
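
To illustrate how a pipe chains operations, here is a minimal sketch. The vector name `ages` and the choice of `mean()` and `round()` are only for illustration and are not part of the original page.

```r
library(dplyr)  # attaches the %>% pipe

# Hypothetical example data
ages <- c(68, 54, 49, 28, 36)

# Nested call: has to be read from the inside out
nested <- round(mean(ages), 1)

# Piped call: read left to right; each result becomes the
# first argument of the next function
piped <- ages %>% mean() %>% round(1)

# Both forms compute the same value
identical(nested, piped)
```

The piped form tends to pay off as the chain grows, since each step sits on its own and can be added or removed without rebalancing parentheses.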

Loading tidyverse packages:

The tidyverse packages have to be installed before they can be used. Installation only needs to happen once; after that, load the packages at the top of each script:

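#Install tidyverse (only needs to be run once, not in every session)
install.packages("tidyverse")
#Load tidyverse, which also attaches dplyr and tidyr
library(tidyverse)
#Loading dplyr and tidyr explicitly is optional once tidyverse is loaded
library(dplyr)
library(tidyr)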

Create DataFrame:

Input:

#Create DataFrame
df <- tibble(
  id = 1:5,
  gender = c("F", "M", "F", "M", "F"),
  age = c(68, 54, 49, 28, 36)
  )
df

Output:

Describe DataFrame:

Input:
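
A minimal sketch of how a tibble might be described, assuming the `df` tibble from the Create DataFrame section (recreated here so the block runs on its own). The choice of `glimpse()`, `summary()`, `nrow()` and `ncol()` is an assumption about which descriptions are of interest.

```r
library(dplyr)

# Recreate the example tibble
df <- tibble(
  id = 1:5,
  gender = c("F", "M", "F", "M", "F"),
  age = c(68, 54, 49, 28, 36)
)

# Column names, types, and the first few values of each column
glimpse(df)

# Per-column summary statistics (a base R function, used here with a pipe)
df %>% summary()

# Number of rows and columns
n_rows <- nrow(df)
n_cols <- ncol(df)
```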

Output:

Accessing specific DataFrame subsets:

Input:
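
A minimal sketch of common ways to subset a tibble with dplyr, assuming the `df` tibble from the Create DataFrame section (recreated here so the block runs on its own). The result names such as `women` and `older_men` are hypothetical.

```r
library(dplyr)

# Recreate the example tibble
df <- tibble(
  id = 1:5,
  gender = c("F", "M", "F", "M", "F"),
  age = c(68, 54, 49, 28, 36)
)

# Select columns by name
id_age <- df %>% select(id, age)

# Keep rows that match a condition
women <- df %>% filter(gender == "F")

# Keep rows by position
first_two <- df %>% slice(1:2)

# Extract a single column as a plain vector
ages <- df %>% pull(age)

# Combine steps: men older than 50, showing only id and age
older_men <- df %>%
  filter(gender == "M", age > 50) %>%
  select(id, age)
```

In short, `select()` works on columns, `filter()` and `slice()` work on rows, and `pull()` returns a vector rather than a tibble.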

Output:

Adding Columns:

Input:
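
A minimal sketch of adding columns with `mutate()`, assuming the `df` tibble from the Create DataFrame section (recreated here so the block runs on its own). The new column names `age_in_months` and `age_group`, and the cutoff of 50, are hypothetical.

```r
library(dplyr)

# Recreate the example tibble
df <- tibble(
  id = 1:5,
  gender = c("F", "M", "F", "M", "F"),
  age = c(68, 54, 49, 28, 36)
)

# mutate() returns a new tibble; assign it to keep the new columns
df_new <- df %>%
  mutate(
    # Column computed from an existing column
    age_in_months = age * 12,
    # Column based on a condition
    age_group = if_else(age >= 50, "50+", "under 50")
  )

df_new
```

Note that `df` itself is unchanged; only `df_new` holds the added columns.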

Output:

Transform DataFrame:

Input:
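
A minimal sketch of a few common transformations, assuming that sorting, grouped summaries, and reshaping are the operations of interest, and using the `df` tibble from the Create DataFrame section (recreated here so the block runs on its own). The result names are hypothetical.

```r
library(dplyr)
library(tidyr)

# Recreate the example tibble
df <- tibble(
  id = 1:5,
  gender = c("F", "M", "F", "M", "F"),
  age = c(68, 54, 49, 28, 36)
)

# Sort rows from oldest to youngest
sorted <- df %>% arrange(desc(age))

# Summarise within groups: mean age and count per gender
by_gender <- df %>%
  group_by(gender) %>%
  summarise(mean_age = mean(age), n = n())

# Reshape to wide format: one age column per gender,
# with NA where an id does not belong to that gender
wide <- df %>%
  pivot_wider(names_from = gender, values_from = age)
```

Here dplyr handles the sorting and summarising, while tidyr's `pivot_wider()` changes the shape of the table.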

Output:

Traversing DataFrame (for loops):

Input:
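
A minimal sketch of traversing a tibble with for loops, assuming the `df` tibble from the Create DataFrame section (recreated here so the block runs on its own). The variable names `labels` and `col_types` and the message wording are hypothetical.

```r
library(dplyr)

# Recreate the example tibble
df <- tibble(
  id = 1:5,
  gender = c("F", "M", "F", "M", "F"),
  age = c(68, 54, 49, 28, 36)
)

# Loop over rows by index; seq_len() is safe even for zero-row tibbles
labels <- character(nrow(df))
for (i in seq_len(nrow(df))) {
  labels[i] <- paste0(
    "Participant ", df$id[i], " (", df$gender[i], ") is ",
    df$age[i], " years old"
  )
  print(labels[i])
}

# Loop over columns by name and record each column's type
col_types <- character(0)
for (col in names(df)) {
  col_types[col] <- class(df[[col]])[1]
}
col_types
```

For column-wise calculations, a vectorised `mutate()` is usually shorter and faster than a row loop; loops are most useful when each step has side effects such as printing or writing files.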

Output:
