R Basics

Basic Objects

Vectors (Classes)

Type        Example
Logical     TRUE, FALSE
Numeric     1, 55, 999
Integer     1L, 32L, 0L
Complex     2 + 3i
Character   "great", "23.4"

Creating a vector & print statement example

apple <- c('red','green','yellow')

print(apple)

# Get the class of the vector
print(class(apple))

Output:

[1] "red"    "green"  "yellow"
[1] "character"

Anchors

Anchor                Special Character   Example
Beginning of line     ^                   ^New
End of line           $                   y$
Beginning of string   \\A                 \\AHello
End of string         \\Z or \\z          End\\Z
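
The anchors above can be tried with grepl(). This is a minimal sketch; the sample strings are made up for illustration, and the \\A and \\Z patterns are run with perl = TRUE (PCRE) as an assumption about which regex engine is in use.

```r
# Line anchors with R's default regex engine
starts_with_new <- grepl("^New", c("New York", "Brand New"))
ends_with_y     <- grepl("y$", c("happy", "yes"))

# String anchors, run here with the PCRE engine (perl = TRUE)
text <- "Hello\nWorld"
string_start <- grepl("\\AHello", text, perl = TRUE)
string_end   <- grepl("World\\Z", text, perl = TRUE)

# In multiline mode (?m), ^ also matches right after a newline,
# while \A still matches only at the very start of the string.
line_start_world   <- grepl("(?m)^World", text, perl = TRUE)
string_start_world <- grepl("\\AWorld", text, perl = TRUE)

print(c(line_start_world, string_start_world))
```

The last two calls show the practical difference between a line anchor and a string anchor on multi-line input.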

Lists

Can contain elements of different types (for example numbers, strings, vectors, or other lists)
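
A minimal sketch of a list holding mixed types; the element values are arbitrary examples.

```r
# Each element keeps its own type
my_list <- list(c(2, 5, 3), 21.3, "hello", TRUE)
print(my_list)

# Check the class of every element
element_classes <- sapply(my_list, class)
print(element_classes)
```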


Matrices

Two-dimensional rectangular data set
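
A short illustrative sketch of building a matrix; the values 1 to 6 are arbitrary.

```r
# Fill a 2 x 3 matrix row by row
m <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
print(m)

# Rows and columns are indexed as m[row, column]
second_row_first_col <- m[2, 1]
print(dim(m))
```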


Arrays

Multi-dimensional data set
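
As a sketch, an array generalizes a matrix to more than two dimensions; the shape 2 x 3 x 4 below is an arbitrary choice.

```r
# 2 rows, 3 columns, 4 layers, filled in column-major order
a <- array(1:24, dim = c(2, 3, 4))
print(dim(a))

# Element in row 1, column 2 of the third layer
value <- a[1, 2, 3]
print(value)
```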


Factors

Store a vector together with its distinct values (levels), which serve as labels
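
A minimal sketch of creating a factor from a character vector; the color values are illustrative.

```r
colors <- c("green", "red", "green", "yellow", "red")
f <- factor(colors)
print(f)

# Distinct values, sorted alphabetically by default
print(levels(f))

# Count of observations per level
print(table(f))
```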


Data Frames

Tabular data objects
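
A small sketch of a data frame where each column has its own type; the column names and values are hypothetical.

```r
df <- data.frame(
  name     = c("Ann", "Bob", "Cy"),
  age      = c(28L, 35L, 42L),
  employed = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)
print(df)

# Columns are accessed with $ and rows with [row, ]
print(df$age)
print(df[2, ])
```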


Loops

Repeat Loop
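
A minimal sketch: repeat runs until break is called, so the stopping condition goes inside the body. The counter and limit are illustrative.

```r
count <- 0
repeat {
  count <- count + 1
  print(count)
  # Without this break the loop would never end
  if (count >= 3) break
}
```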

While Loop
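
A minimal sketch: while checks its condition before every iteration. Summing 1 to 5 is just an example task.

```r
i <- 1
total <- 0
while (i <= 5) {
  total <- total + i
  i <- i + 1
}
print(total)
```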

For Loop
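
A minimal sketch: for iterates over the elements of a vector. The result vector is preallocated, which is a common habit rather than a requirement.

```r
squares <- numeric(5)
for (i in seq_along(squares)) {
  squares[i] <- i^2
}
print(squares)
```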

Working with Data

Useful Packages

Task              Packages
Load data         utils, openxlsx, foreign, haven
Manipulate data   tidyverse, dplyr, tidyr
Visualize data    ggplot2, lattice, plotly
Modeling          glmnet, randomForest, caret, survival

Import Data

The following examples use the datasets package, which ships with R.
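
A minimal sketch of loading a built-in dataset. The CSV round trip through a temporary file is an added illustration of read.csv() from utils, not something the original notes specify.

```r
library(datasets)

# Load a built-in dataset into the workspace
data(mtcars)
print(dim(mtcars))

# Illustration only: write the data to a temporary CSV and read it back
csv_path <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_path, row.names = TRUE)
cars_from_csv <- read.csv(csv_path, row.names = 1)
print(dim(cars_from_csv))
```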

Data Exploration

Use the mtcars dataset from the datasets package

  • str(data): gives a quick overview of the rows and columns of the dataset.
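
For instance, applied to mtcars (a sketch; the captured text is stored only so the first line can be inspected):

```r
# Prints dimensions plus the type and first values of each column
str(mtcars)

# The same overview captured as text
overview <- capture.output(str(mtcars))
print(overview[1])
```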


  • head(data, n) and tail(data, n)

head(): first n rows

tail(): last n rows
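
A short sketch on mtcars; n = 3 is an arbitrary choice (both functions default to 6).

```r
first_rows <- head(mtcars, 3)
last_rows  <- tail(mtcars, 3)

print(first_rows)
print(last_rows)
```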


Descriptive Statistics

  • summary(data): gives descriptive statistics for each variable
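
As a sketch, summary() works on a whole data frame or on a single column; the mpg column is used here as an example.

```r
# Minimum, quartiles, median, mean and maximum for every column
summary(mtcars)

# The same statistics for one numeric column
mpg_summary <- summary(mtcars$mpg)
print(mpg_summary)
```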

Common Functions

Task                  Function
Mean                  mean()
Standard deviation    sd()
Variance              var()
Minimum               min()
Maximum               max()
Median                median()
Range of values       range()
Sample quantiles      quantile()
Interquartile range   IQR()
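
The functions in the table can be applied to a numeric vector. A minimal sketch using mtcars$mpg; collecting the results in a list is only for compact display.

```r
mpg <- mtcars$mpg

stats <- list(
  mean      = mean(mpg),
  sd        = sd(mpg),
  variance  = var(mpg),
  minimum   = min(mpg),
  maximum   = max(mpg),
  median    = median(mpg),
  range     = range(mpg),                               # c(min, max)
  quartiles = quantile(mpg, probs = c(0.25, 0.5, 0.75)),
  iqr       = IQR(mpg)                                  # 75% minus 25% quantile
)
str(stats)
```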

Handling missing values

  • na.rm = TRUE: removes NA values before the statistic is computed
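
A minimal sketch of the effect of na.rm; the vector x is made up. Spelling out TRUE is generally safer than T, because T is an ordinary variable that can be reassigned.

```r
x <- c(4, 8, NA, 10)

# A single NA makes the result NA by default
mean_with_na <- mean(x)

# Drop missing values before computing
mean_without_na <- mean(x, na.rm = TRUE)

print(mean_with_na)
print(mean_without_na)
```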


Basic Plots

  • plot()

  • barplot()
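
A sketch of a scatter plot and a bar plot on mtcars; the chosen columns, titles, and axis labels are illustrative.

```r
# Scatter plot of two numeric variables
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "mpg vs. weight")

# Bar plot of counts per category
cyl_counts <- table(mtcars$cyl)
bar_midpoints <- barplot(cyl_counts,
                         xlab = "Cylinders", ylab = "Number of cars",
                         main = "Cars by cylinder count")
```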


  • hist()
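
A sketch of a histogram using base R's hist(); the breaks value is a suggestion that R may adjust.

```r
h <- hist(mtcars$mpg, breaks = 10,
          xlab = "Miles per gallon", main = "Distribution of mpg")

# The returned object holds the bin edges and counts
print(h$breaks)
print(h$counts)
```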


  • boxplot()
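
A sketch of grouped box plots using the formula interface; grouping mpg by cyl is an illustrative choice.

```r
b <- boxplot(mpg ~ cyl, data = mtcars,
             xlab = "Cylinders", ylab = "Miles per gallon",
             main = "mpg by cylinder count")

# One column of five-number statistics per group
print(b$stats)
```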


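  • qqnorm(): plots sample quantiles against theoretical normal quantiles to check whether the data are normally distributed; qqplot() compares the quantiles of two samples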

  • qqline(): adds a reference line
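
A minimal sketch of a normal Q-Q plot with its reference line, using mtcars$mpg as example data. Points lying close to the line suggest approximate normality.

```r
qq <- qqnorm(mtcars$mpg, main = "Normal Q-Q plot of mpg")
qqline(mtcars$mpg)

# qqnorm() returns the plotted coordinates
print(head(qq$x))
```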


Statistical Analysis

Analysis               Continuous Outcome (Y)   Binary Outcome (Y)
Correlation Analysis
  X: Continuous        cor.test()               t.test()
  X: Categorical       t.test(), aov()          chisq.test()
Regression Model       lm()                     glm()
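
The table can be sketched on mtcars as follows. Treating am as the binary outcome and cyl as a categorical predictor is an illustrative assumption, and aov() is used for the one-way ANOVA.

```r
# Continuous X, continuous Y: correlation test
ct <- cor.test(mtcars$wt, mtcars$mpg)

# Continuous X, binary Y: compare X between the two outcome groups
tt_binary <- t.test(wt ~ am, data = mtcars)

# Categorical X (two groups), continuous Y: two-sample t-test
tt_groups <- t.test(mpg ~ am, data = mtcars)

# Categorical X (three groups), continuous Y: one-way ANOVA
av <- aov(mpg ~ factor(cyl), data = mtcars)
print(summary(av))

# Categorical X, binary Y: chi-squared test on the contingency table.
# Some expected counts are small here, so R's approximation warning is suppressed.
chi <- suppressWarnings(chisq.test(table(mtcars$cyl, mtcars$am)))

# Regression: linear for continuous Y, logistic for binary Y
fit_lm  <- lm(mpg ~ wt, data = mtcars)
fit_glm <- glm(am ~ wt, data = mtcars, family = binomial)

print(ct$estimate)
print(coef(fit_lm))
print(coef(fit_glm))
```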
