R is one of the many languages used by the data science community to perform data manipulation, statistical modeling and machine learning. R was designed by statisticians for statistical computing.
For most users, it is recommended to download the current stable release from https://cloud.r-project.org/.
Some developers might wish to use a different version, or to switch between versions. For this, the rvenv package can be useful.
R is also available for use in Brown's Computing Environments:
Oscar (for high-performance computing)
Stronghold (for secure computing)
Download and install the latest version of The R Project for Statistical Computing for macOS.
For an integrated development environment (IDE) with a graphical interface, you can also download and install RStudio.
R comes with a full-featured interactive command-line REPL (read-eval-print loop) built into the R executable. In addition to allowing quick and easy evaluation of R statements, it has a searchable history, tab-completion, many helpful keybindings, and built-in help pages opened with ?.
This page provides examples of using the REPL on the command line.
Type "module load r" in the terminal to load the R module, then on a new line type "R" to launch R.
Inside R, q() quits the R session and returns you to the terminal.
Type ?function or help(function) to open help pages within R's REPL.
For example, to ask for help with fitting linear models in R, use help(lm).
"Hello, World!" is the typical first program for those new to a programming language. It can be used to test that the installation of R is working and to introduce R's basic syntax, either by using the REPL environment or by running code written in a text editor at the command line.
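As a minimal sketch of this kind of first program, the snippet below stores a greeting and prints it. The file name hello.R and the variable name greeting are illustrative assumptions, not names used elsewhere on this page.

```r
# hello.R: a minimal first program
greeting <- "Hello, World!"   # store the text in a variable
print(greeting)               # prints [1] "Hello, World!"
```

Typed into the REPL, each line runs as soon as you press Enter. Saved as hello.R, the same code could be run from a terminal with Rscript hello.R.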
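Unlike many other languages, R does not require a print statement to display a value at the top level of the REPL or a script, but it does allow one. To print, you can simply write the expression on its own line, or wrap it in a print() call. Inside loops and functions, values are not printed automatically, so use print() there.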
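The difference between automatic and explicit printing can be sketched as follows. The variable names are made up for illustration.

```r
x <- 5 * 2

x          # at the top level, typing a name prints its value
print(x)   # explicit printing shows the same value

for (i in 1:2) {
  i          # not shown: nothing is auto-printed inside a loop
  print(i)   # shown: explicit print() is needed here
}
```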
We can write comments in our code, which do not run, to describe what certain lines or sections of code do. These comments are just for the programmer: they will not appear anywhere in the output and simply explain what the code is doing or provide helpful notes.
To comment in R, use the "#" symbol and type your comment after it on the same line.
R has no syntax for multi-line comments, so each line that is commented out needs a "#" symbol at the beginning.
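A short sketch of both comment styles, using a made-up circle-area calculation:

```r
# Compute the area of a circle (this whole line is a comment)
radius <- 3
area <- pi * radius^2   # a comment can also follow code on the same line

# R has no block-comment syntax, so a multi-line comment
# is written by starting every line with "#".
# print("this line is commented out and will not run")
print(area)
```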
Assignment operators:
Operator | Description | Example |
---|---|---|
<- or = or <<- | Left assignment | x <- 7, x = 7, x <<- 7 |
-> or ->> | Right assignment | 7 -> x, 7 ->> x |
Data types:
Type | Example |
---|---|
Logical | TRUE, FALSE |
Numeric | 1, 55, 999 |
Integer | 1L, 32L, 0L |
Complex | 2 + 3i |
Character | "great", "23.4" |
Arithmetic operators:
Operator | Description |
---|---|
+ | Addition |
- | Subtraction |
* | Multiplication |
/ | Division |
^ or ** | Power (exponent) |
%% | Remainder (modulo) |
Comparison and logical operators:
Operator | Description |
---|---|
> | Greater than |
< | Less than |
>= | Greater than or equal to |
<= | Less than or equal to |
== | Exactly equal to |
!= | Not equal to |
& | Element-wise AND |
! | Logical negation, for example !x |
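The operators and data types listed above can be tried out with a few lines like these. All variable names are illustrative.

```r
x <- 7    # left assignment
7 -> y    # right assignment: the value goes on the left, the name on the right

# class() reports the data type of a value
types <- c(class(TRUE), class(55), class(1L), class(2 + 3i), class("23.4"))
types     # "logical" "numeric" "integer" "complex" "character"

remainder <- 7 %% 2   # 1
power     <- 2^3      # 8
quotient  <- 7 / 2    # 3.5

same      <- x == y                           # TRUE
different <- x != 3                           # TRUE
both      <- c(TRUE, FALSE) & c(TRUE, TRUE)   # TRUE FALSE (element by element)
negated   <- !TRUE                            # FALSE
```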
In computer programming, a package is a collection of modules or programs that are often published as tools for a range of common use cases, such as text processing and doing math. Programmers can install these packages and take advantage of their functionality within their own code.
This page includes instructions for installing packages in R and a description of some of R's most frequently used packages.
To install a package in R, you can either:
Use the install.packages("PackageName") function, which downloads the package from CRAN and installs it on your machine
Or, if you are using RStudio, use Tools > Install Packages, enter the package name, and click Install
Once you install the package, you have to load it into your R session using the library(PackageName) function.
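One way to combine the install and load steps is sketched below. The helper name ensure_package is hypothetical, not part of R. It installs a package only when it is missing, so the download is skipped on later runs.

```r
# Hypothetical helper: install a package from CRAN only if it is missing,
# then attach it to the current session.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # needs a network connection and a CRAN mirror
  }
  library(pkg, character.only = TRUE)   # pkg is a string, hence character.only
  invisible(TRUE)
}

ensure_package("tools")    # "tools" ships with R, so nothing is downloaded
# ensure_package("dplyr")  # would download dplyr from CRAN on first use
```

Installation is needed once per machine, while library() must be called in every new R session.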
In R, tidyverse is one of the most popular packages. It is a collection of packages used for data science, such as:
ggplot2, used to create graphics and data visualizations
dplyr, which contains functions used for data manipulation, like mutate() and filter()
tidyr, used for data organization and cleaning
tibble, a modern version of the data frame with cleaner printing
readxl, used to read Excel files in .xls and .xlsx format into R (installed with the tidyverse, but loaded separately with library(readxl))
R Documentation: Packages
When coding in R, you will often need to load datasets to work with. The easiest sources are a .csv file or a .txt file, which you can read with the read.csv() and read_table() functions, respectively. Note that read.csv() is part of base R, while read_table() comes from the readr package (the base R equivalent is read.table()). The following demonstrates these functions using a hypothetical "hospital_data" dataset.
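A self-contained sketch of reading both file types is shown here. The column names and values are made up, the files are written to temporary paths so the code runs anywhere, and base R's read.table() is used for the text file so that no extra package is assumed.

```r
# Write a small, made-up dataset to a temporary .csv file
csv_path <- tempfile(fileext = ".csv")
writeLines(c("patient_id,age,ward",
             "1,34,A",
             "2,58,B",
             "3,41,A"), csv_path)

hospital_data <- read.csv(csv_path)   # first row becomes the column names
str(hospital_data)                    # inspect columns and types

# The same data as a whitespace-separated .txt file
txt_path <- tempfile(fileext = ".txt")
writeLines(c("patient_id age ward",
             "1 34 A",
             "2 58 B",
             "3 41 A"), txt_path)

hospital_txt <- read.table(txt_path, header = TRUE)
```

With a real file, the temporary path would be replaced by the file's location, for example read.csv("hospital_data.csv").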
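To send R's console output to a file, use the syntax sink("FileName.FileType"), and call sink() with no arguments to return output to the console. To save a data frame itself, use write.csv().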
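A minimal sketch of sink() in use, assuming a temporary file as the destination. Everything printed between the two sink() calls goes to the file instead of the console.

```r
out_path <- tempfile(fileext = ".txt")

sink(out_path)                    # start diverting console output to the file
print("Saved to file")
cat("Total patients:", 3, "\n")
sink()                            # stop diverting; output returns to the console

output_lines <- readLines(out_path)   # read the file back to check its contents
output_lines
```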
R Documentation: read.csv file input
More read.csv resources here
R Documentation: read_table file input
R Documentation: File output
Conditional statements are used to test if a specific case is true or false.
Short-circuit evaluation:
&&: test if all conditions are true
||: test if any condition is true
!: test if a condition is not true
If statement: run code if this statement is true
Only used at the beginning of a conditional statement
Else if statement: if previous statements aren't true, try this
Can be used an unlimited number of times in an if statement
Else statement: catch-all for anything outside of prior statements
Only used to end a conditional statement
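The pieces above fit together as in this sketch. The variables temperature and has_cough hold made-up values.

```r
temperature <- 38.2
has_cough <- TRUE

if (temperature >= 38 && has_cough) {          # both conditions must be true
  status <- "fever and cough"
} else if (temperature >= 38 || has_cough) {   # at least one must be true
  status <- "one symptom"
} else {                                       # catch-all
  status <- "no symptoms"
}
print(status)   # prints [1] "fever and cough"

# Short-circuiting: the right-hand side is skipped once the result is known,
# so stop() is never called in either line.
skipped_and <- FALSE && stop("never evaluated")
skipped_or  <- TRUE || stop("never evaluated")

no_cough <- !has_cough   # FALSE
```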
A loop repeats a block of code a specified number of times or until some condition is met
While loop
For loop
Use break to terminate loop
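A short sketch of both loop types and of break, with illustrative variable names:

```r
# While loop: repeat until the condition becomes false
count <- 0
while (count < 3) {
  count <- count + 1
}
count   # 3

# For loop: repeat once per element of 1:10, but stop early with break
total <- 0
for (i in 1:10) {
  if (i > 4) {
    break               # leave the loop as soon as i exceeds 4
  }
  total <- total + i
}
total   # 10, the sum of 1 to 4
```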
R Documentation: Conditional Execution
R Documentation: Repetitive Execution
Lists in R are ordered collections of data that can be of different classes.
String operations:
Action | Syntax |
---|---|
Get string length | nchar(string) |
Combine two strings | str_c(string1, string2) from the stringr package, or paste0(string1, string2) in base R |
Sort a vector of strings | sort(c(string1, string2, string3)) |
Search for a substring within a vector of strings | grep(pattern, string) |
Replace the first match within a string | sub(pattern, replacement, string) |
Replace all matches within a string | gsub(pattern, replacement, string) |
Test whether a string contains a match (TRUE or FALSE) | grepl(pattern, string) |
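The string functions can be tried on a small example like the one below. It uses base R's paste0() in place of stringr's str_c() so that no extra package is assumed, and the sample strings are made up.

```r
s <- "hospital data"

n_chars  <- nchar(s)                            # 13
combined <- paste0("hospital", "_data")         # "hospital_data"
sorted   <- sort(c("pear", "apple", "fig"))     # "apple" "fig" "pear"

where    <- grep("data", c(s, "clinic"))        # 1: index of the matching element
found    <- grepl("data", s)                    # TRUE

first    <- sub("a", "A", s)                    # "hospitAl data"
every    <- gsub("a", "A", s)                   # "hospitAl dAtA"
```

Note that grep() returns the positions of matching elements in a vector, while grepl() returns TRUE or FALSE for each element.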
List operations:
Action | Syntax |
---|---|
New list (empty) | listname <- list() |
New list (misc) | listname <- list(1L, "abc", 10.3) |
Access an element | listname[[position]] |
Change a value | listname[[position]] <- newvalue |
See number of values in a list | length(listname) |
See if item is present in a list | item %in% listname |
Add item to the end of a list | listname <- append(listname, item) |
Add item to a list at a specific position | listname <- append(listname, item, after = index) |
Remove item from list | newlist <- listname[-index] |
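A sketch of the list operations in sequence, using a made-up list named patient. Double brackets are used to get or set a single element, since single brackets return a shorter list rather than the element itself.

```r
patient <- list(1L, "abc", 10.3)   # elements of different classes

second <- patient[[2]]             # "abc"
patient[[2]] <- "xyz"              # change a value
n_items <- length(patient)         # 3
has_xyz <- "xyz" %in% patient      # TRUE

patient <- append(patient, "end")                # add to the end
patient <- append(patient, "start", after = 0)   # after = 0 inserts at the front
shorter <- patient[-1]                           # drop the first element
```

append() returns a new list rather than changing the original, so the result has to be assigned back to a name.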
Matrix operations:
Action | Syntax |
---|---|
New matrix (empty) | matrixname <- matrix() |
New matrix (numbers) | matrixname <- matrix(data, nrow=, ncol=) |
New matrix (strings) | matrixname <- matrix(data, nrow=, ncol=) |
Access a matrix element | matrix[row position, column position] |
Access an entire row | matrix[row position,] |
Access an entire column | matrix[,column position] |
Create an additional row | rbind(matrix, values for new row) |
Create an additional column | cbind(matrix, values for new column) |
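The matrix operations can be sketched with a small numeric matrix. The names m, m_rows, m_cols, and words are illustrative.

```r
m <- matrix(1:6, nrow = 2, ncol = 3)   # values fill column by column
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

element <- m[1, 2]    # 3
row_two <- m[2, ]     # 2 4 6
col_thr <- m[, 3]     # 5 6

m_rows <- rbind(m, c(7, 8, 9))    # adds a third row, giving 3 x 3
m_cols <- cbind(m, c(10, 11))     # adds a fourth column, giving 2 x 4

words <- matrix(c("a", "b", "c", "d"), nrow = 2, ncol = 2)   # a matrix of strings
```

To fill a matrix row by row instead, matrix() accepts byrow = TRUE.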
Array operations:
Action | Syntax |
---|---|
New array (empty) | arrayname <- array() |
New array (numbers) | arrayname <- array(data, dim = c(nrow, ncol, nlayers)) |
New array (strings) | arrayname <- array(data, dim = c(nrow, ncol, nlayers)) |
Access an array element | array[row position, column position, layer position] |
Check if an item exists | value %in% array |
Sort array increasing | sort(array) |
Sort array decreasing | sort(array, decreasing = TRUE) |
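A sketch of the array operations on a three-dimensional example. The dimensions are passed as a vector through the dim argument, and the names a and b are illustrative.

```r
a <- array(1:24, dim = c(2, 3, 4))   # 2 rows, 3 columns, 4 layers

element <- a[1, 2, 3]    # 15: row 1, column 2, layer 3
present <- 20 %in% a     # TRUE
absent  <- 99 %in% a     # FALSE

b <- array(c(3, 1, 2, 4), dim = c(2, 2, 1))
ascending  <- sort(b)                      # 1 2 3 4
descending <- sort(b, decreasing = TRUE)   # 4 3 2 1
```

sort() returns a plain vector, so the array's dimensions are not kept in the sorted result.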