All major operating systems organize files into hierarchical directories. Understanding these file directory structures is vital when interacting with data files using Unix commands or a programming language.
This page describes file directory structures generally as well as some of the differences between file directory structures within different operating systems.
Directories allow users to group files into an organized structure. They are typically visualized like root systems of trees, the highest level of which is called the "root directory". Subdirectories branch down from the root directory, containing files as well as additional subdirectories.
Directories and files are typically described using the path used to reach them through the directory structure, starting with the root directory. In Linux and macOS, the root directory is indicated as "/" (in Windows, the root of a drive is indicated as "\", usually preceded by a drive letter such as "C:"). An additional "/" (or "\" in Windows) is placed between each object in the path.
For example, looking at Figure 1, File_B1a2 could be described with:
/Directory_B/Directory_B1/Directory_B1a/File_B1a2
All major operating systems also provide users with a graphical user interface, or GUI (often pronounced "gooey"), which allows interaction with software and files through visual icons. If you are not already familiar with accessing files and directories through the command line, you are likely familiar with using a GUI file system. While not the recommended method for interacting with files while programming, the GUI file system can be a useful tool for visualizing a directory structure.
Figure 2 displays the GUI file system for a computer running macOS. Though the GUI directory structure is visualized horizontally, the "root system" is still clearly visible. Using its complete path, the file "medication_data" would be described as:
/Users/<username>/Documents/project_a/data_files/medication_data
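As a minimal sketch of how a path breaks down into the objects separated by "/", the Figure 1 example path can be taken apart with Python's standard pathlib module. The choice of Python and the variable names are assumptions made for illustration only.

```python
from pathlib import PurePosixPath

# Path taken from the Figure 1 example above.
path = PurePosixPath("/Directory_B/Directory_B1/Directory_B1a/File_B1a2")

# Every object in the path, starting with the root directory "/".
parts = path.parts
print(parts)

# The directory that contains the file, and the file name itself.
parent = str(path.parent)
name = path.name
print(parent)
print(name)
```

PurePosixPath only manipulates the text of the path, so the example runs even though these directories do not exist on your computer.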
GitHub is a code hosting platform that allows developers to create, store, manage, and share their code. It uses Git software, providing the distributed version control of Git plus access control, bug tracking, software feature requests, task management, continuous integration, and wikis for every project. Refer to GitHub Docs for additional GitHub documentation and tutorials.
Like other cloud platforms (e.g., Google Docs), GitHub allows users to work on projects together. Please note that code changes must be saved manually; GitHub does not automatically save your work. To save changes, open the Terminal application, navigate to the cloned repository, and run the following commands, replacing "INSERT PROGRESS NOTE" with a brief description of your changes.
git add -A
: stages all of your code changes so that they are included in the next commit.
git commit -m "INSERT PROGRESS NOTE"
: records the staged changes in your local repository along with a note that you and your team can reference later. This note should be brief and informative, describing the purpose of your code changes.
git push
: uploads your committed changes to the GitHub repository.
If multiple users are pushing code changes to your GitHub repository, make sure to retrieve or "pull" these edits before you begin making code changes. To do so, open the Terminal application, navigate to the cloned repository, and run the following command. If you have made any code changes, you will need to save them first for the pull to work.
When you are making code changes, you should run git pull before making any edits. This helps keep your team from encountering "merge conflicts", which can become difficult to troubleshoot. To mitigate merge conflicts, make sure to communicate with your team. Inform your team whenever you push new code changes so that everyone is always working on the most updated version of the code.
Merge conflicts happen when you attempt to merge code branches that have competing commits. They are often caused by users making code changes without pulling first. To resolve a merge conflict, work through the following steps:
Identify the location of the merge conflict.
Manually edit the conflicted file from a single machine, selecting the changes you want to keep in the final merge.
Push the selected changes to GitHub.
All team members should pull the corrected changes from GitHub before continuing to make code changes.
Julia is an open source dynamic programming language for high-level, high-performance numerical computing [1]. Julia provides ease and expressiveness (similar to R, MATLAB, and Python), but also supports general programming [2].
Development of Julia began in 2009, and the first version was released in February 2012. The current version of Julia is 1.11 (as of November 2024).
Learn X in Y Minutes: X=Julia
Programming languages are written using text editor applications. These applications allow users to create and edit free text, which can then be run as programs. Text editors differ in complexity, some including extra functionality for easier, more efficient programming. Text editors with auto-complete suggest common functions or existing variables as the programmer begins to type, which the programmer can then select without needing to finish typing. Some text editors offer options to run individual lines of code or entire programs while editing files.
Available for Mac, Windows, and Linux operating systems
Includes support for debugging, syntax highlighting, auto-complete, and additional user-friendly functionality
Web application text editor, no download necessary
Includes options for interactive output (HTML, images, videos, LaTeX, and custom MIME types), support for big data tools, such as Apache Spark, and options for sharing notebooks with others
Run individual lines of code or entire programs at once
Highly configurable
Included in most UNIX operating systems (e.g., Linux, or MacOS), no download necessary
Write files from the Terminal
Highly configurable
Included in most UNIX operating systems (e.g., Linux, or MacOS), no download necessary, also available for Windows
Wide range of built-in features for text editing, such as syntax highlighting, automatic indentation, and search and replace
Included in most UNIX operating systems (e.g., Linux, or MacOS), no download necessary
Most of the editing commands are displayed at the bottom of the editing screen for easy reference
Unix is a family of operating systems officially trademarked as UNIX®. These operating systems are computing environments that are optimized for multi-tasking across multiple users. The original system was developed by AT&T in 1969 as a text only system. There are many Unix variants or Unix-like systems (e.g. GNU/Linux, Sun Solaris, IBM AIX, and Mac OS X). On Windows, Cygwin is a program that provides a Unix-like environment.
The main components of a Unix operating system include:
Kernel – bridge between hardware (i.e. silicon) and application (i.e. software)
Shell – command line interface to enable user interaction with the system
File System – the organization structure for how files are stored
The Unix file system organizes files and directories into a hierarchical structure like the root system of a tree.
The "root" directory (e.g. "/") is the top of the hierarchy.
Standard directories within the root directory:
/bin and /usr contain commands needed by system administrators and users
/etc contains system-wide configuration files and system databases
/home contains the home directory (~) for each user (in some systems, the home directories may be in a different location such as /users or /Users)
When traversing directories
working directory (.) is the directory that a user currently is in
parent directory (..) is the directory above the working directory
path or pathname specifies where a user is in the file system
full path or absolute path points to the same location regardless of the working directory (i.e., it is written in reference to the root directory)
relative path is the path relative to the working directory
If the working directory is the home directory for bcbi, the full path for the course directory is /home/bcbi/course while the relative path is just course. A schematic of this is below:
If code then becomes the working directory, the full path for the data directory from there is /home/bcbi/course/data while the relative path is ../data. A schematic of this is below:
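Separately from the schematics, the relationship between the full path and the relative path in the second example can be checked in code. This is a minimal sketch using Python's standard posixpath module; the choice of Python and the variable names are assumptions, and the directories do not need to exist because only the path text is manipulated.

```python
import posixpath

# Working directory and target directory from the example above.
working_dir = "/home/bcbi/course/code"
data_dir = "/home/bcbi/course/data"

# Relative path to the data directory, written from the working directory.
relative = posixpath.relpath(data_dir, start=working_dir)
print(relative)

# Going the other way: resolve "../data" against the working directory.
resolved = posixpath.normpath(posixpath.join(working_dir, "../data"))
print(resolved)

# A full (absolute) path always starts at the root directory "/".
is_absolute = posixpath.isabs(resolved)
```

The ".." component moves up from code to its parent directory, course, before descending into data.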
The Unix shell provides a command line interface for interacting with the operating system and is where commands are entered. An example below is a Mac OS X Terminal Shell logged into a RedHat Linux Server as user_name.
The prompt may look different depending on your shell (e.g., Bourne shell [sh], C shell [csh], or Bourne-Again shell [bash])
Default prompts include $ and %
The prompt # typically appears when logged in as the superuser or root user, who can do anything on the system. Root access should be restricted to trusted users and used only when necessary and with caution. While you may be able to do this on a system you control, you are unlikely to ever have root privileges on a shared computing resource (e.g., Oscar or Stronghold at Brown University)
The prompt can be configured to include additional information such as hostname, username, and pathname (e.g., computer:/home/bcbi/course bcbi $).
There are many Unix commands. Some commands display output and then return to the shell prompt, while others simply return to the shell prompt to indicate that they have finished executing.
Unix command syntax:
Case-sensitive (pwd ≠ PWD)
May involve one or more arguments
Argument may be an option (or flag or switch) for that command
Argument may be a file or directory
To get to a Unix shell on your computer:
For Mac, launch the Terminal application (under Applications → Utilities → Terminal)
For Linux, launch the Terminal application
For Windows, launch the PowerShell application
Get help from manual (man) pages on commands: (Use spacebar or up and down arrows to scroll through pages and then press q to quit)
Determine what directory you are currently in with pwd (print working directory):
Get a listing of current directory contents using ls:
Create course directory using mkdir: (Replace course with class name - e.g., methods2020 or biol6535)
Get a listing of current directory contents with details using ls:
Change into course directory using cd: (Replace course with class name - e.g., methods2020 or biol6535)
Analyze the MIMIC-IV Demo Files Using Unix Commands - Forthcoming!
Analyze the SyntheticRI Demo Files Using Unix - Forthcoming!
The chapter provides instructions and examples of using computing skills for health data and technology research.
Command | Action (with sftp specific notes) |
---|
Command | Action |
---|
Command | Action |
---|
Keys | Action |
---|
Command | Action |
---|
Command | Action |
---|
Brown CCV:
| directory listing (remotely in sftp) |
| local directory listing (sftp only) |
| formatted listing with hidden files |
| change directory to dir (remotely in sftp) |
| change local directory to dir (sftp only) |
| change to home (remotely in sftp) |
| show current directory (remote directory in sftp) |
| show current local directory |
| create a directory dir |
| delete file |
| delete directory dir |
| force remove file |
| force remove directory dir * |
| copy file1 to file2 |
| copy dir1 to dir2; create dir2 if it doesn't exist |
| rename or move file1 to file2; if file2 is an existing directory, moves file1 into directory file2 |
| copy local file to current remote directory (sftp only) |
| copy remote file to current local directory (sftp only) |
| show the current date and time |
| show this month's calendar |
| show current uptime |
| display who is online |
| who you are logged in as |
| counts the number of lines, words, bytes in file |
| counts the number of lines in file |
| cut out selected portions (first head ) of each line of a tab-delimited file |
| cut out columns 1,2, and 3 from a pipe-delimited file |
| sort lines of text file file |
| report or filter out repeated lines in a file |
| search for pattern in files |
| search for lines that do not contain pattern in files |
| manipulate data and generate reports |
| text stream editor |
| go to beginning of current command |
| go to end of current command |
| halts the current command |
| stops the current command, resume with fg in the foreground or bg in the background |
| log out of current session, similar to exit |
| erases one word in the current line |
| erases the whole line |
| type to bring up a recent command |
| repeats the last command |
| log out of current session |
| displays file contents one screen at a time (similar to |
| displays the first few lines of a file. |
| displays the last few lines of a file. |
| change the permissions (in either a ssh or sftp session) of file to octal, which can be found separately for user, group, and world by adding: |
| read (r) |
| write (w) |
| execute (x) |
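The octal value mentioned above is built by adding one number per permission for each of user, group, and world. The numeric values do not appear in the rows above, so this sketch assumes the standard Unix convention (read = 4, write = 2, execute = 1); the function name octal_mode is hypothetical, and Python is used only for illustration.

```python
# Standard Unix permission values (assumed; not listed in the table above).
READ, WRITE, EXECUTE = 4, 2, 1


def octal_mode(user, group, world):
    """Join the three per-class sums into a chmod-style octal string."""
    return f"{user}{group}{world}"


# User may read, write, and execute; group and world may read and execute.
user = READ + WRITE + EXECUTE
group = READ + EXECUTE
world = READ + EXECUTE

mode = octal_mode(user, group, world)
print(mode)  # 755
```

The resulting string is what would be passed as the octal argument when changing the permissions of a file.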
Julia comes with a full-featured interactive command-line REPL (read-eval-print loop) built into the julia executable. In addition to allowing quick and easy evaluation of Julia statements, it has a searchable history, tab-completion, many helpful keybindings, and dedicated help (?) and shell (;) modes. [1]
This page provides examples of using REPL on the command line.
Type julia in terminal to launch REPL
Type "?" to enter help pages within REPL
Type a function from Julia to read help pages (ex: println)
Julia Contributors. (n.d.). REPL - Standard Library - Julia Language. Retrieved May 1, 2024, from https://docs.julialang.org/en/v1/stdlib/REPL/
Julia Documentation: The Julia REPL
Julia Cheat Sheet (see REPL)
Instructions for installing Julia on macOS and Windows operating systems can be found .
Package managers such as (macOS and Linux) and (Windows) can be used to facilitate installation.
For most users, it is recommended to download the current stable release from .
Some developers might wish to use a different version, or to switch between versions. For this, the can be useful.
Julia is also available for use in Brown's :
Oscar (for high-performance computing)
Stronghold (for secure computing)
This is the typical first program for those new to a general purpose programming language like Julia. It can be used to test that the installation of Julia is working and also to introduce Julia's basic syntax, either using the REPL environment or by running code written in a text editor at the command line.
Input:
Output:
Here are variations of the "Hello, World!" program using variables and different print statements.
Input:
Output:
In order to assign variables in Julia, you write the desired name for your variable, an = sign, and what the value of the variable should be.
Input:
Output:
We can write comments in our code, which do not run, to describe what certain lines or sections of code do.
These comments are just for the programmer. They do not appear anywhere in the output and are there only to explain what the code is doing or to provide helpful notes.
To make a comment in Julia, you can use the “#” symbol and then type your comment.
Sometimes you might want to write longer comments that span multiple lines. To do this, you can surround these comments with #= above the start and =# below the end.
Input:
Output:
Without using a print statement, Julia will only print out the most recent item that has an output. In order to print multiple things, we can use the print() or println() functions.
Input:
Output:
Use Julia in Brown Oscar Computing Environment - Forthcoming!
Use Julia in Brown Stronghold Computing Environment - Forthcoming!
This page provides syntax for strings and characters in Julia as well as some of their associated functions. Each section includes an example to demonstrate the described syntax or function.
Char is a single character
String is a sequence of one or more characters (index values start at 1)
Action | Function |
---|
Use typeof() function to determine type
Input:
Output:
This page provides syntax for using numbers and mathematical operations in Julia. Each section includes an example to demonstrate the described syntax and operations.
Integer (positive and negative counting number) - e.g., -3, -2, -1, 0, 1, 2, and 3
Signed: Int8, Int16, Int32, Int64, and Int128
Unsigned: UInt8, UInt16, UInt32, UInt64, and UInt128
Boolean: Bool (0 = false and 1 = true)
Float (real or floating point numbers) - e.g., -2.14, 0.0, and 3.777
Float16, Float32, Float64
Use typeof() function to determine type
Input:
Output:
Input:
Output:
Input:
Output:
Create a Health Calculator Using Julia - Forthcoming!
Regular expressions (regex) are powerful tools for pattern matching and text processing. They are represented as a pattern that consists of a special set of characters to search for in a string str.
This page provides syntax for regular expressions in Julia. Each section includes an example to demonstrate the described methods.
Action | Function |
---|
Character class specifies a list of characters to match ([...] where ... represents the list) or not match ([^...])
Anchors are special characters that can be used to match a pattern at a specified position
Repetition or quantifier characters specify the number of times to match a particular character or set of characters
Input:
Output:
In computer science, control flow (or flow of control) is the order in which individual statements, instructions or function calls of an imperative program are executed or evaluated.
This page provides syntax for some of the common control flow methods in Julia. Each section includes an example to demonstrate the described methods.
Test if a specified expression is true or false
Short-circuit evaluation
Test if all of the conditions are true x && y
Test if any of the conditions are true x || y
Test if a condition is not true !z
Conditional evaluation
if statement
if-else
if-elseif-else
?: (ternary operator)
Input:
Output:
Repeat a block of code a specified number of times or until some condition is met.
while loop
for loop
Use break to terminate loop
Input:
Output:
Input:
Output:
Anchor | Special Character |
---|
Repetition | Character |
---|
Julia Documentation: (see Regular Expressions)
Wikipedia contributors. (n.d.). Control flow. In Wikipedia. Retrieved May 1, 2024, from
Operator | Example |
---|---|
Addition | x + y |
Subtraction | x - y |
Multiplication | x * y |
Division | x / y |
Power (Exponent) | x ^ y |
Remainder (Modulo) | x % y |
Negation (for Bool) | !x |

Operator | Example |
---|---|
Equality | x == y or isequal(x, y) |
Inequality | x != y or !isequal(x, y) |
Less than | x < y |
Less than or equal to | x <= y |
Greater than | x > y |
Greater than or equal to | x >= y |
Check if regex matches a string |
|
Capture regex matches |
|
Specify alternative regex |
|
Character Class |
|
Any lowercase vowel |
|
Any digit |
|
Any lowercase letter |
|
Any uppercase letter |
|
Any digit, lowercase letter, or uppercase letter |
|
Anything except a lowercase vowel |
|
Anything except a digit |
|
Anything except a space |
|
Any character |
|
Any word character (equivalent to |
|
Any non-word character (equivalent to |
|
A digit character (equivalent to |
|
Any non-digit character (equivalent to |
|
Any whitespace character (equivalent to |
|
Any non-whitespace character (equivalent to |
|
Beginning of line |
|
End of line |
|
Beginning of string |
|
End of string |
|
Zero or more times |
|
One or more times |
|
Zero or one time |
|
Exactly n times |
|
n or more times |
|
m or less times |
|
At least n and at most m times |
|
Operator | Example |
---|---|
Equality | x == y or isequal(x, y) |
Inequality | x != y or !isequal(x, y) |
Less than | x < y |
Less than or equal to | x <= y |
Greater than | x > y |
Greater than or equal to | x >= y |
get |
|
extract |
|
extract substring |
|
search for |
|
search for |
|
remove record separator from |
|
remove last character from |
|
Python is one of the many languages used by the data science community to perform data manipulation, statistical modeling, and machine learning. Its design philosophy emphasizes code readability. The Python community is large and offers an extensive body of technical support documentation. If you don't know how to do something in Python, chances are that someone else has asked a similar question online and received a comprehensive answer.
List of exercises found across the different Julia pages.
Use Julia in Brown Oscar Computing Environment - Forthcoming!
Use Julia in Brown Stronghold Computing Environment - Forthcoming!
Create a Health Calculator Using Julia - Forthcoming!
Create a Pediatric Dosage Calculator Using Julia
Create a BMI Calculator Using Julia
Analyze Health Datasets Using Unix Commands - Forthcoming!
Analyze MIMIC-IV Demo Files Using Unix Commands
Analyze SyntheticRI Demo Files Using Unix
Analyze Health Datasets Using Julia - Forthcoming!
Analyze MIMIC-IV Demo Files Using Julia
Analyze SyntheticRI Demo Files Using Julia
In computer programming, a package is a collection of modules or programs that are often published as tools for a range of common use cases, such as text processing and doing math. Programmers can install these packages and take advantage of their functionality within their own code.
This page provides instructions for installing, using, and troubleshooting packages in Julia.
Start Julia REPL by typing the following in Terminal or PowerShell (Note: you do not need to type $; it indicates the shell prompt)
Go into REPL mode for Pkg, Julia’s built-in package manager, by pressing ]
Update package repository in Pkg REPL
Add packages in Pkg REPL
Check installation
Get back to the Julia REPL and exit by pressing backspace or ^C.
To see REPL history
If you get an error like: ERROR: SystemError: opening file "C:\\Users\\User\\.julia\\registries\\General\\Registry.toml": No such file or directory
Delete C:\\Users\\User\\.julia\\registries (where User is your computer’s username) and try again
https://discourse.julialang.org/t/registry-toml-missing/24152
JuliaHealth and BioJulia organizations (focused on Julia packages for health and life sciences)
Julia Package: CSV.jl
Julia Package: DataFrames.jl
Many Julia programs involve the input and output of files. When analyzing a dataset, that dataset file will need to be pulled into your program (input). If you want to see the results of your analysis, your program will need an output.
This section provides the syntax for inputting files (reading) and outputting results (writing) using base Julia (i.e., no packages such as CSV.jl).
Tabulate and report counts for sex in Adult Data Set from the UC Irvine Machine Learning Repository.
Dataset (example lines from adult.data)
Input (process_file.jl)
Output
Terminal
Analyze the MIMIC-IV Demo Files Using Julia - Forthcoming!
Analyze the SyntheticRI Demo Files Using Julia - Forthcoming!
Julia Documentation: Base - I/O and Network
Think Julia: Chapter 14 - Files
DataFrames.jl is a Julia package that provides a set of tools for working with tabular data in Julia. Its design and functionality are similar to those of pandas (in Python) and data.frame, data.table, and dplyr (in R), making it a great general purpose data science tool. [1]
This page provides examples of using DataFrames.jl, demonstrating the syntax and common functions within the package.
Install and Load DataFrames.jl Package
Create Dataframe
Display Dataframe
Input:
Output:
First two lines of dataframe:
Input:
Output:
Last two lines of dataframe:
Input:
Output:
Describe Dataframe
Dataframe size:
Input:
Output:
Dataframe column names:
Input:
Output:
Dataframe description:
Input:
Output:
Accessing DataFrames
Get "age" column (different ways to call the column)
Input:
Output:
Get row
Input:
Output:
Get element
Input:
Output:
Get subset (specific rows and all columns)
Input:
Output:
Get subset (all rows and specific columns)
Input:
Output:
Get subset (all rows meeting specified criteria - numbers)
Input:
Output:
Get subset (all rows meeting specified criteria - strings)
Input:
Output:
Get subset (all rows meeting specified criteria)
Input:
Output:
Add Column
New columns with specified values
Input:
Output:
New column with calculated value
Input:
Output:
Get counts/frequency
Input:
Output:
Transform DataFrame
sort
Input:
Output:
stack (reshape from wide to long format)
Input:
Output:
unstack (reshape from long to wide format)
Input:
Output:
Traversing DataFrame (for loops)
sort
Input:
Output:
Analyzing Health Datasets with DataFrames in Julia - Forthcoming!
JuliaData Contributors. (n.d.). DataFrames.jl - JuliaData. Retrieved May 1, 2024, from https://dataframes.juliadata.org/stable/
Julia Package: DataFrames.jl
Julia Package: CSV.jl
Julia Data Science: DataFrames.jl
Introducing Julia Wikibook: DataFrames
In computer programming, a collection is a grouping of some variable number of data items (possibly zero) that have some shared significance to the problem being solved and need to be operated upon together in some controlled fashion. [1]
This page provides syntax for different types of collections and data structures in Julia (arrays, sets, dictionaries, etc.). Each section includes an example to demonstrate the described methods.
Arrays are ordered collections of elements. In Julia, they are automatically indexed (consecutively numbered) by an integer starting with 1.
Action | Syntax |
---|---|
Action | Syntax |
---|---|
Action | Syntax |
---|---|
Input:
Output:
Sets are an unordered collection of unique elements.
Input:
Output:
Dictionaries are unordered collections of key-value pairs where the key serves as the index (“associative collection”). Similar to elements of a set, keys are always unique.
Input:
Output:
Wikipedia contributors (n.d.). Collection. In Wikipedia. Retrieved May 1, 2024, from https://en.wikipedia.org/wiki/Collection_(abstract_data_type)
Julia Documentation: Base - Collections and Data Structures
Think Julia: Chapter 10 - Arrays
Think Julia: Chapter 11 - Dictionaries
Think Julia: Chapter 12 - Tuples
Instructions for installing Python on macOS and Windows operating systems can be found .
For most users, it is recommended to download the current stable release from .
Some developers might wish to use a different version, or to switch between versions. For this, the can be useful.
Python is also available for use in Brown's :
Oscar (for high-performance computing)
Stronghold (for secure computing)
The following instructions have been tested on computers running macOS 16 Big Ventura. In order to check the macOS version running on your computer, click on the "apple" icon in the top left hand corner of your screen and select "About This Mac." A window will pop up that includes a version number. Confirm you are running at least Version 16.X (where 'X' is any number). These instructions will likely work with earlier versions of macOS as well. If you are not running macOS 11.X Big Sur, you can upgrade for free following the instructions provided on .
Download Python
Navigate to and download the most recent version of Python for macOS.
Install Python
Open the downloaded file (e.g., python-3.12.3-macos11.pkg). A window will pop up with installation instructions. Progress through the prompts until Python has been installed in your Applications folder. Next, double click on the Python folder shortcut in your Applications folder to open it.
Run Python
Open Terminal, type python3, and hit return. Python should open. To quit Python, type quit() and hit return.
Troubleshooting
If you get a Permission denied error, rerun the command prepended with sudo. You will be prompted to enter your computer password.
The following instructions have been tested on computers running Windows 10. Confirm that you are running at least Windows 10. These instructions will likely work with earlier versions of Windows, however they have not been tested.
Download Python
Install Python
Open the downloaded file (e.g., python-3.10.10-amd64.exe). A window will pop up with installation instructions. Progress through the prompts until Python has been installed on your device. When prompted with Advanced Options, make sure to check "Add Python to environment variables".
Run Python
Open Command Prompt, type py, and hit enter. Python should open. To quit Python, type quit() and hit enter.
This page provides syntax for different data types in Python as well as some of their associated functions. Each section includes an example to demonstrate the described syntax or function.
A string is a sequence of one or more characters (index values start at 0)
Action | Function |
---|
Input:
Output:
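As a minimal sketch of zero-based string indexing, the snippet below picks characters and a substring out of a sample string. The string value and variable names are hypothetical.

```python
text = "health"

# Index values start at 0, so the first character is text[0].
first = text[0]

# Negative indices count back from the end of the string.
last = text[-1]

# A slice runs from the start index up to, but not including, the end index.
prefix = text[0:4]

# len() returns the number of characters in the string.
length = len(text)

print(first, last, prefix, length)
```

Asking for an index equal to or larger than the length (for example, text[6] here) raises an IndexError.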
Python comes with an interactive command-line REPL (read-eval-print loop) built into the python executable. In addition to allowing quick and easy evaluation of Python statements, it has a command history, tab-completion, and a built-in help utility (help()).
This page provides examples of using REPL on the command line
Type python in terminal to launch REPL
Type help() to enter help pages within REPL
Type a function from Python to read help pages (ex: print)
Press q to quit
This is the typical first program for those new to a general purpose programming language like Python. It can be used to test that the installation of Python is working and also to introduce Python's basic syntax, either using the REPL environment or by running code written in a text editor at the command line.
Input:
Output:
Here are variations of the "Hello, World!" program using variables and different print statements.
Input:
Output:
In order to assign variables in Python, you write the desired name for your variable, an “=” sign, and what the value of the variable should be.
Input:
Output:
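As a minimal sketch of the assignment syntax described above, each line below has a variable name, an "=" sign, and a value. All names and values are hypothetical.

```python
# name = value
patient_id = 1001
temperature_f = 98.6
unit = "Cardiology"

# A variable can be reassigned, and its new value can depend on its old one.
patient_id = patient_id + 1

print(patient_id)
print(temperature_f)
print(unit)
```

Note that "=" assigns a value, while "==" tests whether two values are equal.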
We can write comments in our code, which do not run, to describe what certain lines or sections of code do.
These comments are just for the programmer. They do not appear anywhere in the output and are there only to explain what the code is doing or to provide helpful notes.
To make a comment in Python, you can use the “#” symbol and then type your comment.
Sometimes you might want to write longer comments that span multiple lines. To do this, you can surround these comments with three quotation marks (""") above the start and three quotation marks below the end.
Input:
Output:
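Both comment styles described above can be sketched as follows. The variable name and value are hypothetical. Strictly speaking, the triple-quoted block is a string that is never assigned to anything, which is why it has no effect when the program runs.

```python
# This is a single-line comment; Python ignores it when running the code.
heart_rate = 72  # a comment can also follow code on the same line

"""
This block spans multiple lines.
It is surrounded by three quotation marks above and below,
so nothing inside it appears in the output.
"""

print(heart_rate)
```

Running this prints only the value of heart_rate; none of the comment text is shown.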
Without using a print statement, Python will only print out the most recent item that has an output. In order to print multiple things, we can use the print() function
Input:
Output:
Python is sensitive to indentation. Indentation should only be used in hierarchical structures, such as a class, function, or loop. Indents in improper locations will cause an error.
Input:
Output:
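The sketch below shows indentation used correctly inside a function and an if statement, followed by a small check that an indent in an improper location is rejected. The function name, threshold, and sample values are hypothetical; the built-in compile() function is used only so that the faulty code can be tested without stopping the program.

```python
def classify_temperature(temp_f):
    # The function body is indented one level.
    if temp_f >= 100.4:
        # The body of the if statement is indented one more level.
        return "fever"
    return "normal"


result = classify_temperature(101.2)
print(result)

# The second line of this code is indented without belonging to any
# class, function, or loop, so Python refuses to compile it.
bad_code = "x = 1\n    y = 2\n"
try:
    compile(bad_code, "<example>", "exec")
    indentation_ok = True
except IndentationError:
    indentation_ok = False

print(indentation_ok)
```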
Use Python in Brown Oscar Computing Environment - Forthcoming!
Use Python in Brown Stronghold Computing Environment - Forthcoming!
This page provides syntax for using numbers and mathematical operations in Python. Each section includes an example to demonstrate the described syntax and operations.
Integer (positive and negative counting number) - e.g., -3, -2, -1, 0, 1, 2, and 3:
int - holds signed integers of unlimited length
long - holds long integers (exists in Python 2.X, removed in Python 3.X)
Float (real or floating point numbers) - e.g., -2.14, 0.0, and 3.777
float
Boolean: bool (0 = False and 1 = True)
Use type() function to determine type
Input:
Output:
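A brief sketch of `type()` applied to the number types listed above (the variable names are illustrative assumptions):

```python
whole = 42
decimal = 3.777
flag = True

print(type(whole))
print(type(decimal))
print(type(flag))

# bool is a subclass of int, so True and False act as 1 and 0 in arithmetic
print(True + True)

# Python 3 integers are not limited to a fixed number of bits
big = 2 ** 100
print(type(big).__name__)
```

The first three lines of output are `<class 'int'>`, `<class 'float'>`, and `<class 'bool'>`.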
Input:
Output:
Input:
Output:
Create a Health Calculator Using Python - Forthcoming!
Regular expressions are powerful tools for pattern matching and text processing. They are represented as a pattern that consists of a special set of characters to search for in a string (str). The regex module (re) needs to be imported before use.
This page provides syntax for regular expressions in Python. Each section includes an example to demonstrate the described methods.
Action | Function |
---|
Character class specifies a list of characters to match ([...] where ... represents the list) or not match ([^...]).
Anchors are special characters that can be used to match a pattern at a specified position.
Repetition or quantifier characters specify the number of times to match a particular character or set of characters.
Input:
Output:
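To make character classes, anchors, and quantifiers concrete, here is a minimal sketch using Python's built-in `re` module. The sample text and variable names are made up for illustration:

```python
import re

text = "Visit 1 on 2024-05-01; visit 2 on 2024-06-15."

# Character class with a quantifier: runs of one or more digits
numbers = re.findall(r"[0-9]+", text)

# Quantifiers with exact counts: dates shaped like YYYY-MM-DD
dates = re.findall(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", text)

# Anchor: ^ only matches at the start of the string
starts_with_visit = re.search(r"^Visit", text) is not None

# Negated character class: drop everything except letters, digits, and spaces
cleaned = re.sub(r"[^A-Za-z0-9 ]", "", text)

print(numbers)
print(dates)
print(starts_with_visit)
print(cleaned)
```

Here `dates` is `['2024-05-01', '2024-06-15']` and `cleaned` is `Visit 1 on 20240501 visit 2 on 20240615`.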
Navigate to and download the most recent version of Python for Windows (32-bit or 64-bit depending on the specifications of your device).
Action | Syntax |
---|---|
New array (empty) | [] |
Specify type (integer) | Int64[] |
Specify type (string) | String[] |
Array with values | [1, 2, 3, 4, 5] |
Array with values | ["a1", "b2", "c3"] |
Array of numbers | collect(1:10) |
Split string str by delimiter into words (e.g., space) | split(str, " ") |
Get length of array my_array | length(my_array) |
Get first element of array my_array | my_array[1] |
Get last element of array my_array | my_array[end] |
Get nth element of array my_array (e.g., 2) | my_array[2] |
Check if element is in array | in(str, my_array) |
Add element to end | push!(my_array, str) |
Remove element from end | pop!(my_array) |
Remove element from beginning | popfirst!(my_array) |
Add element to beginning | pushfirst!(my_array, str) |
Sort array (will not change array itself) | sort(my_array) |
Sort array in place (will change array) | sort!(my_array) |
Get unique elements in array | unique(my_array) |
Intersection | intersect(my_array, your_array) |
Union | union(my_array, your_array) |
Convert array to string | join(collect(my_array), str) |
Action | Syntax |
---|---|
New set (empty) | Set() |
Specify type | Set{Int64}() |
Set with values | Set([1, 2, 3, 4, 5]) |
Set with values | Set(["a1", "b2", "c3", "b2"]) |
Get length of set my_set | length(my_set) |
Check if value is in set | in(str, my_set) |
Add value | push!(my_set, str) |
Intersection | intersect(my_set, your_set) |
Union | union(my_set, your_set) |
Difference | setdiff(my_set, your_set) |
Action | Syntax |
---|---|
New dictionary (empty) | Dict() |
Specify type | Dict{String, Int64}() |
Dictionary with values | Dict("one" => 1, "two" => 2, "three" => 3, "four" => 4) |
Get value for key in dictionary my_dict | my_dict["one"] |
Check if dictionary has key | haskey(my_dict, "one") |
Check for key/value pair | in(("one" => 1), my_dict) |
Get value and set default | get!(my_dict, "one", 5)<br>get!(my_dict, "five", 5) |
Add key/value pair | my_dict["five"] = 5 |
Delete key/value pair | delete!(my_dict, "four") |
Get keys | keys(my_dict) |
Get values | values(my_dict) |
Convert keys to array | collect(keys(my_dict)) |
Convert values to array | collect(values(my_dict)) |
Sorting keys | sort(collect(keys(my_dict))) |
Sorting values | sort(collect(values(my_dict))) |
Sort by value (descending) with keys | sort(collect(zip(values(my_dict), keys(my_dict))), rev=true) |
Sort by value (ascending) with keys | sort(collect(zip(values(my_dict), keys(my_dict))), rev=false) |
Get top n by value (e.g., 3) | sort(collect(zip(values(my_dict), keys(my_dict))), rev=true)[1:3] |
Addition | x + y |
Subtraction | x - y |
Multiplication | x * y |
Division | x / y |
Floor Division | x//y |
Power (Exponent) | x ** y |
Remainder (Modulo) | x % y |
Equality | x == y or isequal(x, y) |
Inequality | x != y or !isequal (x, y) |
Less than | x < y |
Less than or equal to | x <= y |
Greater than | x > y |
Greater than or equal to | x >= y |
Check if regex matches a string |
|
Capture regex matches |
|
Specify alternative regex |
|
Character Class |
|
Any lowercase vowel |
|
Any digit |
|
Any lowercase letter |
|
Any uppercase letter |
|
Any digit, lowercase letter, or uppercase letter |
|
Anything except a lowercase vowel |
|
Anything except a digit |
|
Anything except a space |
|
Any character |
|
Any word character (equivalent to |
|
Any non-word character (equivalent to |
|
A digit character (equivalent to |
|
Any non-digit character (equivalent to |
|
Any whitespace character (equivalent to |
|
Any non-whitespace character (equivalent to |
|
Beginning of line |
|
End of line |
|
Beginning of string |
|
End of string |
|
Zero or more times |
|
One or more times |
|
Zero or one time |
|
Exactly n times |
|
n or more times |
|
m or less times |
|
At least n and at most m times |
|
get word length |
|
extract nth character from word |
|
extract substring nth-mth character from word |
|
search for character in word |
|
search for subword in word |
|
remove white spaces from the end of a word |
|
remove last character from word |
|
determine data structure type |
|
In computer programming, a package is a collection of modules or programs that are often published as tools for a range of common use cases, such as text processing and doing math. Programmers can install these packages and take advantage of their functionality within their own code.
This page provides instructions for installing, using, and troubleshooting packages in Python.
For most users, it is recommended to download the current stable release from https://cloud.r-project.org/.
Some developers might wish to use a different version, or to switch between versions. For this, the rvenv package can be useful.
R is also available for use in Brown's Computing Environments:
Oscar (for high-performance computing)
Stronghold (for secure computing)
Download and install the latest version of The R Project for Statistical Computing for macOS here.
For an integrated development environment (IDE) / graphical interface, you can also download and install RStudio from here.
R is one of the many languages used by the data science community to perform data manipulation, statistical modeling and machine learning. R was designed by statisticians for statistical computing.
Many Python programs involve the input and output of files. When analyzing a dataset, that dataset file will need to be pulled into your program (input). If you want to see the results of your analysis, your program will need an output.
This section provides the syntax for inputting files (reading) and outputting results (writing) using base Python (i.e., without packages such as pandas).
Tabulate and report counts for sex in Adult Data Set from the UC Irvine Machine Learning Repository.
Dataset (example lines from adult.data)
Input (process_file.py)
Output
Terminal
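The original process_file.py is not reproduced here. The following is a minimal sketch of the same kind of task using only base Python. It assumes that adult.data is comma-separated and that sex is the tenth field (index 9). The three sample rows are hypothetical stand-ins written to a temporary file so that the example is self-contained, and the file names are illustrative:

```python
import os
import tempfile
from collections import Counter

SEX_COLUMN = 9  # assumption: sex is the 10th comma-separated field

# Hypothetical rows in the assumed layout of adult.data (not real records)
sample_rows = [
    "39, State-gov, 1, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 0, 0, 40, United-States, <=50K",
    "28, Private, 2, Bachelors, 13, Married-civ-spouse, Prof-specialty, Wife, Black, Female, 0, 0, 40, Cuba, <=50K",
    "52, Private, 3, HS-grad, 9, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 45, United-States, >50K",
]

work_dir = tempfile.mkdtemp()
in_path = os.path.join(work_dir, "adult_sample.data")
out_path = os.path.join(work_dir, "sex_counts.txt")

# Write the sample input file
with open(in_path, "w") as sample_file:
    sample_file.write("\n".join(sample_rows) + "\n")

# Read the file line by line and tabulate the sex field
counts = Counter()
with open(in_path, "r") as infile:
    for line in infile:
        fields = [field.strip() for field in line.split(",")]
        if len(fields) <= SEX_COLUMN:
            continue  # skip blank or malformed lines
        counts[fields[SEX_COLUMN]] += 1

# Write one tab-separated "value, count" line per category
with open(out_path, "w") as outfile:
    for value, n in sorted(counts.items()):
        outfile.write(f"{value}\t{n}\n")

with open(out_path, "r") as result_file:
    report = result_file.read()
print(report)
```

Using `with open(...)` closes each file automatically, even if an error occurs while reading or writing.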
Analyze the MIMIC-IV Demo Files Using Julia - Forthcoming!
Analyze the SyntheticRI Demo Files Using Julia - Forthcoming
Tutorials Point: Python - Files I/O
Data Science Central: Python File Input/Output
In computer science, control flow (or flow of control) is the order in which individual statements, instructions or function calls of an imperative program are executed or evaluated. [1]
This page provides syntax for some of the common control flow methods in Python. Each section includes an example to demonstrate the described methods
Test if a specified expression is true or false
Short-circuit evaluation
Test if all of the conditions are true x and y
Test if any of the conditions are true x or y
Test if a condition is not true not z
Conditional evaluation
if statement
if-else
if-elif-else
Ternary operator: true_value if condition else false_value
Input:
Output:
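The comparison, short-circuit, and conditional forms listed above can be sketched as follows. The variables and thresholds are made-up values for illustration only:

```python
temperature = 38.2
has_cough = True

# A comparison evaluates to True or False
is_fever = temperature >= 38.0

# and / or / not combine conditions
flagged = is_fever and has_cough
either = is_fever or has_cough
no_cough = not has_cough

# Short-circuit: the division is never evaluated because the left side is False
divisor = 0
safe = divisor != 0 and 10 / divisor > 1

# if-elif-else chooses exactly one branch
if temperature < 36.0:
    status = "low"
elif temperature < 38.0:
    status = "normal"
else:
    status = "fever"

# Ternary operator: true_value if condition else false_value
label = "follow up" if flagged else "routine"

print(status, label, safe)
```

This prints `fever follow up False`, and no ZeroDivisionError is raised.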
Repeat a block of code a specified number of times or until some condition is met
while loop
for loop
Use break to terminate loop
Input:
Output:
Input:
Output:
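A compact sketch of the three loop constructs named above (variable names are illustrative assumptions):

```python
# for loop: repeat once per item in a sequence
squares = []
for n in range(1, 6):
    squares.append(n * n)

# while loop: repeat until the condition becomes false
countdown = []
remaining = 3
while remaining > 0:
    countdown.append(remaining)
    remaining -= 1

# break: leave the loop as soon as a match is found
first_over_ten = None
for value in squares:
    if value > 10:
        first_over_ten = value
        break

print(squares)
print(countdown)
print(first_over_ten)
```

The output is `[1, 4, 9, 16, 25]`, `[3, 2, 1]`, and `16`.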
Python Documentation: Control Flow
Python Wiki: For Loops
W3 Schools: Python For Loops
W3 Schools: Python Conditionals and If Statements
In computer programming, a collection is a grouping of some variable number of data items (possibly zero) that have some shared significance to the problem being solved and need to be operated upon together in some controlled fashion. [1]
This page provides syntax for different types of collections and data structures in Python (arrays, sets, dictionaries, etc.). Each section includes an example to demonstrate the described methods
Arrays are ordered collections of elements. In Python they are implemented as lists and are automatically indexed (consecutively numbered) by an integer starting with 0.
Input:
Output:
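A short sketch of common list operations, assuming an example list named `my_array`:

```python
my_array = ["b2", "a1", "c3"]

first = my_array[0]     # indexing starts at 0
last = my_array[-1]     # negative indexes count from the end

my_array.append("d4")       # add to the end
my_array.insert(0, "a0")    # add to the beginning
removed = my_array.pop()    # remove and return the last element

in_order = sorted(my_array)         # returns a new list; my_array is unchanged
words = "one two three".split(" ")  # split a string into a list of words

print(my_array)
print(in_order)
print(words)
```

After these steps `my_array` is `['a0', 'b2', 'a1', 'c3']`, while `in_order` is `['a0', 'a1', 'b2', 'c3']`.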
Sets are an unordered collection of unique elements.
Input:
Output:
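A sketch of basic set operations, with `my_set` and `your_set` as example names. Results are sorted before printing because sets have no defined order:

```python
my_set = {"a1", "b2", "c3", "b2"}   # the duplicate "b2" is stored only once
your_set = {"b2", "c3", "d4"}
empty_set = set()                   # {} would create an empty dictionary instead

size_before = len(my_set)
my_set.add("e5")

common = my_set.intersection(your_set)
combined = my_set.union(your_set)
only_mine = my_set.difference(your_set)

print(size_before)
print(sorted(common))
print(sorted(combined))
print(sorted(only_mine))
```

This prints `3`, `['b2', 'c3']`, `['a1', 'b2', 'c3', 'd4', 'e5']`, and `['a1', 'e5']`.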
Dictionaries are collections of key-value pairs where the key serves as the index (“associative collection”). Similar to elements of a set, keys are always unique. Since Python 3.7, dictionaries preserve the order in which keys were inserted.
Input:
Output:
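A sketch of common dictionary operations, assuming an example dictionary named `my_dict`:

```python
my_dict = {"one": 1, "two": 2, "three": 3, "four": 4}

value = my_dict["one"]            # look up a value by key
has_key = "one" in my_dict        # membership tests check keys
fallback = my_dict.get("six", 0)  # default returned; nothing is inserted

my_dict.setdefault("five", 5)     # inserts the key because it is missing
my_dict.pop("four", None)         # delete a key; no error if it is absent

keys_now = list(my_dict.keys())

# Sort key/value pairs by value, largest first, and keep the top two
top_two = sorted(my_dict.items(), key=lambda x: x[1], reverse=True)[:2]

print(keys_now)
print(top_two)
```

The output is `['one', 'two', 'three', 'five']` and `[('five', 5), ('three', 3)]`.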
Wikipedia contributors (n.d.). Collection. In Wikipedia. Retrieved May 1, 2024, from https://en.wikipedia.org/wiki/Collection_(abstract_data_type)
W3 Schools: Python Data Structures
Data Quest: Python Data Structures
R comes with a full-featured interactive command-line REPL (read-eval-print loop) built into the R executable. In addition to allowing quick and easy evaluation of R statements, it has a searchable history, tab-completion, many helpful keybindings, and built-in help via ?.
This page provides examples of using REPL on the command line.
Type "module load r" in terminal to load the R module, then on a new line type "R" to launch R
In terminal, q() quits the R module
Type "?" or help(function) to enter help pages within R's REPL
For example, to ask for help with linear functions in R, use help(lm) (output shown below)
This is the typical first program for those new to a programming language. It can be used to test that the installation of R is working and to introduce R's basic syntax, either interactively or by running code saved in a file from the command line.
Unlike some other languages, R does not require print statements to display output at the console, but it does allow them. To print, you can simply write an expression, or wrap the expression you want printed in a print() statement.
We can write comments on our code, which do not run, to describe what certain lines or sections of code do. These comments are just for the programmer; they will not appear anywhere in the output and simply explain what the code is doing or provide helpful notes.
To comment in R, use the “#” symbol and type your comment on the same line.
R has no syntax for multi-line comments, so each line that is commented out needs a "#" symbol at the beginning.
Used to test if a specific case is true or false
Short-circuit evaluation:
Test if all conditions are true
Test if any conditions are true
Test if a condition is not true
If statement: run code if this statement is true
Only used at the beginning of a conditional statement
Else if statement: if previous statements aren't true, try this
Can be used an unlimited number of times in an if statement
Else statement: catch-all for anything outside of prior statements
Only used to end a conditional statement
Repeats a block of code a specified number of times or until some condition is met
While loop
For loop
Use break to terminate loop
Operator | Example |
---|---|
Equality | x == y |
Inequality | x != y |
Less than | x < y |
Less than or equal to | x <= y |
Greater than | x > y |
Greater than or equal to | x >= y |
Action | Syntax |
---|---|
New array (empty) | [] |
Array with values (integers) | [1, 2, 3, 4, 5] |
Array with values (string) | ["a1", "ab2", "c3"] |
Array of numbers | list(range(1, 11)) |
Split string str by delimiter into words (e.g., space) | str.split(" ") |
Get length of array my_array | len(my_array) |
Get first element of array my_array | my_array[0] |
Get last element of array my_array | my_array[-1] |
Get nth element of array my_array (e.g., 2nd) | my_array[1] |
Check if element is in array | str in my_array |
Add element to end | my_array.append(str) |
Remove element from end | my_array.pop() |
Remove element from beginning | my_array.pop(0) |
Add element to beginning | my_array.insert(0, str) |
Sort array (will not change array itself) | sorted(my_array) |
Sort array in place (will change array) | my_array.sort() |
Get unique elements in array | list(set(my_array)) |
Intersection | set(my_array).intersection(your_array) |
Union | set(my_array).union(your_array) |
Action | Syntax |
---|---|
New set (empty) | set() |
Set with values | my_set = {1, 2, 3, 4, 5} |
Set with values | my_set = {"a1", "b2", "c3"} |
Get length of set my_set | len(my_set) |
Check if value is in set | "str" in my_set |
Add value | my_set.add("str") |
Intersection | my_set.intersection(your_set) |
Union | my_set.union(your_set) |
Difference | my_set.difference(your_set) |
Action | Syntax |
---|---|
New dictionary (empty) | {} |
Dictionary with values | {"one": 1, "two": 2, "three": 3, "four": 4} |
Get value for key in dictionary my_dict | my_dict["one"] |
Check if dictionary has key | "one" in my_dict |
Check for key/value pair | ("one", 1) in my_dict.items() |
Get value and set default | my_dict.get("one", 5)<br>my_dict.setdefault("five", 5) |
Add key/value pair | my_dict["five"] = 5 |
Delete key/value pair | my_dict.pop("four", None) |
Get keys | my_dict.keys() |
Get values | my_dict.values() |
Convert keys to array | list(my_dict.keys()) |
Convert values to array | list(my_dict.values()) |
Sorting keys | sorted(my_dict.keys()) |
Sorting values | sorted(my_dict.values()) |
Sort by value (descending) with keys | sorted(my_dict.items(), key=lambda x: x[1], reverse=True) |
Sort by value (ascending) with keys | sorted(my_dict.items(), key=lambda x: x[1]) |
Get top n by value (e.g., 3) | sorted(my_dict.items(), key=lambda x: x[1], reverse=True)[:3] |
Operator | Description |
---|---|
> | Greater than |
< | Less than |
>= | Greater than or equal to |
<= | Less than or equal to |
== | Exactly equal to |
!= | Not equal to |
& | Element-wise AND |
Action | Syntax |
---|---|
Get string length | nchar(string) |
Combine two strings | str_c(string1, string2) |
Sort a vector of strings | sort(c(string1, string2, string3)) |
Search for a substring within a string | grep(substring/value, string) |
Replace the first match within a string | sub(pattern, replacement, string) |
Replace all instances within a string | gsub(pattern, replacement, string) |
Test whether a pattern matches a string (returns TRUE or FALSE) | grepl(pattern, string) |
Operator | Description | Example |
---|---|---|
<- or = or <<- | Left Assignment | x <- 7, x = 7, x <<- 7 |
-> or ->> | Right Assignment | 7 -> x, 7 ->> x |
Type | Example |
---|---|
Logical | TRUE, FALSE |
Numeric | 1, 55, 999 |
Integer | 1L, 32L, 0L |
Complex | 2 + 3i |
Character | "great", "23.4" |
Addition | + |
Subtraction | - |
Multiplication | * |
Division | / |
Power (Exponent) | ^ or ** |
Remainder (Modulo) | %% |
Negation (for Bool) | !x |
When coding in R, you will often need to input datasets to work with. The easiest ways to do so are from a .csv file or a .txt file, using the read.csv() and read_table() functions, respectively. Note that read_table() comes from the readr package; the base R equivalent is read.table(). The following demonstrates these functions using a hypothetical "hospital_data" dataset.
To send output to a file from R, use sink("FileName.FileType"), which redirects console output to that file. Call sink() with no arguments to stop redirecting.
R Documentation: read.csv file input
More read.csv resources here
R Documentation: read_table file input
R Documentation: File output
In computer programming, a package is a collection of modules or programs that are often published as tools for a range of common use cases, such as text processing and doing math. Programmers can install these packages and take advantage of their functionality within their own code.
This page includes instructions for installing packages in R and a description of some of R's most frequently used packages.
To install a package in R, you can either:
Use the install.packages("PackageName") function, which by default downloads the package from CRAN and installs it on your machine
Or if you are using RStudio, you can use Tools > Install Packages, enter the package name, and click Install
Once you install the package, load it into your session using the library(PackageName) function.
In R, tidyverse is one of the most popular packages, as it contains an assortment of packages used for data science, such as:
ggplot2, used to create graphics and data visualization
dplyr, contains functions used for data manipulation, like mutate() and filter()
tidyr, used for data organization and cleaning
tibble, an optimized dataframe visualizer
readxl, can be used to input Excel files in .xlsx format into R
R Documentation: Packages
Lists in R are ordered collections of data that can be of different classes.
Action | Syntax |
---|---|
New list (empty) | listname <- list() |
New list (misc) | listname <- list(1L, "abc", 10.3) |
Access an element | list[[position]] |
Change a value | list[[position]] <- newvalue |
See number of values in a list | length(list) |
See if item is present in a list | item %in% list |
Add item to a list | append(list, item) |
Add item to a list at a specific position | append(list, item, after = index number) |
Remove item from list | newlist <- list[-index number] |
Action | Syntax |
---|---|
New matrix (empty) | matrixname <- matrix() |
New matrix (numbers) | matrixname <- matrix(data, nrow=, ncol=) |
New matrix (strings) | matrixname <- matrix(data, nrow=, ncol=) |
Access a matrix element | matrix[row position, column position] |
Access an entire row | matrix[row position,] |
Access an entire column | matrix[,column position] |
Create an additional row | rbind(matrix, values for new row) |
Create an additional column | cbind(matrix, values for new column) |
Action | Syntax |
---|---|
New array (empty) | arrayname <- array() |
New array (numbers) | arrayname <- array(data, dim = c(nrow, ncol, ndim)) |
New array (strings) | arrayname <- array(data, dim = c(nrow, ncol, ndim)) |
Access an array element | array[row position, column position, dimension] |
Check if an item exists | value %in% array |
Sort array increasing | sort(array) |
Sort array decreasing | sort(array, decreasing = TRUE) |