DataFrames

DataFrames.jl is a Julia package that provides a set of tools for working with tabular data. Its design and functionality are similar to those of pandas (in Python) and data.frame, data.table, and dplyr (in R), making it a great general-purpose data science tool. [1]

This page provides examples of using DataFrames.jl, demonstrating the syntax and common functions within the package.

Example

Install and Load DataFrames.jl Package

using Pkg

# Add DataFrames package
Pkg.add("DataFrames")

# Load package
using DataFrames

Create Dataframe

# Create dataframe
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

Display Dataframe

Input:

# display dataframe
println(df)

Output:

First two lines of dataframe:

Input:

Output:
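The input for this step is missing from the page. A minimal sketch, assuming the `df` created above and the standard `first` function (the name `first_two` is only illustrative):

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

# first(df, n) returns a new DataFrame holding the first n rows
first_two = first(df, 2)
println(first_two)
```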

Last two lines of dataframe:

Input:

Output:
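The original input is not shown here. One plausible way to get the last two rows, assuming the same `df`, is `last` (the name `last_two` is illustrative):

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

# last(df, n) returns a new DataFrame holding the last n rows
last_two = last(df, 2)
println(last_two)
```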

Describe Dataframe

Dataframe size:

Input:

Output:
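The size call itself is absent from the page. A short sketch, assuming the same `df`, using `size`, `nrow`, and `ncol`:

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

dims = size(df)      # (rows, columns) tuple → (5, 3)
n_rows = nrow(df)    # number of rows → 5
n_cols = ncol(df)    # number of columns → 3
println(dims)
```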

Dataframe column names:

Input:

Output:
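The input is missing here as well. Assuming the same `df`, column names can be read as strings with `names` or as symbols with `propertynames`:

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

col_strings = names(df)          # → ["id", "gender", "age"]
col_symbols = propertynames(df)  # → [:id, :gender, :age]
println(col_strings)
```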

Dataframe description:

Input:

Output:
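The description step likely refers to `describe`, which returns one summary row per column. A sketch under that assumption (the name `summary_df` is illustrative):

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

# describe returns a DataFrame with per-column statistics such as
# mean, min, median, max, number of missing values, and element type
summary_df = describe(df)
println(summary_df)
```

Non-numeric columns such as `gender` have no mean or median, so those cells hold `nothing`.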

Accessing DataFrames

Get "age" column (different ways to call the column)

Input:

Output:
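The original calls are not shown. The following sketch, assuming the same `df`, lists several equivalent ways to retrieve the `age` column; the difference between `!` and `:` is whether a copy is made:

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

age_prop   = df.age          # property access, no copy
age_bang   = df[!, :age]     # `!` row selector, no copy
age_string = df[!, "age"]    # column name given as a string, no copy
age_copy   = df[:, :age]     # `:` row selector, returns a copy

println(age_prop)
```

Mutating `age_copy` leaves `df` unchanged, while mutating `age_prop` or `age_bang` changes the column stored in `df`.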

Get row

Input:

Output:
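No input is shown for this step. A minimal sketch, assuming the same `df` and picking row 2 purely as an example:

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

# Indexing with a single row number returns a DataFrameRow (a view of that row)
row = df[2, :]
println(row)

# Indexing with a one-element range returns a one-row DataFrame instead
row_df = df[2:2, :]
```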

Get element

Input:

Output:
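The element lookup is missing from the page. Assuming the same `df`, a single cell can be addressed by row and column; row 3 and `age` are arbitrary choices for illustration:

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

by_name     = df[3, :age]   # row 3, column named :age → 49
by_position = df[3, 3]      # row 3, third column → 49
via_column  = df.age[3]     # take the column, then index it → 49
println(by_name)
```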

Get subset (specific rows and all columns)

Input:

Output:
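The input is not preserved. A sketch of row subsetting, assuming the same `df`; the particular row numbers are illustrative:

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

# A vector of row numbers selects those rows; `:` keeps every column
odd_rows = df[[1, 3, 5], :]

# A range selects a contiguous block of rows
middle_rows = df[2:4, :]
println(odd_rows)
```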

Get subset (all rows and specific columns)

Input:

Output:
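The original call is missing. Assuming the same `df`, columns can be selected either by indexing or with `select`; the choice of `id` and `age` is illustrative:

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

# `:` keeps every row; the vector lists the columns to keep
by_index = df[:, [:id, :age]]

# select builds a new DataFrame with the named columns
by_select = select(df, :id, :age)
println(by_index)
```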

Get subset (all rows meeting specified criteria - numbers)

Input:

Output:
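The filtering expression is not shown. A sketch of a numeric filter, assuming the same `df` and an arbitrary threshold of 50 on `age`:

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

# Boolean mask built with a broadcast comparison
over_50_mask = df[df.age .> 50, :]

# filter applies the predicate to each value of the :age column
over_50_filter = filter(:age => >(50), df)

# subset expects a vectorized condition, so the predicate is wrapped in ByRow
over_50_subset = subset(df, :age => ByRow(>(50)))
println(over_50_mask)
```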

Get subset (all rows meeting specified criteria - strings)

Input:

Output:
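The input is missing here. A string-based filter might look like the following, assuming the same `df` and selecting the rows where `gender` equals "F" as an example:

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

# Broadcast equality against a string gives a Boolean mask
females_mask = df[df.gender .== "F", :]

# Equivalent filter call using a partially applied ==
females_filter = filter(:gender => ==("F"), df)
println(females_mask)
```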

Get subset (all rows meeting specified criteria)

Input:

Output:
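The original criteria are not preserved, so the following is only an assumption about what was intended: combining several conditions at once. The sketch uses the same `df` and an illustrative rule (gender "F" and age above 40):

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

# Combine masks with the broadcast AND operator; the parentheses are required
combined_mask = df[(df.gender .== "F") .& (df.age .> 40), :]

# filter with several columns passes one value per column to the function
combined_filter = filter([:gender, :age] => (g, a) -> g == "F" && a > 40, df)
println(combined_mask)
```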

Add Column

New columns with specified values

Input:

Output:
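The input for this step is missing. A minimal sketch, assuming the same `df`; the column names `visits` and `cohort` and their values are hypothetical and exist only to illustrate the syntax:

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

# Assign a vector with one value per row (hypothetical data)
df.visits = [3, 1, 4, 2, 5]

# Assign the same value to every row (hypothetical label)
df[!, :cohort] = fill("A", nrow(df))
println(df)
```

The assigned vector must have exactly as many elements as the DataFrame has rows.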

New column with calculated value

Input:

Output:
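The calculation used originally is not shown. As a sketch, assuming the same `df`, a derived column can be computed from `age`; the column names `age_months` and `over_50` are illustrative:

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

# Broadcast arithmetic over an existing column
df.age_months = df.age .* 12

# transform! adds a column in place; ByRow applies the function to each value
transform!(df, :age => ByRow(a -> a >= 50) => :over_50)
println(df)
```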

Get counts/frequency

Input:

Output:
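The counting call is absent from the page. One common approach, assuming the same `df`, is to group by a column and count rows per group (the names `counts` and `count` are illustrative):

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

# groupby splits the rows by gender; nrow => :count stores each group's size
counts = combine(groupby(df, :gender), nrow => :count)
println(counts)
```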

Transform DataFrame

sort

Input:

Output:
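The sort call is not preserved. A sketch of the usual options, assuming the same `df` and sorting on `age` as an example:

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

# Ascending sort by one column; sort returns a new DataFrame
by_age = sort(df, :age)

# Descending sort
by_age_desc = sort(df, :age, rev = true)

# Sort by gender ascending, then by age descending within each gender
by_gender_age = sort(df, [:gender, order(:age, rev = true)])
println(by_age)
```

`sort!` performs the same operation in place.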

stack (reshape from wide to long format)

Input:

Output:
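The original input is missing. A minimal sketch of `stack`, assuming the same `df`, with `gender` and `age` as the measured columns and `id` as the identifier (this choice of columns is an assumption):

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

# Each (id, column) pair becomes one row with `variable` and `value` columns
long = stack(df, [:gender, :age], :id)
println(long)
```

Because `gender` holds strings and `age` holds integers, the combined `value` column has a mixed element type.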

unstack (reshape from long to wide format)

Input:

Output:
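The input is not shown. As a sketch, `unstack` can reverse a long table produced by `stack`; the long table is rebuilt here so the example runs on its own, and the column choices are assumptions:

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

# Long format: one row per (id, variable) pair
long = stack(df, [:gender, :age], :id)

# Arguments: row key column, column holding the new column names, column holding the values
wide = unstack(long, :id, :variable, :value)
println(wide)
```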

Traversing DataFrame (for loops)

Iterate over rows

Input:

Output:
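The loop from the original page is missing. A sketch of row-wise and column-wise iteration, assuming the same `df`; the helper `total_age` is hypothetical and only demonstrates accumulating a value inside a loop:

```julia
using DataFrames

# Same example data as in "Create Dataframe"
df = DataFrame(id = 1:5, gender = ["F", "M", "F", "M", "F"], age = [68, 54, 49, 28, 36])

# eachrow yields one DataFrameRow per row
for row in eachrow(df)
    println("id=", row.id, " gender=", row.gender, " age=", row.age)
end

# Iterate over columns by name
for name in names(df)
    println(name, ": ", eltype(df[!, name]))
end

# Hypothetical helper: sum the ages with an explicit loop
function total_age(data)
    total = 0
    for row in eachrow(data)
        total += row.age
    end
    return total
end

age_sum = total_age(df)
```

For large tables, column-wise operations are generally faster than looping over `eachrow`, because a `DataFrameRow` does not carry column type information.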

Exercises

  • Analyzing Health Datasets with DataFrames in Julia - Forthcoming!

References

  1. JuliaData Contributors. (n.d.). DataFrames.jl - JuliaData. Retrieved May 1, 2024, from https://dataframes.juliadata.org/stable/

Resources
