Data Frames and Data Manipulation

This page shows how to use the pandas package in Python, with examples of its syntax and commonly used functions.

Example

Install and Load Pandas

# Install pandas first if it is not already available, for example: pip install pandas
# Load the pandas package
import pandas as pd

Create Dataframe

# Import pandas
import pandas as pd

# Create data as key-value pairs
data = {'id': [1,2,3,4,5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
        
# Put the data into a data frame
df = pd.DataFrame(data)

Display Dataframe
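
One way to display the whole data frame is simply to print it. This is a minimal sketch, assuming the `df` from Create Dataframe (rebuilt here so the snippet runs on its own):

```python
import pandas as pd

# Same data as in the Create Dataframe step
data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

# Print every row and column, including the automatic 0-based index
print(df)
```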

First two lines of dataframe:
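
A likely way to get the first rows is `head()`, which takes the number of rows to return. The sketch below assumes the same `df` as above:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

# Return the first two rows as a new data frame
first_two = df.head(2)
print(first_two)
```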

Last two lines of dataframe:
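
The counterpart for the end of the data frame is `tail()`. Again a small sketch on the same example data:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

# Return the last two rows as a new data frame
last_two = df.tail(2)
print(last_two)
```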

Describe Dataframe

Dataframe size:
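
The size can be read from the `shape` attribute, which is a (rows, columns) tuple. A minimal sketch on the example data:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

# shape is an attribute, not a method, so there are no parentheses
n_rows, n_cols = df.shape
print(df.shape)  # (5, 3)
```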

Dataframe column names:
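
Column names are available through the `columns` attribute. Converting it to a list, as in this sketch, gives plain Python strings:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

# df.columns is an Index object; list() turns it into a regular list
column_names = list(df.columns)
print(column_names)  # ['id', 'gender', 'age']
```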

Dataframe description:
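
Summary statistics can be produced with `describe()`. By default it covers only the numeric columns, so in this sketch `gender` is left out:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

# Count, mean, standard deviation, min, quartiles and max per numeric column
summary = df.describe()
print(summary)

# Individual statistics can be looked up by row label and column name
mean_age = summary.loc['mean', 'age']
```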

Accessing DataFrames

Get "age" column (different ways to call the column)
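
A single column can be selected in several equivalent ways. The sketch below shows four common ones on the example data; each returns the same Series:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

age_brackets = df['age']       # by column name in square brackets
age_attribute = df.age         # as an attribute (only for valid Python names)
age_loc = df.loc[:, 'age']     # label-based: all rows, column 'age'
age_iloc = df.iloc[:, 2]       # position-based: all rows, third column
print(age_brackets)
```

The bracket form is the safest default, since attribute access fails for column names that contain spaces or clash with existing DataFrame attributes.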

Get row
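
A row can be fetched by index label with `loc` or by position with `iloc`. With the default 0-based index used here the two coincide, as this sketch shows:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

row_by_label = df.loc[0]      # row whose index label is 0
row_by_position = df.iloc[0]  # first row, regardless of its label
print(row_by_label)
```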

Get element
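
A single cell is addressed by giving both a row and a column. The following sketch shows label-based and position-based variants:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

age_loc = df.loc[0, 'age']   # row label 0, column 'age'
age_iloc = df.iloc[0, 2]     # first row, third column
age_at = df.at[0, 'age']     # like loc, but for a single scalar
age_iat = df.iat[0, 2]       # like iloc, but for a single scalar
print(age_loc)  # 68
```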

Get subset (specific rows and all columns)
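
Row ranges can be sliced with `iloc` or `loc`. One detail worth illustrating is that `iloc` slices exclude the end position while `loc` slices include the end label:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

rows_by_position = df.iloc[1:3]  # positions 1 and 2 (end excluded)
rows_by_label = df.loc[1:3]      # labels 1, 2 and 3 (end included)
print(rows_by_position)
```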

Get subset (all rows and specific columns)
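
To keep all rows but only some columns, a list of column names can be passed. A short sketch:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

# Note the double brackets: the inner list holds the column names
id_and_age = df[['id', 'age']]

# Equivalent label-based form: all rows, listed columns
id_and_age_loc = df.loc[:, ['id', 'age']]
print(id_and_age)
```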

Get subset (all rows meeting specified criteria - numbers)
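
Rows can be filtered with a boolean condition on a numeric column. The threshold of 50 in this sketch is an arbitrary value chosen for illustration:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

# df['age'] > 50 yields a True/False Series; indexing with it keeps the True rows
over_50 = df[df['age'] > 50]
print(over_50)
```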

Get subset (all rows meeting specified criteria - strings)
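
The same pattern works for string columns, using equality or `isin()` for a set of allowed values. A minimal sketch:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

# Exact match on a single string value
females = df[df['gender'] == "F"]

# Match against several allowed values at once
males = df[df['gender'].isin(["M"])]
print(females)
```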

Get subset (all rows meeting specified criteria)
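
Conditions can be combined with `&` (and) and `|` (or). Each condition needs its own parentheses. The cut-offs below are arbitrary illustration values:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

# Females older than 40
females_over_40 = df[(df['gender'] == "F") & (df['age'] > 40)]

# Anyone who is male or younger than 40
male_or_under_40 = df[(df['gender'] == "M") | (df['age'] < 40)]
print(females_over_40)
```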

Add Column

New columns with specified values
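
New columns are created by assigning to a column name that does not exist yet. In this sketch the column names `enrolled` and `site`, and their values, are hypothetical and only serve as illustration:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

# A single value is repeated for every row (hypothetical column)
df['enrolled'] = True

# A list supplies one value per row and must match the number of rows
# (hypothetical column and values)
df['site'] = ["A", "B", "A", "C", "B"]
print(df)
```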

New column with calculated value
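
A new column can also be computed from existing ones, with the arithmetic applied to every row at once. The column names `age_in_months` and `is_65_or_older` below are hypothetical:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

# Element-wise arithmetic on a whole column (hypothetical column name)
df['age_in_months'] = df['age'] * 12

# Comparisons work the same way and give a True/False column
df['is_65_or_older'] = df['age'] >= 65
print(df)
```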

Get counts/frequency
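
Counts per category are commonly obtained with `value_counts()`. Passing `normalize=True` gives proportions instead of counts. A sketch on the `gender` column:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

# Number of rows per distinct value, most frequent first
counts = df['gender'].value_counts()

# Share of rows per distinct value (sums to 1)
shares = df['gender'].value_counts(normalize=True)
print(counts)
```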

Transform DataFrame

sort
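
Sorting is typically done with `sort_values()`, which returns a sorted copy and leaves the original unchanged. This sketch sorts the example data by one column and then by two:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

by_age = df.sort_values('age')                         # youngest first
by_age_desc = df.sort_values('age', ascending=False)   # oldest first
by_gender_then_age = df.sort_values(['gender', 'age']) # gender, then age within gender
print(by_age)
```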

stack (reshape from wide to long format)
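
`stack()` moves the column labels into the row index, giving one row per (id, variable) pair. Setting `id` as the index first, as done in this sketch, is an assumption about how the long format should be keyed:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

# Use 'id' as the row index so each stacked value stays tied to its id
wide = df.set_index('id')

# Result is a Series with a two-level index: (id, original column name)
long = wide.stack()
print(long)

# A single value is looked up with an (id, column name) pair
age_of_id_1 = long.loc[(1, 'age')]
```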

unstack (reshape from long to wide format)
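
`unstack()` reverses the operation by turning the innermost index level back into columns. The sketch first builds a long-format Series with `stack()`, keyed by `id` as an assumption, and then widens it again:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

# Long format: one value per (id, column name) pair
long = df.set_index('id').stack()

# Back to wide format: one row per id, one column per variable
wide_again = long.unstack()
print(wide_again)
```

Because the stacked values mix strings and numbers, the columns of the restored data frame may come back with a generic object type, and the column order may differ from the original.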

Traversing DataFrame (for loops)
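
Two common ways to loop over rows are `iterrows()` and `itertuples()`. The sketch below collects values into lists (the list names are hypothetical) to show what each loop yields:

```python
import pandas as pd

data = {'id': [1, 2, 3, 4, 5],
        'gender': ["F", "M", "F", "M", "F"],
        'age': [68, 54, 49, 28, 36]}
df = pd.DataFrame(data)

# iterrows() yields (index label, row as a Series) pairs
ages_by_index = []
for index, row in df.iterrows():
    ages_by_index.append((index, row['age']))

# itertuples() yields one named tuple per row; fields are read as attributes
genders = []
for row in df.itertuples(index=False):
    genders.append(row.gender)

# Looping over the data frame itself yields the column names
columns_seen = [column for column in df]
print(ages_by_index)
```

For calculations on whole columns, the vectorized operations shown under Add Column are usually faster than an explicit loop.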

Exercises

  • Analyzing Health Datasets with Pandas in Python (forthcoming)
