Pandas

Pandas is one of the most widely used Python libraries for data analysis and manipulation. Built on top of NumPy, it provides fast and flexible data structures — primarily DataFrame and Series — that make working with structured data intuitive and efficient. This post serves as a quick reference for common operations.

Loading Data

Pandas supports reading data from a variety of file formats, including CSV, Excel, and JSON:

import pandas as pd

# Load from CSV
df = pd.read_csv('pokemon_data.csv')

# Load from Excel
excel_data = pd.read_excel('pokemon_data.xlsx')

# Load from JSON
json_data = pd.read_json('pokemon_data.json')

# Preview the data
print(df)

You can also preview just the first or last few rows, which is useful for quickly inspecting the structure:

df.head()     # First 5 rows (the default)
df.head(30)   # First 30 rows
df.tail(30)   # Last 30 rows

To get a quick overview of the DataFrame’s structure, including column types and non-null counts:

df.info()
df.shape      # Returns (rows, columns)
df.dtypes     # Data type of each column

Handling Large Datasets

For large files that don’t fit in memory, you can read the data in chunks and process each one individually:

for chunk in pd.read_csv('pokemon_data.csv', chunksize=5):
	# Process each chunk
	pass
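As a concrete sketch of what "process each chunk" might look like, here is one way to accumulate per-category counts across chunks. The tiny in-memory CSV below is a made-up stand-in for a large file on disk; its columns mirror the Pokémon data used throughout this post:

```python
import io
import pandas as pd

# A small in-memory CSV stands in for a large file on disk
raw = io.StringIO(
    "Name,Type 1,HP\n"
    "Bulbasaur,Grass,45\n"
    "Charmander,Fire,39\n"
    "Squirtle,Water,44\n"
    "Ivysaur,Grass,60\n"
)

# Accumulate per-type counts one chunk at a time
counts = pd.Series(dtype=int)
for chunk in pd.read_csv(raw, chunksize=2):
    counts = counts.add(chunk['Type 1'].value_counts(), fill_value=0)

counts = counts.astype(int)  # Grass: 2, Fire: 1, Water: 1
```

Because each chunk is an ordinary DataFrame, any of the operations below can be applied inside the loop and combined afterwards.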

Reading Data

Once your data is loaded into a DataFrame, you’ll need ways to access specific portions of it. Pandas offers two main indexing mechanisms: iloc for integer-based positional access (like array indices), and loc for label-based access using the actual index values and column names. Understanding the difference between these two is essential for avoiding off-by-one errors and unexpected results.

# List all column names
df.columns

# Select specific columns by name
df[['Name', 'Type 1']]

# Select specific rows by positional index
df.iloc[2:30]

# Access a single value by row and column position
df.iloc[4, 2]

# Select rows by label-based index
df.loc[0:5, 'Name']
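The practical difference between the two: an iloc slice excludes the end position (like a Python list), while a loc slice includes the end label. A tiny made-up DataFrame makes this visible:

```python
import pandas as pd

demo = pd.DataFrame({'Name': ['Bulbasaur', 'Ivysaur', 'Venusaur', 'Charmander']})

# iloc slices like a Python list: the end position is excluded
first_two = demo.iloc[0:2]   # 2 rows (positions 0 and 1)

# loc slices by label: the end label is included
first_three = demo.loc[0:2]  # 3 rows (labels 0, 1 and 2)
```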

Iterating Over Rows

Sometimes you need to process rows one at a time. The iterrows() method returns each row as an index-value pair, making it easy to loop through a DataFrame. However, it’s important to know that iterrows() is slow for large datasets because it converts each row into a Series object. Whenever possible, prefer vectorized operations or apply(), which run significantly faster by leveraging NumPy’s optimized C-based internals under the hood.

for idx, row in df.iterrows():
	print(idx, row['Name'])

Better alternatives for performance-sensitive code:

# Using apply for row-wise operations
df['name_length'] = df['Name'].apply(len)

# Vectorized operations are the fastest approach
df['Total'] = df['HP'] + df['Attack'] + df['Defense']
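To confirm the two approaches agree, here is a small sketch with made-up stat values, where the looped and vectorized totals come out identical:

```python
import pandas as pd

# Made-up stat columns mirroring the examples above
mini = pd.DataFrame({'HP': [45, 39, 44],
                     'Attack': [49, 52, 48],
                     'Defense': [49, 43, 65]})

# Row-by-row loop (slow)
loop_total = [row['HP'] + row['Attack'] + row['Defense']
              for _, row in mini.iterrows()]

# Vectorized equivalent (fast), producing the same numbers
mini['Total'] = mini['HP'] + mini['Attack'] + mini['Defense']
```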

Querying

Before diving into filtering and transformation, it’s often useful to get a high-level understanding of your data. Pandas provides several methods for quick exploratory analysis. describe() gives you summary statistics (mean, std, min, max, quartiles) for all numeric columns. unique() and value_counts() help you understand the distribution of categorical data.

# Get summary statistics for all numeric columns
df.describe()

# Get unique values in a column
df['Type 1'].unique()

# Count unique values
df['Type 1'].nunique()

# Value counts — shows how many times each value appears
df['Type 1'].value_counts()

Filtering

Filtering is one of the most frequently used operations in data analysis. It lets you select a subset of rows based on conditions. Pandas uses boolean indexing: you write a condition that produces a True/False mask, and only the rows where the condition is True are returned. You can combine multiple conditions using & (AND), | (OR), and ~ (NOT). Note that each condition must be wrapped in parentheses when combining them.

# Filter rows by column value
df.loc[df['Type 1'] == 'Fire']

# Exclude rows matching a pattern (NOT)
df.loc[~df['Name'].str.contains('Mega')]

# Case-insensitive regex matching (note the import needed for the flag)
import re
df.loc[df['Type 1'].str.contains('fire|grass', flags=re.I, regex=True)]

# Combine conditions (AND)
new_df = df.loc[(df['Type 1'] == 'Fire') & (df['Type 2'] != 'Fire')]
new_df = new_df.reset_index(drop=True)  # drop=True discards the old index

# Combine conditions (OR)
df.loc[(df['Type 1'] == 'Fire') | (df['Type 1'] == 'Water')]

# Filter using isin — cleaner than chaining multiple OR conditions
df.loc[df['Type 1'].isin(['Fire', 'Water', 'Grass'])]

Sorting

Sorting allows you to reorder rows based on one or more columns. By default, sort_values() sorts in ascending order. You can pass a list of columns and a corresponding list of ascending/descending flags to sort by multiple criteria — for example, sorting by name alphabetically and then by HP in descending order within each name group.

# Sort by a single column
df.sort_values('Name')

# Sort by multiple columns with different directions
df.sort_values(['Name', 'HP'], ascending=[True, False])

# Sort by the DataFrame's index
df.sort_index()
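As a quick sanity check on the multi-column form, here is a tiny made-up example where a duplicated Name shows the tie-break in action:

```python
import pandas as pd

# Made-up rows with a duplicated Name to show the tie-break
dupes = pd.DataFrame({'Name': ['Mew', 'Abra', 'Mew'],
                      'HP': [100, 25, 50]})

# Alphabetical by Name, then HP descending within each name
ordered = dupes.sort_values(['Name', 'HP'], ascending=[True, False])
```

Abra comes first alphabetically, and the two Mew rows are ordered by HP from highest to lowest.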

Aggregate Statistics

Grouping and aggregating is how you summarize data across categories. The groupby() method splits the DataFrame into groups based on one or more columns, then applies an aggregate function (like mean, sum, or count) to each group. This is conceptually similar to SQL’s GROUP BY clause. You can apply multiple aggregations at once using agg(), which accepts a dictionary mapping column names to the functions you want to compute.

# Calculate the mean of the numeric columns, grouped by type
# (numeric_only=True avoids errors from non-numeric columns in recent pandas)
df.groupby(['Type 1']).mean(numeric_only=True)

# Apply multiple aggregations to different columns
df.groupby(['Type 1']).agg({
    'HP': ['mean', 'max', 'min'],
    'Attack': 'mean'
})

# Count entries per group
df.groupby(['Type 1']).count()
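A common follow-up is ranking the groups by their aggregate. A sketch with made-up data, sorting mean HP from highest to lowest:

```python
import pandas as pd

poke = pd.DataFrame({'Type 1': ['Fire', 'Fire', 'Water', 'Water', 'Grass'],
                     'HP': [39, 58, 44, 59, 45]})

# Mean HP per type, ranked from highest to lowest
mean_hp = poke.groupby('Type 1')['HP'].mean().sort_values(ascending=False)
```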

Modifying Columns

DataFrames are mutable, so you can add, remove, rename, and retype columns as needed. Adding a new column is as simple as assigning a value to a new column name. When computing a column from multiple other columns, you can either list them explicitly or use iloc with sum(axis=1) to sum across a range of columns. Dropping columns removes them from the DataFrame, and rename() lets you change column names without modifying the underlying data.

# Add a new column by combining existing ones
df['Total'] = df['HP'] + df['Defense'] + df['Attack']
df['Total'] = df.iloc[:, 4:10].sum(axis=1)

# Delete a column
df = df.drop(columns=['Total'])

# Rename columns
df = df.rename(columns={'Type 1': 'Primary_Type', 'Type 2': 'Secondary_Type'})

# Change a column's data type
df['HP'] = df['HP'].astype(float)

Modifying Data

Beyond structural changes to columns, you’ll often need to clean and transform the data itself. This includes updating values based on conditions, replacing specific values, handling missing data, and removing duplicates. Missing values (represented as NaN in Pandas) are a common issue in real-world datasets. fillna() lets you replace them with a default value, while dropna() removes any row that contains at least one missing value.

# Update values conditionally
df.loc[df['Type 1'] == 'Fire', 'Type 1'] = 'Flamer'

# Replace specific values across a column
df['Type 1'] = df['Type 1'].replace('Fire', 'Flamer')

# Fill missing values with a default
df['Type 2'] = df['Type 2'].fillna('None')

# Drop rows that contain any missing values
df = df.dropna()

# Remove duplicate rows
df = df.drop_duplicates()

Merging and Joining

When your data is spread across multiple DataFrames, you’ll need to combine them. Pandas provides merge() for SQL-style joins based on shared columns, and concat() for stacking DataFrames together. The how parameter in merge() controls the join type: inner keeps only matching rows, outer keeps all rows from both sides, and left/right keep all rows from one side while matching from the other.

# Merge two DataFrames on a common column
merged = pd.merge(df1, df2, on='Name', how='inner')  # inner, outer, left, right

# Concatenate DataFrames vertically (stack rows)
combined = pd.concat([df1, df2], ignore_index=True)

# Concatenate horizontally (add columns side by side)
combined = pd.concat([df1, df2], axis=1)
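To make the join types concrete, here is a sketch with two tiny made-up DataFrames that share only one Name:

```python
import pandas as pd

stats = pd.DataFrame({'Name': ['Bulbasaur', 'Charmander'], 'HP': [45, 39]})
types = pd.DataFrame({'Name': ['Bulbasaur', 'Squirtle'], 'Type 1': ['Grass', 'Water']})

# inner: only the shared Name survives
inner = pd.merge(stats, types, on='Name', how='inner')

# outer: all three Names survive; unmatched cells become NaN
outer = pd.merge(stats, types, on='Name', how='outer')
```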

Pivot Tables

Pivot tables let you reshape your data by turning unique values from one column into new columns, while aggregating the values from another column. This is useful for creating summary views — for example, seeing the average HP for every combination of primary and secondary type. If you’re familiar with Excel pivot tables, the concept is the same.

# Create a pivot table showing average HP by type combination
pivot = df.pivot_table(
    values='HP',
    index='Type 1',
    columns='Type 2',
    aggfunc='mean'
)
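A runnable sketch with made-up data; the optional fill_value argument replaces type combinations that never occur with 0 instead of NaN:

```python
import pandas as pd

tbl = pd.DataFrame({'Type 1': ['Fire', 'Fire', 'Grass'],
                    'Type 2': ['Flying', 'Dragon', 'Poison'],
                    'HP': [78, 78, 45]})

# fill_value replaces missing combinations with 0 instead of NaN
pivot = tbl.pivot_table(values='HP', index='Type 1',
                        columns='Type 2', aggfunc='mean', fill_value=0)
```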

Saving Data

Once you’ve finished processing your data, you can export it back to a file. Pandas supports writing to all the same formats it can read. Passing index=False prevents Pandas from writing the DataFrame’s index as an extra column in the output file, which is usually what you want.

# Save to CSV (without the index column)
df.to_csv('output.csv', index=False)

# Save to Excel
df.to_excel('output.xlsx', index=False)

# Save to JSON
df.to_json('output.json')
