Pandas

Load data from CSV

import pandas as pd

# Load CSV data
df = pd.read_csv('pokemon_data.csv')
# You can also load an Excel file (.xlsx files need an engine such as openpyxl installed)
excel_data = pd.read_excel('pokemon_data.xlsx')

# Print all data
print(df)

You can also print only the first or last rows

df.head(30)
df.tail(30)

Dealing with big data

# Read the file 5 rows at a time, so it never has to fit in memory all at once
for chunk in pd.read_csv('pokemon_data.csv', chunksize=5):
	# Do the task on each chunk
	pass
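
As a rough sketch of what "the task" might look like, here is one way to compute a mean over chunks without loading everything at once. The inline CSV text and the names `csv_text`, `total_rows` and `hp_sum` are made up for the example; with a real file you would pass the path instead of the `StringIO` object.

```python
import io
import pandas as pd

# Hypothetical in-memory CSV standing in for 'pokemon_data.csv'
csv_text = """Name,Type 1,HP
Bulbasaur,Grass,45
Charmander,Fire,39
Squirtle,Water,44
Vulpix,Fire,38
Oddish,Grass,45
"""

total_rows = 0
hp_sum = 0

# chunksize=2 yields DataFrames of at most two rows each
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    total_rows += len(chunk)
    hp_sum += chunk["HP"].sum()

# Combine the running totals once all chunks have been seen
mean_hp = hp_sum / total_rows
print(total_rows, mean_hp)
```

Keeping running totals like this works for sums, counts and means. Statistics such as the median need a different approach.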

Reading data

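# List the column names
df.columns

# Read specific columns
df[['Name', 'Type 1']]

# Read specific rows by position (the end index is excluded)
df.iloc[2:30]

# Read a single cell: row 4, column 2 (both counted from 0)
df.iloc[4, 2]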
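
A small sketch comparing position-based access (`iloc`) with label-based access (`loc`). The three-row frame below is invented for the example and is not the real Pokemon file.

```python
import pandas as pd

# Tiny made-up frame, only for illustration
df = pd.DataFrame(
    {
        "Name": ["Bulbasaur", "Charmander", "Squirtle"],
        "Type 1": ["Grass", "Fire", "Water"],
        "HP": [45, 39, 44],
    }
)

by_position = df.iloc[1, 2]   # row 1, column 2 (the HP column)
by_label = df.loc[1, "HP"]    # same cell, addressed by column name

first_two = df.iloc[0:2]      # positional slice: end excluded
first_three = df.loc[0:2]     # label slice: end included
```

Using `loc` with a column name tends to survive column reordering better than a hard-coded position.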

Iterating

for idx, row in df.iterrows():
	# Do something
	pass
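
`iterrows()` is fine for small frames, but it is slow on large ones. Here is a minimal sketch of the same per-row computation done three ways, on a made-up frame (the column values are illustrative):

```python
import pandas as pd

# Tiny made-up frame, only for illustration
df = pd.DataFrame(
    {
        "Name": ["Bulbasaur", "Charmander", "Squirtle"],
        "HP": [45, 39, 44],
        "Attack": [49, 52, 48],
    }
)

# 1. iterrows: each row arrives as (index, Series)
totals_loop = []
for idx, row in df.iterrows():
    totals_loop.append(row["HP"] + row["Attack"])

# 2. itertuples: each row arrives as a namedtuple, usually faster
totals_tuples = [t.HP + t.Attack for t in df.itertuples(index=False)]

# 3. Vectorised: no Python-level loop at all
totals_vec = (df["HP"] + df["Attack"]).tolist()
```

When the work can be written as column arithmetic, the vectorised form is usually the one to reach for.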

Querying

# Get summary statistics (count, mean, std, min, quartiles, max) for the numeric columns
df.describe()

Filtering

# Filter rows
df.loc[df['Type 1'] == 'Fire']

# Negate a condition with ~
df.loc[~df['Name'].str.contains('Mega')]

# Match a regular expression, ignoring case (re must be imported for re.I)
import re
df.loc[df['Type 1'].str.contains('fire|grass', flags=re.I, regex=True)]

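# Combine conditions with & (wrap each condition in parentheses)
new_df = df.loc[(df['Type 1'] == 'Fire') & (df['Type 2'] != 'Fire')]
# Renumber the rows from 0; drop=True discards the old index instead of keeping it as a column
new_df = new_df.reset_index(drop=True)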
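
To see the filters working together, here is a self-contained sketch on a four-row frame invented for the example. The mask names (`not_mega`, `fire_or_grass`, `fire_flying`) are just illustrative.

```python
import re
import pandas as pd

# Tiny made-up frame, only for illustration
df = pd.DataFrame(
    {
        "Name": ["Charizard", "Mega Charizard X", "Bulbasaur", "Moltres"],
        "Type 1": ["Fire", "Fire", "Grass", "Fire"],
        "Type 2": ["Flying", "Dragon", "Poison", "Flying"],
    }
)

# Each condition is a boolean Series, so it can be stored and reused
not_mega = ~df["Name"].str.contains("Mega")
fire_or_grass = df["Type 1"].str.contains("fire|grass", flags=re.I, regex=True)
fire_flying = (df["Type 1"] == "Fire") & (df["Type 2"] == "Flying")

# Combine the masks, then renumber the surviving rows from 0
result = df.loc[not_mega & fire_flying].reset_index(drop=True)
print(result)
```

Naming the masks first keeps long filters readable and makes each condition easy to check on its own.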

Sorting

# Sort by Name ascending, then by HP descending within the same name
df.sort_values(['Name', 'HP'], ascending=[True, False])
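
A quick sketch of a two-key sort on made-up data, where a repeated name shows the effect of the second key:

```python
import pandas as pd

# Tiny made-up frame with a repeated name, only for illustration
df = pd.DataFrame(
    {
        "Name": ["Pikachu", "Abra", "Pikachu"],
        "HP": [35, 25, 60],
    }
)

# Name ascending; ties on Name are broken by HP descending
sorted_df = df.sort_values(["Name", "HP"], ascending=[True, False])
print(sorted_df)
```

Note that `sort_values` returns a new frame. The original `df` keeps its order unless you reassign it.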

Aggregate statistics

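# Calculate the mean of each numeric column, grouped by type
# (numeric_only=True skips text columns such as Name, which recent pandas versions refuse to average)
df.groupby(['Type 1']).mean(numeric_only=True)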
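
If you only care about a few columns, or want more than one statistic, you can select columns after `groupby` and use `agg`. A minimal sketch on made-up data:

```python
import pandas as pd

# Tiny made-up frame, only for illustration
df = pd.DataFrame(
    {
        "Name": ["Charmander", "Vulpix", "Bulbasaur"],
        "Type 1": ["Fire", "Fire", "Grass"],
        "HP": [39, 38, 45],
        "Attack": [52, 41, 49],
    }
)

# Mean of selected numeric columns per type
mean_by_type = df.groupby("Type 1")[["HP", "Attack"]].mean()

# Several statistics for one column per type
hp_summary = df.groupby("Type 1")["HP"].agg(["mean", "max", "count"])

print(mean_by_type)
print(hp_summary)
```

Selecting the columns explicitly also avoids any trouble with non-numeric columns.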

Mutate columns

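# Add a column by naming the columns to sum
df['Total'] = df['HP'] + df['Defense'] + df['Attack']
# Or sum a contiguous range of columns by position (columns 4 through 9 here; check this against your file's layout)
df['Total'] = df.iloc[:, 4:10].sum(axis=1)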

# Delete a column
df = df.drop(columns=['Total'])

Mutate data

# Rename the 'Fire' type to 'Flamer' in every matching row
df.loc[df['Type 1'] == 'Fire', 'Type 1'] = 'Flamer'

Saving

# index=False leaves the row index out of the file
df.to_csv('output.csv', index=False)
df.to_excel('output.xlsx', index=False)
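
To show what the index does on save, here is a small round-trip sketch. It writes to in-memory buffers instead of real files, and the frame is made up for the example.

```python
import io
import pandas as pd

# Tiny made-up frame, only for illustration
df = pd.DataFrame({"Name": ["Bulbasaur", "Charmander"], "HP": [45, 39]})

# Default: the row index is written as an extra, unnamed first column
with_index = io.StringIO()
df.to_csv(with_index)
with_index.seek(0)
reloaded_with_index = pd.read_csv(with_index)

# index=False: only the real columns are written
without_index = io.StringIO()
df.to_csv(without_index, index=False)
without_index.seek(0)
reloaded = pd.read_csv(without_index)

print(list(reloaded_with_index.columns))
print(list(reloaded.columns))
```

If the index carries no meaning (just 0, 1, 2, ...), leaving it out keeps the saved file identical in shape to what you loaded.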

Amin Ghasvari
  • 24 Jun, 2022
