In this post we look at how to handle missing data in a pandas DataFrame, starting with measuring how much of it there is.
Find the total number of missing values in the DataFrame
missing_total = df.isnull().sum().sum()
Find the number of missing values in each column of the DataFrame
missing_per_column = df.isnull().sum()
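To make the two counts above concrete, here is a minimal sketch on a small made-up DataFrame. The column names and values are hypothetical and only serve to illustrate what `isnull().sum()` and `isnull().sum().sum()` return; your own `df` will differ.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only to illustrate the calls above.
df = pd.DataFrame({
    "age": [25, np.nan, 31, np.nan],        # two missing values
    "city": ["Oslo", "Lima", None, "Pune"],  # one missing value (None counts as missing)
    "score": [1.0, 2.0, 3.0, 4.0],           # no missing values
})

# isnull() gives a True/False mask; sum() counts the True values per column.
missing_per_column = df.isnull().sum()

# Summing the per-column counts gives the total for the whole DataFrame.
missing_total = df.isnull().sum().sum()

print(missing_per_column)
print(missing_total)
```

Note that `isnull()` treats both `NaN` and `None` as missing, so numeric and text columns are counted the same way.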
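Investigate patterns in the amount of missing data across columns by plotting a histogram of the per-column counts.
import matplotlib.pyplot as plt
plt.hist(missing_per_column, bins=15)
plt.xlabel("Missing values per column")
plt.ylabel("Number of columns")
plt.show()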
Percentage of missing values per column in the DataFrame
percentage_missing_columns = (df.isnull().sum() / len(df)) * 100
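As a rough sketch of how the percentage calculation behaves, the example below runs it on a small made-up DataFrame. The data, the `threshold` value, and the `mostly_missing` name are assumptions introduced purely for illustration. It also shows that taking the mean of the boolean mask gives the same numbers, since the mean of True/False values is the fraction that are True.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only for illustration.
df = pd.DataFrame({
    "age": [25, np.nan, 31, np.nan],
    "city": ["Oslo", "Lima", None, "Pune"],
    "score": [1.0, 2.0, 3.0, 4.0],
})

# Same formula as above: missing count divided by row count, times 100.
percentage_missing_columns = (df.isnull().sum() / len(df)) * 100

# Equivalent shortcut: the mean of a boolean mask is the fraction of True values.
same_result = df.isnull().mean() * 100

# Sort so the columns with the most missing data come first.
sorted_percentages = percentage_missing_columns.sort_values(ascending=False)

# Hypothetical cutoff: list columns with more than 40% missing values.
threshold = 40
mostly_missing = percentage_missing_columns[
    percentage_missing_columns > threshold
].index.tolist()

print(sorted_percentages)
print(mostly_missing)
```

Which cutoff makes sense depends on the dataset; the value of 40 here is arbitrary.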
