Handling missing data in a pandas DataFrame (Python)

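In this post we discuss how to handle missing data in a pandas data frame, starting with how to find and count the missing values. The snippets assume your data is already loaded into a data frame named df.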

Find the total number of missing values in the data frame

missing_total = df.isnull().sum().sum()
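
To see what the double sum() does, here is a minimal sketch on a small, made-up data frame. The column names and values are hypothetical and only serve as an illustration; in practice df would be your own data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, made up for illustration only.
df = pd.DataFrame({
    "age": [25, np.nan, 31, np.nan],
    "city": ["Oslo", "Lima", None, "Pune"],
    "score": [1.5, 2.0, 3.5, 4.0],
})

# isnull() marks every missing cell as True.
# The first sum() counts the True values per column,
# the second sum() adds those column counts into one total.
missing_total = df.isnull().sum().sum()
print(missing_total)  # 3
```

Both NaN and None count as missing here, which is why the total is 3 (two in age, one in city).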

Find the number of missing values in each column of the data frame

missing_per_column = df.isnull().sum()
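
The result is a pandas Series indexed by column name, so it can be filtered and sorted like any other Series. A small sketch, again on hypothetical sample data, that keeps only the columns with at least one missing value (the name columns_with_missing is an assumption of this example):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, made up for illustration only.
df = pd.DataFrame({
    "age": [25, np.nan, 31, np.nan],
    "city": ["Oslo", "Lima", None, "Pune"],
    "score": [1.5, 2.0, 3.5, 4.0],
})

# One count of missing values per column.
missing_per_column = df.isnull().sum()

# Keep only the columns that have missing values, largest count first.
columns_with_missing = missing_per_column[missing_per_column > 0].sort_values(ascending=False)
print(columns_with_missing)
```

On this sample, age comes first with 2 missing values, followed by city with 1; score is dropped from the result because it has none.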
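
Investigate patterns in the amount of missing data in each column

A histogram of the per-column counts shows how many columns fall into each range of missing values.

import matplotlib.pyplot as plt
plt.hist(missing_per_column, bins=15)
plt.xlabel("Number of missing values")
plt.ylabel("Number of columns")
plt.show()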

Find the percentage of missing data in each column of the data frame

percentage_missing_columns = (df.isnull().sum() / len(df)) * 100
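
The percentages make it easy to flag columns that are mostly empty. The sketch below uses hypothetical sample data and an arbitrary 40% cutoff; both the threshold and the name high_missing_columns are assumptions of this example, not a recommendation. It also shows that df.isnull().mean() * 100 gives the same percentages, since the mean of a True/False column is the fraction of True values.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, made up for illustration only.
df = pd.DataFrame({
    "age": [25, np.nan, 31, np.nan],
    "city": ["Oslo", "Lima", None, "Pune"],
    "score": [1.5, 2.0, 3.5, 4.0],
})

# Percentage of missing values per column, as in the post.
percentage_missing_columns = (df.isnull().sum() / len(df)) * 100

# Equivalent shortcut: the mean of a boolean column is the share of True values.
percentage_via_mean = df.isnull().mean() * 100

# Arbitrary cutoff chosen for this example only.
threshold = 40
high_missing_columns = percentage_missing_columns[percentage_missing_columns > threshold].index.tolist()
print(high_missing_columns)  # ['age']
```

What to do with the flagged columns (drop them, fill them, or leave them) depends on the data set and is a separate decision.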
