Handling missing values using Pandas & NumPy | Python Programming
Prerequisite: Handling missing values in the dataset (Theory)
In this tutorial, we will learn how to check for null values using pandas and NumPy, how to drop null values from a dataset, and how to fill null values with an appropriate value.
Let us consider that we have a dataset with missing values. We can create a similar dataset in pandas as given below.
We have created a dataset of 4 colors out of which one is missing.
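A minimal sketch of such a dataset is shown here. The variable name `colors` and the specific color values are assumptions chosen for illustration. The missing entry is represented with NumPy's `np.nan`.

```python
import numpy as np
import pandas as pd

# Hypothetical data: four color entries, of which the third is missing.
colors = pd.Series(["red", "green", np.nan, "blue"])

print(colors)
```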
How to check what values in the dataset are missing?
We can use the pandas.isnull() function to check which values in the dataset are missing. It returns True for each missing entry and False for every other entry.
Python's None is also treated as a missing value, alongside NumPy's NaN (np.nan).
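As a small illustration, the sketch here builds a Series containing both `np.nan` and `None` and checks it with `pandas.isnull()`. The data and the names `colors` and `mask` are made up for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one entry is np.nan and one is None.
colors = pd.Series(["red", "green", np.nan, None])

# Boolean mask: True where the value is missing, False otherwise.
mask = pd.isnull(colors)
print(mask)

# Summing the mask counts the missing entries.
print(int(mask.sum()))
```

Both `np.nan` and `None` are flagged as missing, so the count is 2 for this data.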
How to drop all the null values from the dataset?
We can remove the null values from the dataset using the dropna() method of a pandas Series or DataFrame.
By default, dropna() drops every row that contains at least one missing value.
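A short sketch of the default behaviour on a Series, using made-up color data. Note that `dropna()` returns a new object and leaves the original unchanged unless the result is assigned back.

```python
import numpy as np
import pandas as pd

# Hypothetical data: four color entries, one of them missing.
colors = pd.Series(["red", "green", np.nan, "blue"])

# dropna() returns a new Series without the missing entry.
cleaned = colors.dropna()
print(cleaned)

# The original Series still contains the missing value.
print(len(colors), len(cleaned))
```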
We can also create and work with more complex datasets, such as a DataFrame with several columns.
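One possible multi-column dataset is sketched here. The column names (`color`, `size`, `price`) and all values are assumptions for illustration only. The last row is missing in every column, and two other rows are missing in just one column.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame with missing values scattered across columns.
df = pd.DataFrame({
    "color": ["red", np.nan, "blue", np.nan],
    "size": [1.0, 2.0, np.nan, np.nan],
    "price": [10.0, 20.0, 30.0, np.nan],
})

# Per-cell missing-value mask.
print(df.isnull())

# Default dropna(): keep only rows with no missing value at all.
complete_rows = df.dropna()
print(complete_rows)
```

With this data, only the first row has no missing values, so it is the only row that survives the default `dropna()`.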
How to drop the row only if all the attributes of the row are null?
We can do this by passing the parameter how="all" to dropna(). This drops only the rows in which every attribute is null.
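A sketch of `how="all"` on a small made-up DataFrame (the column names and values are illustrative assumptions). Only the last row is null in every column.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame: the last row is missing in every column.
df = pd.DataFrame({
    "color": ["red", np.nan, "blue", np.nan],
    "size": [1.0, 2.0, np.nan, np.nan],
    "price": [10.0, 20.0, 30.0, np.nan],
})

# how="all" drops a row only when all of its values are missing.
kept = df.dropna(how="all")
print(kept)
```

Rows that are only partially missing are kept, so three of the four rows remain.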
How to fill Missing Values?
Rather than dropping the null values from the dataset, we can fill them with a particular value by using the fillna(value) method of a pandas Series or DataFrame.
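Two small sketches of `fillna()` follow, both on made-up data. The placeholder string `"unknown"` and the choice of the column mean as a fill value are assumptions for illustration. What counts as an appropriate value depends on the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical categorical data: fill the gap with a placeholder label.
colors = pd.Series(["red", "green", np.nan, "blue"])
filled_colors = colors.fillna("unknown")
print(filled_colors)

# Hypothetical numeric data: fill the gap with the mean of the known values.
sizes = pd.Series([1.0, np.nan, 3.0])
filled_sizes = sizes.fillna(sizes.mean())  # mean() skips NaN by default
print(filled_sizes)
```

Like `dropna()`, `fillna()` returns a new object, so the original Series keep their missing values unless the result is assigned back.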