To replace NaN in a dataframe, the simplest way is to use the pandas fillna() function.
You can replace NaN values in a single column, in multiple columns, or across the entire dataframe, using either numbers or strings.
df = df.fillna(0) #replacing NaN values with 0 for the entire dataframe
df["col_name"] = df["col_name"].fillna("") #replacing NaN values with "" for the column "col_name"
df[["col1","col2"]] = df[["col1","col2"]].fillna("") #replacing NaN values with "" for the columns "col1" and "col2"
When working with data, missing values can make an analyst's life difficult. Luckily, the pandas package in Python gives us an easy way to deal with them.
Let’s say I have the following DataFrame of summarized data:
animal_type gender type variable level count sum mean std min 25% 50% 75% max
0 cat female numeric age N/A 5.0 18.0 3.60 1.516575 2.0 3.00 3.0 4.00 6.0
1 cat male numeric age N/A 2.0 3.0 1.50 0.707107 1.0 1.25 1.5 1.75 2.0
2 dog female numeric age N/A 2.0 8.0 4.00 0.000000 4.0 4.00 4.0 4.00 4.0
3 dog male numeric age N/A 4.0 15.0 3.75 1.892969 1.0 3.25 4.5 5.00 5.0
4 cat female numeric weight N/A 5.0 270.0 54.00 32.093613 10.0 40.00 50.0 80.00 90.0
5 cat male numeric weight N/A 2.0 110.0 55.00 63.639610 10.0 32.50 55.0 77.50 100.0
6 dog female numeric weight N/A 2.0 100.0 50.00 42.426407 20.0 35.00 50.0 65.00 80.0
7 dog male numeric weight N/A 4.0 180.0 45.00 23.804761 20.0 27.50 45.0 62.50 70.0
8 cat female categorical state FL 2.0 NaN NaN NaN NaN NaN NaN NaN NaN
9 cat female categorical state NY 1.0 NaN NaN NaN NaN NaN NaN NaN NaN
10 cat female categorical state TX 2.0 NaN NaN NaN NaN NaN NaN NaN NaN
11 cat male categorical state CA 1.0 NaN NaN NaN NaN NaN NaN NaN NaN
12 cat male categorical state TX 1.0 NaN NaN NaN NaN NaN NaN NaN NaN
13 dog female categorical state FL 1.0 NaN NaN NaN NaN NaN NaN NaN NaN
14 dog female categorical state TX 1.0 NaN NaN NaN NaN NaN NaN NaN NaN
15 dog male categorical state CA 1.0 NaN NaN NaN NaN NaN NaN NaN NaN
16 dog male categorical state FL 1.0 NaN NaN NaN NaN NaN NaN NaN NaN
17 dog male categorical state NY 2.0 NaN NaN NaN NaN NaN NaN NaN NaN
18 cat female categorical trained yes 5.0 NaN NaN NaN NaN NaN NaN NaN NaN
19 cat male categorical trained no 2.0 NaN NaN NaN NaN NaN NaN NaN NaN
20 dog female categorical trained no 1.0 NaN NaN NaN NaN NaN NaN NaN NaN
21 dog female categorical trained yes 1.0 NaN NaN NaN NaN NaN NaN NaN NaN
22 dog male categorical trained no 4.0 NaN NaN NaN NaN NaN NaN NaN NaN
In this dataframe, we have a lot of NaN values.
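Before filling anything, it can help to count how many values are missing in each column. The following is a minimal sketch, assuming a small hypothetical dataframe named `sample` that stands in for a few rows of the table above (the name and the reduced set of columns are illustrative only):

```python
import numpy as np
import pandas as pd

# Hypothetical three-row stand-in for the summary table above
sample = pd.DataFrame({
    "variable": ["age", "state", "trained"],
    "count": [5.0, 2.0, 5.0],
    "sum": [18.0, np.nan, np.nan],
    "mean": [3.6, np.nan, np.nan],
})

# isna() marks each missing cell as True; sum() then counts them per column
nan_counts = sample.isna().sum()
print(nan_counts)
```

Columns with a count of zero need no filling, which can help you decide whether to fill one column, several, or the whole dataframe.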
To replace them, we can use the pandas fillna() function.
The fillna() function takes both numeric and string inputs. If you want to replace NaN values in just the “sum” column with 0, you can do the following:
df["sum"] = df["sum"].fillna(0) #replacing NaN values with 0 for the column "sum"
To replace the NaN values in multiple columns using pandas, you can use the Python code below:
df[["sum","mean"]] = df[["sum","mean"]].fillna(0) #replacing NaN values with 0 for the columns "sum" and "mean"
If you want to replace NaN values in the entire dataframe with 0, then you can do the following:
df = df.fillna(0) #replacing NaN values with 0 for the entire dataframe
The resulting dataframe is as follows:
df.fillna(0, inplace=True)
#output:
animal_type gender type variable level count mean sum std min 25% 50% 75% max
0 cat female numeric age N/A 5.0 3.60 18.0 1.516575 2.0 3.00 3.0 4.00 6.0
1 cat male numeric age N/A 2.0 1.50 3.0 0.707107 1.0 1.25 1.5 1.75 2.0
2 dog female numeric age N/A 2.0 4.00 8.0 0.000000 4.0 4.00 4.0 4.00 4.0
3 dog male numeric age N/A 4.0 3.75 15.0 1.892969 1.0 3.25 4.5 5.00 5.0
4 cat female numeric weight N/A 5.0 54.00 270.0 32.093613 10.0 40.00 50.0 80.00 90.0
5 cat male numeric weight N/A 2.0 55.00 110.0 63.639610 10.0 32.50 55.0 77.50 100.0
6 dog female numeric weight N/A 2.0 50.00 100.0 42.426407 20.0 35.00 50.0 65.00 80.0
7 dog male numeric weight N/A 4.0 45.00 180.0 23.804761 20.0 27.50 45.0 62.50 70.0
8 cat female categorical state FL 2.0 0.00 0.0 0.000000 0.0 0.00 0.0 0.00 0.0
9 cat female categorical state NY 1.0 0.00 0.0 0.000000 0.0 0.00 0.0 0.00 0.0
10 cat female categorical state TX 2.0 0.00 0.0 0.000000 0.0 0.00 0.0 0.00 0.0
11 cat male categorical state CA 1.0 0.00 0.0 0.000000 0.0 0.00 0.0 0.00 0.0
12 cat male categorical state TX 1.0 0.00 0.0 0.000000 0.0 0.00 0.0 0.00 0.0
13 dog female categorical state FL 1.0 0.00 0.0 0.000000 0.0 0.00 0.0 0.00 0.0
14 dog female categorical state TX 1.0 0.00 0.0 0.000000 0.0 0.00 0.0 0.00 0.0
15 dog male categorical state CA 1.0 0.00 0.0 0.000000 0.0 0.00 0.0 0.00 0.0
16 dog male categorical state FL 1.0 0.00 0.0 0.000000 0.0 0.00 0.0 0.00 0.0
17 dog male categorical state NY 2.0 0.00 0.0 0.000000 0.0 0.00 0.0 0.00 0.0
18 cat female categorical trained yes 5.0 0.00 0.0 0.000000 0.0 0.00 0.0 0.00 0.0
19 cat male categorical trained no 2.0 0.00 0.0 0.000000 0.0 0.00 0.0 0.00 0.0
20 dog female categorical trained no 1.0 0.00 0.0 0.000000 0.0 0.00 0.0 0.00 0.0
21 dog female categorical trained yes 1.0 0.00 0.0 0.000000 0.0 0.00 0.0 0.00 0.0
22 dog male categorical trained no 4.0 0.00 0.0 0.000000 0.0 0.00 0.0 0.00 0.0
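The variants above can be tried end to end on a small frame. This is a sketch under assumptions: `sample` is a hypothetical two-row stand-in for the summary table, and the variable names are illustrative. The last call is an extra that the examples above do not cover: fillna() also accepts a dictionary that maps column names to fill values, which lets each column get its own replacement.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in: one numeric row and one row of missing statistics
sample = pd.DataFrame({
    "count": [5.0, 2.0],
    "sum": [18.0, np.nan],
    "mean": [3.6, np.nan],
    "std": [1.5, np.nan],
})

# Fill a single column; the other columns keep their NaN values
one_col = sample.copy()
one_col["sum"] = one_col["sum"].fillna(0)

# Fill two columns at once
two_cols = sample.copy()
two_cols[["sum", "mean"]] = two_cols[["sum", "mean"]].fillna(0)

# Fill every column; fillna() returns a new dataframe by default
whole = sample.fillna(0)

# Use a different fill value per column; columns not listed are left alone
per_col = sample.fillna({"sum": 0, "mean": -1})

print(per_col)
```

Working on copies keeps `sample` unchanged, so each variant can be compared against the original.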
Replacing NaN Values With String Using Pandas
Many times when we are using pandas dataframes for data analysis, we have both numerical and string data.
Let’s take the same dataframe from above.
Instead of filling the NaN values with 0, we can fill the NaNs in our dataframe with a string value.
If you want to replace NaN values in just the “sum” column with “NaN replaced”, you can do the following:
df["sum"] = df["sum"].fillna("NaN replaced") #replacing NaN values with "NaN replaced" for the column "sum"
To replace the NaN values in multiple columns, you can use the Python code below:
df[["sum","mean"]] = df[["sum","mean"]].fillna("NaN replaced") #replacing NaN values with "NaN replaced" for the columns "sum" and "mean"
If you want to replace NaN values in the entire dataframe with a string, then you can do the following:
df = df.fillna("NaN replaced") #replacing NaN values with "NaN replaced" for the entire dataframe
The resulting dataframe is as follows:
df.fillna("NaN replaced", inplace=True)
#output:
animal_type gender type variable level count mean sum std min 25% 50% 75% max
0 cat female numeric age N/A 5.0 3.6 18 1.51658 2 3 3 4 6
1 cat male numeric age N/A 2.0 1.5 3 0.707107 1 1.25 1.5 1.75 2
2 dog female numeric age N/A 2.0 4 8 0 4 4 4 4 4
3 dog male numeric age N/A 4.0 3.75 15 1.89297 1 3.25 4.5 5 5
4 cat female numeric weight N/A 5.0 54 270 32.0936 10 40 50 80 90
5 cat male numeric weight N/A 2.0 55 110 63.6396 10 32.5 55 77.5 100
6 dog female numeric weight N/A 2.0 50 100 42.4264 20 35 50 65 80
7 dog male numeric weight N/A 4.0 45 180 23.8048 20 27.5 45 62.5 70
8 cat female categorical state FL 2.0 NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced
9 cat female categorical state NY 1.0 NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced
10 cat female categorical state TX 2.0 NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced
11 cat male categorical state CA 1.0 NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced
12 cat male categorical state TX 1.0 NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced
13 dog female categorical state FL 1.0 NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced
14 dog female categorical state TX 1.0 NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced
15 dog male categorical state CA 1.0 NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced
16 dog male categorical state FL 1.0 NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced
17 dog male categorical state NY 2.0 NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced
18 cat female categorical trained yes 5.0 NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced
19 cat male categorical trained no 2.0 NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced
20 dog female categorical trained no 1.0 NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced
21 dog female categorical trained yes 1.0 NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced
22 dog male categorical trained no 4.0 NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced NaN replaced
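One side effect of filling numeric columns with a string is worth seeing in code: the columns stop being numeric, which is why the numbers in the output above are displayed differently from the earlier tables. A minimal sketch, assuming a hypothetical two-row dataframe named `sample`:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in with two float columns
sample = pd.DataFrame({"sum": [18.0, np.nan], "mean": [3.6, np.nan]})
print(sample.dtypes)  # both columns start out as float64

# Filling with a string forces each affected column to hold mixed values
filled = sample.fillna("NaN replaced")
print(filled.dtypes)  # the filled columns are now object dtype

# The original numbers are still there, next to the replacement string
print(filled["sum"].tolist())
```

Because the filled columns now mix floats and strings, numeric operations on them may fail or behave unexpectedly, so a string fill is usually best kept for display or export rather than further calculation.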
Using pandas replace() to Replace NaN in Pandas Dataframe
We can also use the pandas replace() function to replace NaN values in a pandas dataframe with numeric or string values, just like with the pandas fillna() function.
To replace NaN in a single column of a pandas dataframe, pass np.nan as the value to replace, as shown below:
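import numpy as np #np.nan is the NaN value from NumPy
df["column_name"] = df["column_name"].replace(np.nan, 0) #replacing NaN values with 0 for the column "column_name"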
We can use the pandas replace() function to replace NaN in an entire pandas DataFrame as shown below:
df = df.replace(np.nan, 0)
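To check that replace() and fillna() agree for this use case, the two can be run side by side. This is a sketch under assumptions: `sample`, `via_replace`, and `via_fillna` are hypothetical names, and the data is a two-row stand-in rather than the full table.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in with NaN values in both columns
sample = pd.DataFrame({"sum": [18.0, np.nan], "mean": [3.6, np.nan]})

# Replace NaN across the whole dataframe with each approach
via_replace = sample.replace(np.nan, 0)
via_fillna = sample.fillna(0)

# equals() compares both the values and the column dtypes
print(via_replace.equals(via_fillna))

# The same works on a single column
sum_filled = sample["sum"].replace(np.nan, 0)
print(sum_filled.tolist())
```

For plain NaN replacement the two approaches give the same result here; replace() is the more general tool, since it can swap any value for another, while fillna() only targets missing values.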
Hopefully this article has helped you learn how to replace NaN values using the pandas fillna() and replace() functions in Python.