To find the smallest values in a Series or DataFrame column using pandas, the easiest way is to use the pandas nsmallest() function.
df.nsmallest(n,"column")
By default, the pandas nsmallest() function returns the n rows with the smallest values in the given column(s), sorted in ascending order.
Finding the smallest values of a column or Series using pandas is easy. We can use the pandas nsmallest() function on a DataFrame column or directly on a Series of numbers.
Let’s say we have the following DataFrame.
import pandas as pd

df = pd.DataFrame({'Name': ['Jim', 'Sally', 'Bob', 'Sue', 'Jill', 'Larry'],
                   'Weight': [160.20, 123.81, 209.45, 150.35, 102.43, 187.52]})
print(df)
# Output:
Name Weight
0 Jim 160.20
1 Sally 123.81
2 Bob 209.45
3 Sue 150.35
4 Jill 102.43
5 Larry 187.52
To get the two rows with the smallest values in the “Weight” column, we can use the pandas nsmallest() function in the following Python code:
print(df.nsmallest(2,"Weight"))
# Output:
Name Weight
4 Jill 102.43
1 Sally 123.81
Please note that the pandas nsmallest() function only works on a column or Series with numeric values. If we pass “Name” to nsmallest() in our example, we will receive a TypeError because the “Name” column is made up of strings.
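The text above mentions Series, but the example only shows the DataFrame form. Here is a minimal sketch of the Series call and of the string-column error, reusing the same example data. The variable names smallest_weights and name_error are only illustrative, and the exact wording of the error message may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jim', 'Sally', 'Bob', 'Sue', 'Jill', 'Larry'],
                   'Weight': [160.20, 123.81, 209.45, 150.35, 102.43, 187.52]})

# Calling nsmallest() on a Series takes only n (no column name) and
# returns a Series that keeps the original index labels.
smallest_weights = df['Weight'].nsmallest(2)
print(smallest_weights)

# Passing a string column to DataFrame.nsmallest() is expected to fail.
# We catch the error here so the script keeps running.
name_error = None
try:
    df.nsmallest(2, 'Name')
except TypeError as exc:
    name_error = exc

print(type(name_error).__name__)
print(name_error)
```

If you need the alphabetically first names instead, sorting the column with sort_values() and taking head(n) is one alternative that does work on strings.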
If you want to find the n largest values, you can use the pandas nlargest() function.
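As a quick illustration, nlargest() takes the same arguments as nsmallest(). The sketch below assumes the same example DataFrame as above; the variable name largest is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jim', 'Sally', 'Bob', 'Sue', 'Jill', 'Larry'],
                   'Weight': [160.20, 123.81, 209.45, 150.35, 102.43, 187.52]})

# nlargest() mirrors nsmallest(): same arguments, but rows come back
# in descending order of the chosen column.
largest = df.nlargest(2, 'Weight')
print(largest)

# The two heaviest people are Bob (index 2) and Larry (index 5).
print(list(largest['Name']))
```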
Finding the N Smallest Values in a Column using pandas
The nsmallest() function has a few different options for handling rows with the same values in your DataFrame.
Let’s say our DataFrame from above has changed a little bit and we now have some values which occur multiple times in the “Weight” column:
df = pd.DataFrame({'Name': ['Jim', 'Sally', 'Bob', 'Sue', 'Jill', 'Larry'],
'Weight': [160.20, 160.20, 209.45, 150.35, 187.52, 187.52]})
print(df)
# Output:
Name Weight
0 Jim 160.20
1 Sally 160.20
2 Bob 209.45
3 Sue 150.35
4 Jill 187.52
5 Larry 187.52
By default (keep='first'), when several rows tie for the nth smallest value, the pandas nsmallest() function returns the row that occurs first.
print(df.nsmallest(2,"Weight"))
# Output:
Name Weight
3 Sue 150.35
0 Jim 160.20
In this case, since Jim came before Sally, Jim’s row is returned.
If we want to return the last occurrence, we can pass keep='last' to nsmallest():
print(df.nsmallest(2,"Weight", keep='last'))
# Output:
Name Weight
3 Sue 150.35
1 Sally 160.20
If we want to keep every row that ties for the nth smallest value, we can pass keep='all' to nsmallest(). Note that this can return more than n rows:
print(df.nsmallest(2,"Weight", keep='all'))
# Output:
Name Weight
3 Sue 150.35
0 Jim 160.20
1 Sally 160.20
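To see the three keep options side by side, here is a small sketch that runs each one on the DataFrame with duplicate weights and records which index labels come back. The dictionary name results is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jim', 'Sally', 'Bob', 'Sue', 'Jill', 'Larry'],
                   'Weight': [160.20, 160.20, 209.45, 150.35, 187.52, 187.52]})

# Run nsmallest() once per keep option and collect the returned index labels.
results = {}
for option in ['first', 'last', 'all']:
    rows = df.nsmallest(2, 'Weight', keep=option)
    results[option] = list(rows.index)
    print(option, results[option], 'rows returned:', len(rows))

# 'first' and 'last' always return exactly n rows;
# 'all' returns 3 rows here because Jim and Sally tie for second place.
```

The row count is the thing to watch with keep='all': code that assumes exactly n rows come back may need an extra check.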
Finding the n Smallest Values over Multiple Columns in a DataFrame
We can also use the pandas nsmallest() function to find the n smallest values over multiple columns. We just need to pass multiple column names to the function.
Let’s say we have another column on the DataFrame from above:
df = pd.DataFrame({'Name': ['Jim', 'Sally', 'Bob', 'Sue', 'Jill', 'Larry'],
'Weight': [160.20, 160.20, 209.45, 150.35, 187.52, 187.52],
'Height': [50.10, 68.94, 71.42, 48.56, 59.37, 63.42] })
print(df)
# Output:
Name Weight Height
0 Jim 160.20 50.10
1 Sally 160.20 68.94
2 Bob 209.45 71.42
3 Sue 150.35 48.56
4 Jill 187.52 59.37
5 Larry 187.52 63.42
To rank the rows by “Weight” and then by “Height”, we just need to pass both column names in a list, as in the following Python code.
print(df.nsmallest(3,["Weight","Height"]))
# Output:
Name Weight Height
3 Sue 150.35 48.56
0 Jim 160.20 50.10
1 Sally 160.20 68.94
This will order the rows by the first column, then use the second column to break ties, and so on. Here, Jim and Sally have the same weight, so Jim comes first because his height is smaller.
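One way to check this ordering is to compare it against an explicit sort. The pandas documentation describes nsmallest() as equivalent to sorting by the given columns and taking the first n rows, so the sketch below does both on the same data and compares the results. The variable names via_nsmallest and via_sort are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jim', 'Sally', 'Bob', 'Sue', 'Jill', 'Larry'],
                   'Weight': [160.20, 160.20, 209.45, 150.35, 187.52, 187.52],
                   'Height': [50.10, 68.94, 71.42, 48.56, 59.37, 63.42]})

# Three smallest rows by Weight, with Height used to break ties.
via_nsmallest = df.nsmallest(3, ['Weight', 'Height'])

# The same selection written as an explicit sort followed by head(n).
via_sort = df.sort_values(['Weight', 'Height'], ascending=True).head(3)

print(via_nsmallest)
print(via_sort)

# Both approaches select Sue, Jim and Sally, in that order.
print(via_nsmallest.equals(via_sort))
```

For small DataFrames either form is fine; nsmallest() mainly states the intent more directly and avoids sorting every row when n is small.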
Hopefully this article has helped you understand how to find the smallest values in a Series or DataFrame using pandas.