To group by multiple columns and then find the count of rows in a pandas DataFrame, you can use the groupby() and count() functions.
import pandas as pd
df = pd.DataFrame({"animal_type":["dog","cat","dog","cat","dog","dog","cat","cat","dog"], "gender":["F","F","F","F","M","M","M","F","M"]})
print(df)
print(df.groupby(["animal_type","gender"])["gender"].count().rename('count').reset_index())
#Output:
animal_type gender
0 dog F
1 cat F
2 dog F
3 cat F
4 dog M
5 dog M
6 cat M
7 cat F
8 dog M
animal_type gender count
0 cat F 3
1 cat M 1
2 dog F 2
3 dog M 3
When working with data, it is very useful to be able to group and aggregate data by multiple columns to understand the various segments of our data.
One such case is if you want to group your data and get the row counts of the groups.
To get the counts by group of columns in a pandas DataFrame, you can use the groupby() and count() functions.
Below is a simple example showing you how you can group by and then count the rows in each group in a pandas DataFrame in Python.
In the example below, I’ve renamed the count of rows to count and then reset the index so that we can work with the resulting DataFrame easier.
import pandas as pd
df = pd.DataFrame({"animal_type":["dog","cat","dog","cat","dog","dog","cat","cat","dog"], "gender":["F","F","F","F","M","M","M","F","M"]})
print(df)
print(df.groupby(["animal_type","gender"])["gender"].count().rename('count').reset_index())
#Output:
animal_type gender
0 dog F
1 cat F
2 dog F
3 cat F
4 dog M
5 dog M
6 cat M
7 cat F
8 dog M
animal_type gender count
0 cat F 3
1 cat M 1
2 dog F 2
3 dog M 3
Using groupby() and count() on Single Column in pandas DataFrame
You can use groupby() to group a pandas DataFrame by one column or multiple columns.
If you want to group a pandas DataFrame by one column and then the counts of each value in that group with count(), you can do the following.
import pandas as pd
df = pd.DataFrame({"animal_type":["dog","cat","dog","cat","dog","dog","cat","cat","dog"], "gender":["F","F","F","F","M","M","M","F","M"]})
print(df)
print(df.groupby(["animal_type"])["animal_type"].count().rename('count').reset_index())
#Output:
animal_type gender
0 dog F
1 cat F
2 dog F
3 cat F
4 dog M
5 dog M
6 cat M
7 cat F
8 dog M
animal_type count
0 cat 4
1 dog 5
Using groupby() to Group By Multiple Columns and count() in pandas DataFrame
If you want to group a pandas DataFrame by multiple columns and then the counts of each value in each group with count(), you can do the following.
import pandas as pd
df = pd.DataFrame({"animal_type":["dog","cat","dog","cat","dog","dog","cat","cat","dog"], "gender":["F","F","F","F","M","M","M","F","M"]})
print(df)
print(df.groupby(["animal_type","gender"])["gender"].count().rename('count').reset_index())
#Output:
animal_type gender
0 dog F
1 cat F
2 dog F
3 cat F
4 dog M
5 dog M
6 cat M
7 cat F
8 dog M
animal_type gender count
0 cat F 3
1 cat M 1
2 dog F 2
3 dog M 3
Hopefully this article has been useful for you to learn how to group by and count in pandas with groupby() and count().