Pandas, the Python data manipulation library, is a powerful tool for working with data. Among its many features, the iloc indexer is a valuable asset when it comes to selecting specific rows and columns from a DataFrame. In this blog post, we’ll dive into the world of Pandas iloc, understand its syntax, and explore practical examples of how to use it for data selection.
What is iloc?
iloc is a Pandas indexer used for integer-location based indexing. It allows you to select rows and columns from a DataFrame by their integer positions, similar to how you would use row and column indices in a NumPy array.
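To make "integer position" concrete, here is a minimal sketch contrasting iloc with the label-based loc indexer. The DataFrame and its index labels (10, 20, 30) are hypothetical values chosen only for illustration.

```python
import pandas as pd

# Hypothetical frame whose index labels do not match the row positions.
df = pd.DataFrame({"A": [1, 2, 3]}, index=[10, 20, 30])

by_position = df.iloc[0]  # the first row, whatever its label is
by_label = df.loc[10]     # the row whose index label is 10

print(by_position["A"])  # 1
print(by_label["A"])     # 1
```

Both lines reach the same row here, but only because label 10 happens to sit at position 0. iloc never looks at the labels, so df.iloc[10] would fail on this three-row frame.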
The basic syntax for using iloc is as follows:
df.iloc[row_indices, column_indices]
row_indices can be:
- A single integer for a specific row.
- A list of integers for multiple rows.
- A slice object for a range of rows.
- A boolean array (a list or NumPy array of True/False values, not a boolean Series) for conditional selection.
column_indices follows the same principles as row_indices but applies to columns. If column_indices is omitted, all columns are returned.
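The selector types listed above can be sketched side by side. This is only an illustration on a small made-up DataFrame; the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})

single = df.iloc[0]                               # a single integer
several = df.iloc[[0, 2]]                         # a list of integers
sliced = df.iloc[0:2]                             # a slice (positions 0 and 1)
masked = df.iloc[np.array([True, False, True])]  # a boolean array

# The same selector types work for columns; ":" keeps every row.
columns_only = df.iloc[:, [0, 2]]
```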
Practical Examples
Let’s explore some practical examples of how to use iloc for data selection.
Example 1: Selecting a Single Row
To select a single row by its integer position, you can use a single integer for row_indices. For example:

    import pandas as pd

    data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
    df = pd.DataFrame(data)

    # Select the second row
    selected_row = df.iloc[1]
    print(selected_row)

In this example, we select the second row of the DataFrame using iloc[1].
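One detail worth checking when selecting a single row is the type of the result. The sketch below, using the same example data, shows that a bare integer yields a Series while a one-element list keeps the result as a DataFrame.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

as_series = df.iloc[1]    # single integer: the row comes back as a Series
as_frame = df.iloc[[1]]   # one-element list: the row stays a DataFrame

print(type(as_series).__name__)  # Series
print(as_frame.shape)            # (1, 3)
```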
Example 2: Selecting Multiple Rows
To select multiple rows, you can pass a list of integers to row_indices. For instance:

    import pandas as pd

    data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
    df = pd.DataFrame(data)

    # Select the first and third rows
    selected_rows = df.iloc[[0, 2]]
    print(selected_rows)

Here, we select the first and third rows by passing a list [0, 2] to iloc.
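A list of positions does not have to be sorted, and negative integers count from the end, as in ordinary Python indexing. A short sketch on the same data:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

reordered = df.iloc[[2, 0]]  # rows come back in the order requested
last = df.iloc[[-1]]         # -1 refers to the final row

print(reordered.index.tolist())  # [2, 0]
print(last.index.tolist())       # [2]
```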
Example 3: Selecting Specific Rows and Columns
You can also select specific rows and columns simultaneously. Use a list of integers for row_indices and a list of integers for column_indices. For example:

    import pandas as pd

    data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
    df = pd.DataFrame(data)

    # Select the first and third rows and the second column
    selected_data = df.iloc[[0, 2], [1]]
    print(selected_data)

In this case, we select the first and third rows and the second column, resulting in a DataFrame with two rows and one column (the values 4 and 6 from column B).
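The shape of the result depends on whether each selector is a list or a plain integer. The following sketch, on the same data, compares the three combinations.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

frame = df.iloc[[0, 2], [1]]  # list + list: DataFrame with 2 rows, 1 column
column = df.iloc[[0, 2], 1]   # list + integer: Series of the selected values
cell = df.iloc[0, 1]          # integer + integer: a single scalar value

print(frame.shape)      # (2, 1)
print(column.tolist())  # [4, 6]
print(cell)             # 4
```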
Example 4: Using Slices
You can use slice objects for more extensive selections. To select a range of rows and columns, you can use slices for both row_indices and column_indices. Here’s an example:

    import pandas as pd

    data = {'A': [1, 2, 3, 4, 5], 'B': [6, 7, 8, 9, 10], 'C': [11, 12, 13, 14, 15]}
    df = pd.DataFrame(data)

    # Select rows 1 to 3 and columns 0 to 1
    selected_data = df.iloc[1:4, 0:2]
    print(selected_data)

In this example, we use slice objects to select the rows at positions 1 to 3 and the columns at positions 0 to 1. As with ordinary Python slices, the end position is excluded, so 1:4 stops before position 4 and 0:2 stops before position 2.
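Slices are a common source of off-by-one surprises because iloc and the label-based loc indexer treat the end of a slice differently. A small sketch, assuming the default integer index used in the example above:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [6, 7, 8, 9, 10]})

by_position = df.iloc[1:4]  # positions 1, 2, 3 (end position excluded)
by_label = df.loc[1:4]      # labels 1, 2, 3, 4 (end label included)

print(len(by_position))  # 3
print(len(by_label))     # 4
```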
Example 5: Conditional Selection
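You can use boolean arrays to conditionally select rows and columns. Note that iloc accepts a boolean list or NumPy array but raises an error when given a boolean Series, so the mask is converted with to_numpy() first. For instance:

    import pandas as pd

    data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
    df = pd.DataFrame(data)

    # Select rows where column 'A' is greater than 1.
    # The comparison yields a boolean Series; convert it to a NumPy array for iloc.
    selected_rows = df.iloc[(df['A'] > 1).to_numpy()]
    print(selected_rows)

Here, we conditionally select rows where the values in column ‘A’ are greater than 1.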
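A boolean row mask can also be combined with integer column positions. The sketch below assumes the mask is turned into a NumPy array first, since iloc does not accept a boolean Series, and compares the result with the label-based loc indexer, which takes the Series directly.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

mask = df['A'] > 1  # boolean Series: False, True, True

# Rows where the mask is True, restricted to the first and third columns.
subset = df.iloc[mask.to_numpy(), [0, 2]]

# loc accepts the boolean Series as-is and selects the same rows.
via_iloc = df.iloc[mask.to_numpy()]
via_loc = df.loc[mask]

print(subset.shape)              # (2, 2)
print(via_iloc.equals(via_loc))  # True
```

For purely conditional row selection, loc is often the shorter option. iloc becomes useful when the mask has to be paired with column positions rather than column names.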
Conclusion
Pandas iloc is a powerful tool for selecting specific rows and columns in a DataFrame based on their integer positions. It offers various ways to tailor your data selection to your needs, whether you want to access a single row, multiple rows, specific columns, or apply conditional selection. By mastering the use of iloc, you can efficiently work with your data and perform advanced data analysis tasks with ease.