The SAS automatic variable _n_ represents the number of times the data step has iterated.
As an automatic variable, _n_ is created automatically by SAS when a data step is performed. _n_ is temporary, meaning it is not kept on the dataset after the data step is finished.
When a data step starts, _n_ is initialized to 1. Then, after each iteration through the data step, _n_ is incremented by one.
To see this, let’s say we have the following dataset of 6 observations.
data data;
input num;
datalines;
4
16
5
-2
3
-10
;
run;
Let’s write _n_ to the log with a put statement as the data step reads our dataset. The log output appears below the code.
data data;
set data;
put _n_;
run;
/* Output */
1
2
3
4
5
6
As you can see, _n_ starts at 1 and is incremented for each observation.
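Note that _n_ counts iterations of the data step, not positions in the source dataset. As a small sketch, assuming the 6-observation dataset created above, the step below reads only the positive values through a where= dataset option and writes to _null_, so no dataset is saved.

```sas
/* Read only the rows where num is positive and log the iteration counter. */
data _null_;
    set data(where=(num > 0));
    put _n_= num=;
run;
```

The log should show _n_ running from 1 to 4 for the values 4, 16, 5 and 3, even though 3 sits in row 5 of the original dataset. The two numbers agree only when every observation is read in order.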
Adding the Row Number as a Column in SAS with _n_
When working with data in SAS, the ability to easily get information about rows or columns is valuable.
One piece of information which can be valuable to know is the number of the row you are currently operating on.
To get the row number in a SAS data step, you can use the SAS automatic variable _n_.
Using the same dataset as above, the following data step stores the row number of each record in a new column, row_num.
data data_with_row_number;
set data;
row_num = _n_;
run;
The resulting dataset is as follows.
Obs  num  row_num
1      4        1
2     16        2
3      5        3
4     -2        4
5      3        5
6    -10        6
Useful Applications of _n_ in SAS Data Steps
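The _n_ variable is useful whenever the logic depends on which iteration the data step is on.
For example, to perform a calculation on one particular record, use an if statement that checks whether _n_ equals the observation number of that record. This works when the step reads a single dataset from top to bottom, because only then does the iteration count match the observation number.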
data want;
set have;
if _n_ = 3 then do;
/* do stuff here */
end;
run;
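A similar check tells you whether you are on the first or last observation of a dataset. The first observation is simply _n_ = 1. For the last observation, _n_ alone is not enough, because the data step does not know in advance how many iterations there will be. Instead, add the end= option to the set statement. It creates a temporary variable (named last here) that is 1 on the final observation and 0 otherwise.
data want;
set have end=last;
if _n_ = 1 then do;
/* do stuff related to being on first observation */
end;
if last then do;
/* do stuff related to being on last observation */
end;
run;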
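To make the first/last check concrete, here is a minimal sketch that runs against the 6-observation dataset from the start of this article. It uses two options of the set statement: end=, which flags the final observation, and nobs=, which holds the total number of observations. The variable names is_last and total are chosen for illustration only.

```sas
/* Log a message on the first and last observations of the dataset. */
data _null_;
    set data nobs=total end=is_last;
    if _n_ = 1 then put "First observation: " num=;
    if is_last then put "Last observation: " num= "Total rows: " total;
run;
```

Because this step reads every row in order, _n_ equals total on the final iteration, so if _n_ = total would also identify the last observation. The end= flag is the safer choice, since it does not depend on _n_ matching the observation number.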
Hopefully this article has been useful for you to learn how to use _n_ in your SAS data steps.