When filtering a SAS dataset, you can filter by multiple values with the in operator in a where statement.
data want;
set have;
where variable_a in (1, 2, 3);
run;
When working in SAS, the ability to easily be able to create complex filters and get the subsets we desire is valuable.
One such filter which we can create in SAS is filtering a dataset by multiple values.
By using the SAS in operator combined with a where statement, you can subset data based on multiple values.
Let’s say we have following data set which we create with the following data step:
data have;
input animal_type $ gender $ weight age state $ trained $;
datalines;
cat male 10 1 CA no
dog male 20 4 FL no
dog male 30 5 NY no
cat female 40 3 FL yes
cat female 10 2 NY yes
dog female 20 4 TX yes
cat female 50 6 TX yes
dog male 60 1 CA no
dog male 70 5 NY no
cat female 80 4 FL yes
cat female 90 3 TX yes
cat male 100 2 TX no
dog female 80 4 FL no
;
run;
Let’s say that we only want to keep records where the state is Texas (“TX”) or Florida (“FL”).
Below is a example of how to use in in a where statement in SAS.
data want;
set have;
where state in ("TX", "FL");
run;
The resulting dataset is below.
Using Not In to Filter a Dataset in SAS
If you want to create a subset of data based on a variable which has values not in a list of values, we can use the SAS not in operator.
Let’s say we have the same data as above.
Let’s say that we only want to keep records where the state is not Texas (“TX”) or Florida (“FL”).
Below is a example of how to use in in a where statement in SAS.
data want;
set have;
where state not in ("TX", "FL");
run;
The resulting dataset is below.
Hopefully this article has been useful for you to learn how to use the in operator in a where statement in SAS.