In SAS, we can check if a variable contains a specific string with the contains operator in a where statement.
data want;
set have;
where variable contains "something";
run;
When working in SAS, being able to build filters and pull out exactly the subset we want is valuable.
One such filter keeps only the records where a variable contains a specific string.
We can check whether a variable contains a string in a where statement with the SAS contains operator.
Let’s say we have the following data set, created with this data step:
data have;
input animal_type $ gender $ weight age state $ trained $;
datalines;
cat male 10 1 CA no
dog male 20 4 FL no
dog male 30 5 NY no
cat female 40 3 FL yes
cat female 10 2 NY yes
dog female 20 4 TX yes
cat female 50 6 TX yes
dog male 60 1 CA no
dog male 70 5 NY no
cat female 80 4 FL yes
cat female 90 3 TX yes
cat male 100 2 MN no
dog female 80 4 MN no
;
run;
If we want to get a subset of all of the records where the state contains “N”, we can do that with a where statement, as in the following code.
data want;
set have;
where state contains "N";
run;
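The resulting dataset contains the five records whose state is NY or MN, since those are the only state values in the data that contain “N”.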
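As a quick check, the subset can be printed with proc print. The sketch below does that and also shows two variants of the same filter: the ? shorthand for contains, and upcase() to make the match case-insensitive, since contains is case-sensitive. The data set names want_short and want_nocase are only illustrative and are not used elsewhere in this article.

```sas
/* Print the subset created above */
proc print data=want;
run;

/* ? is shorthand for contains in a where expression */
data want_short;
    set have;
    where state ? "N";
run;

/* contains is case-sensitive; upcase() makes the match case-insensitive */
data want_nocase;
    set have;
    where upcase(state) contains "N";
run;
```

With the sample data above, all state values are already uppercase, so both variants should select the same records as the original where statement.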
Checking if SAS Variable Contains Multiple Strings
Unfortunately, a single contains condition cannot check for more than one string.
In that case, you will have to use the or operator to combine multiple contains conditions.
Let’s say we have the same dataset as above.
If we want to get the records where the state contains “N” or “F”, we can do so with the following where statement.
data want;
set have;
where state contains "N" or state contains "F";
run;
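If the list of strings grows, chaining many contains conditions gets long. One possible alternative, shown here only as a sketch, is a regular expression with the prxmatch function, which can be called inside a where statement. The second step shows the same check in a subsetting if statement with the index function, because contains is only available in where expressions. The data set names want_regex and want_if are illustrative.

```sas
/* Alternative 1: one regular expression that matches "N" or "F" */
data want_regex;
    set have;
    /* prxmatch returns the position of the first match, or 0 if there is none */
    where prxmatch("/N|F/", state) > 0;
run;

/* Alternative 2: subsetting if with index(), since contains is not valid in an if statement */
data want_if;
    set have;
    /* index returns the position of the substring, or 0 if it is not found */
    if index(state, "N") > 0 or index(state, "F") > 0;
run;
```

Both steps are meant to select the same records as the where statement with two contains conditions; for just two strings, the or version above is arguably the most readable.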
Hopefully this article has been useful for learning how to subset a dataset in a SAS data step with the SAS contains operator.