In SAS, the scan() function parses a string and extracts the word at the position we ask for. For example, we can use scan() to find the nth word in any character string.
data data;
string = "This is a string for an example";
x = scan(string,1); /* x = "This" */
run;
We can also use the macro version, %scan, in the SAS Macro Language to parse macro variables and extract words.
%let string = This is a string of words.;
%let first_word = %scan(&string,1);
%put &first_word; /* Log shows This */
Parsing strings is a common task when programming and working with data, and SAS provides a function that makes it easy.
The SAS scan() function splits a string into words on a set of delimiters and returns the word at the position we request. It is similar to indexing into the result of the “split()” function in other languages like Python or JavaScript.
For example, to get the first word of a character string, we pass the string and 1 to scan().
data data;
string = "This is a string for an example";
x = scan(string,1); /* x = "This" */
run;
How to Get the nth or Last Word of a String in SAS
The second argument of scan() selects which word is returned, so getting the nth word in a string takes very little code.
To get the nth word of a string, you just need to pass ‘n’ to the scan() function. For example, if I want the 3rd word, I pass ‘3’ to scan():
data data;
string = "This is a string for an example";
x = scan(string,3); /* x = "a" */
run;
To work from the end of the string, we can pass negative numbers. For example, if we want to get the last word of a string, we can pass “-1” to scan():
data data;
string = "This is a string for an example";
x = scan(string,-1); /* x = "example" */
run;
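When n is larger than the number of words in the string, scan() returns a blank value instead of raising an error, so it can help to check the word count with countw() first. The following is a minimal sketch using the same example string; the dataset name scan_check and the variables n_words, last_word, and missing_word are illustrative names, not part of the original example.

```sas
data scan_check;
    length last_word missing_word $ 20;
    string = "This is a string for an example";
    n_words = countw(string);            /* 7 words with the default delimiters */
    last_word = scan(string, n_words);   /* same word as scan(string, -1) */
    missing_word = scan(string, 10);     /* there is no 10th word, so this is blank */
    put n_words= last_word= missing_word=;
run;
```

Checking n_words before calling scan() makes it possible to tell a string that is too short apart from one that really contains an empty value.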
How to Change the Delimiter with the scan() Function in SAS
When working with data, you will often come across strings delimited in different ways. Sometimes you will want to parse sentences, but other times you may need to parse a comma-delimited variable, or a variable with another delimiter.
To change the delimiter in the scan() function, we just pass an additional parameter to it.
That being said, on ASCII systems the SAS scan() function treats the following characters as delimiters by default:
blank ! $ % & ( ) * + , - . / ; < ^ |
For example, let's say we have a string delimited by commas. To find the fourth word of this string, we pass "4" to the scan() function and the scan() function works as expected.
data data;
string = "This,is,a,string,for,an,example";
x = scan(string,4); /* x = "string" */
run;
Note that if the string contains other default delimiters, scan() splits on those as well, which can give unexpected results.
For example, if we take the same string and replace one of the commas with an exclamation point, we still get the same answer, because the exclamation point is also a default delimiter.
data data;
string = "This,is,a!string,for,an,example";
x = scan(string,4); /* x = "string" */
run;
This may or may not be what you want. To make scan() split only on a particular delimiter, specify that delimiter as the third parameter.
data data;
string = "This,is,a!string,for,an,example";
x = scan(string,4,","); /* x = "for" */
run;
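The same third parameter can be combined with countw() to walk through every item of a delimited string. The following is a minimal sketch, assuming the comma is the only intended delimiter; the dataset name split_words and the variables i and word are illustrative names.

```sas
data split_words;
    length word $ 20;
    string = "This,is,a!string,for,an,example";
    /* Pass the same delimiter to countw() and scan() so the counts agree */
    do i = 1 to countw(string, ",");
        word = scan(string, i, ",");
        output;                          /* write one row per word */
    end;
    keep i word;
run;
```

Because only the comma is treated as a delimiter here, the resulting dataset has six rows, and "a!string" is kept together as a single word.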
Using the SAS scan() Function in the SAS Macro Language
With the SAS Macro Language, we can write dynamic programs that generate and run code based on the values of macro variables.
We can use the Macro Language version of scan(), %scan, to scan a macro variable and find the nth word in it.
%let string = This is a string of words.;
%let first_word = %scan(&string,1);
%put &first_word; /* Log shows This */
I find the SAS macro %scan function to be most useful in loops. For example, if I want to loop over a list, I'll scan the list and then do things depending on where I am in the list.
%let string = This is a string of words.;
%macro scan_example;
%do i = 1 %to %sysfunc(countw(&string));
%let current_word = %scan(&string,&i);
%if &current_word = string %then %do;
/* Do stuff here */
%end;
%end;
%mend;
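A macro only runs when it is called, so a complete version of this pattern needs an invocation as well. The following is a minimal sketch that prints each word of a list to the log; the macro name print_words, its parameter list, and the local variable word are hypothetical names chosen for illustration. It passes %str( ) as the delimiter to both countw() and %scan so that only blanks separate words.

```sas
%macro print_words(list);
    %local i word;                       /* keep loop variables local to the macro */
    %do i = 1 %to %sysfunc(countw(&list, %str( )));
        %let word = %scan(&list, &i, %str( ));
        %put Word &i is &word;
    %end;
%mend print_words;

/* Call the macro; each word is written to the log on its own line */
%print_words(This is a string of words.);
```

Since the blank is the only delimiter in this sketch, the last word is written to the log with its trailing period. Dropping the third argument from both calls would fall back to the default delimiters and strip the period.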
Hopefully this article has been beneficial for you to learn how to use the SAS scan() function in your data steps, as well as in the SAS Macro Language.