You can use the SAS prxmatch() function to run a regular expression (regex) search on a character variable in a SAS data step. It returns the position where the pattern is first found.

data data_new;
   string = "This is a string with some text that we will search and replace with a regex expression.";
   pat = '/(?

When working with strings in our datasets, it can be useful to check whether the variables contain certain patterns and substrings.

In SAS, we can use the prxmatch() function to perform regular expression (regex) searches on character variables. The SAS prxmatch() function returns the position at which the pattern first matches in a string. The position is 1-based, and if prxmatch() doesn't find a match, it returns 0.

data data_new;
   string = "This is a string with some text that we will search with a regex expression.";
   pat = '/(?
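
To make the return value concrete, here is a minimal sketch of prxmatch() with two simple literal patterns. The patterns '/text/' and '/banana/' and the variable names pos_found and pos_missing are made up for illustration, and the step uses data _null_ only because it writes to the log rather than creating a dataset.

```sas
data _null_;
   string = "This is a string with some text that we will search with a regex expression.";
   /* "text" first appears at character 28 (positions are 1-based) */
   pos_found = prxmatch('/text/', string);
   /* "banana" does not appear in the string, so prxmatch() returns 0 */
   pos_missing = prxmatch('/banana/', string);
   put pos_found= pos_missing=;
run;

/* Log output: */
/* pos_found=28 pos_missing=0 */
```

Because 0 means "no match", comparing the result with 0 is a simple way to turn the position into a found/not found flag.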

Finding Patterns in Character Variables with prxmatch() in SAS

Regular expression (regex) searches are very powerful for finding substrings and patterns in string variables. With prxmatch() we can find both simple and complex regex patterns in our character variables.

Let's say we have some strings and want to find out which of them start with "S".

We can use the following SAS code to check whether each string starts with the letter "S".

data data_new;
   string1 = "This";
   string2 = "Song";
   pat = '/^S/'; /* pattern for strings that start with S */
   found1 = prxmatch(pat,string1) > 0; /* 1 if the pattern matches, 0 otherwise */
   found2 = prxmatch(pat,string2) > 0;
   put found1=;
   put found2=;
run;

/* Output: */
found1=0
found2=1
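
The same check can be applied to every row of a dataset. The sketch below assumes a hypothetical input dataset named words with a single character variable word. Both names, and the sample values, are invented for this example. It uses a subsetting if statement to keep only the rows where the pattern matches.

```sas
/* Hypothetical input dataset: one word per row */
data words;
   input word $;
   datalines;
This
Song
Sun
apple
;
run;

/* Keep only the rows whose word starts with "S" */
data s_words;
   set words;
   if prxmatch('/^S/', word) > 0;
run;

proc print data=s_words;
run;
```

With this sample data, s_words should contain only "Song" and "Sun". The match is case sensitive, so a value such as "sun" would not be kept. Adding the i modifier, as in '/^S/i', makes the pattern ignore case.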

Hopefully this article has been useful for you to learn how to use the SAS prxmatch() function to perform regex searches on character variables in SAS data steps.

Last Update: February 26, 2024