To remove whitespace or characters from a string in a SAS data step, we can use the SAS Compress function. By default, the SAS compress function removes all spaces from a string.

data k;
    a = 'this is a string with some blanks';
    b = compress(a);
    put b=;
run;

/* Output: */
b=thisisastringwithsomeblanks

The SAS compress function is useful for cleaning up character data, whether that means stripping blanks or removing other unwanted characters from a character variable.

The SAS compress function returns a character string with the specified characters removed from the original string, and it offers many options for customizing the returned string.

The SAS compress function takes up to three arguments: the source string, the characters you want to remove, and any modifiers you’d like to apply. Only the source is required; the other two are optional.
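As a rough sketch of how the three arguments interact, the step below calls compress with one, two, and three arguments. The sample string and the variable names (s, r1, r2, r3) are made up for illustration. Note that once a characters argument is supplied, blanks are no longer removed unless you ask for them.

```sas
data _null_;
    s = 'AB-12 CD-34';
    r1 = compress(s);            /* source only: removes blanks */
    r2 = compress(s, '-');       /* removes hyphens, keeps blanks */
    r3 = compress(s, '-', 'd');  /* removes hyphens and digits */
    put r1=;
    put r2=;
    put r3=;
run;

/* Output: */
/* r1=AB-12CD-34 */
/* r2=AB12 CD34 */
/* r3=AB CD */
```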

Using SAS Compress to Remove All Whitespace from a String in SAS

To remove all blanks from a string in SAS, you can use the SAS compress function without any additional arguments. This default removes only blank spaces; to also remove other whitespace characters such as tabs and line feeds, pass the 's' modifier as the third argument.

You can see how to remove all blanks from a string in the following SAS code:

data k;
    a = 'this is a string with some blanks';
    b = compress(a);
    put b=;
run;

/* Output: */
b=thisisastringwithsomeblanks
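Blanks are not the only kind of whitespace. The sketch below, which assumes an ASCII session where '09'x is the tab character, compares the default call with the 's' modifier, which adds whitespace characters such as tabs and line feeds to the list of characters to remove. The lengths are printed instead of the strings because a tab is hard to see in the log; the variable names are illustrative.

```sas
data _null_;
    a = 'ab' || '09'x || 'cd ef';           /* '09'x is a tab (ASCII assumption) */
    n_default = length(compress(a));        /* tab is kept, blank is removed */
    n_space   = length(compress(a, , 's')); /* tab and blank are both removed */
    put n_default= n_space=;
run;

/* Output: */
/* n_default=7 n_space=6 */
```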

Removing All Numbers from a String in SAS

You can also use the SAS Compress function to remove all numbers from a string in your SAS data steps.

To remove digits from a string with the SAS compress function, pass ‘1234567890’ as the second (“characters”) argument, or pass “d” as the third (“modifier”) argument.

data k;
    a = 'this is a 2string 4with 6some 8numbers';
    b = compress(a,"1234567890");
    put b=;
    c = compress(a, ,'d');
    put c=;
run;

/* Output: */
b=this is a string with some numbers
c=this is a string with some numbers

Removing All Lowercase or Uppercase Letters from a String in SAS

We can also use the SAS Compress function to remove all lowercase or uppercase letters from a string variable in a data step.

To do so, we can use the “l” or “u” modifiers to remove lowercase or uppercase letters from a character variable.

data k;
    a = 'This Is A String With Some Uppercase And Lowercase Words. ';
    b = compress(a, ,"u");
    put b=;
    c = compress(a, ,'l');
    put c=;
run;

/* Output: */
b=his s  tring ith ome ppercase nd owercase ords.
c=T I A S W S U A L W.

Removing Certain Letters from a String in SAS

We can use the SAS compress function to remove certain characters from a string variable in a data step as well.

Let’s say we want to remove all of the a’s and b’s from a string, regardless of case. Because compress is case-sensitive by default, we list both the lowercase and uppercase letters in the second argument.

data k;
    a = 'Alfred and Betty went to the beach to play with a ball. ';
    b = compress(a,"abAB");
    put b=;
run;

/* Output: */
b=lfred nd etty went to the ech to ply with  ll.
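As an alternative to listing both cases, the 'i' modifier tells compress to ignore case when matching the characters argument. A minimal sketch using the same sample string:

```sas
data _null_;
    a = 'Alfred and Betty went to the beach to play with a ball. ';
    b = compress(a, 'ab', 'i');  /* 'i' ignores case, so A and B are removed too */
    put b=;
run;

/* Output: */
/* b=lfred nd etty went to the ech to ply with  ll. */
```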

Keeping Certain Letters instead of Removing with SAS Compress

We can also use the SAS compress function to keep certain characters instead of removing them.

Let’s say we want to keep only the a’s and b’s in the following string. We can pass ‘k’ as the modifier argument so that the listed characters are kept and everything else is removed.

data k;
    a = 'abcde abbaecd deebcabc ebcabcbadebac dbebacbde';
    b = compress(a,"ab",'k');
    put b=;
run;

/* Output: */
b=ababbababbabbababbab
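Modifiers can also be combined. As a sketch, pairing 'k' with the 'd' modifier keeps only the digits, which is handy for pulling the numbers out of a formatted value. The phone number here is a hypothetical sample value.

```sas
data _null_;
    phone = '(555) 123-4567';          /* hypothetical sample value */
    digits = compress(phone, , 'kd');  /* 'd' adds digits to the list, 'k' keeps them */
    put digits=;
run;

/* Output: */
/* digits=5551234567 */
```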

The Difference between compress(), trim(), and strip() in SAS

When working with string variables in SAS, there are a few useful functions for cleaning up whitespace and removing blanks.

Besides the SAS compress() function, two other useful functions are SAS trim() and SAS strip().

The SAS trim() function removes all trailing blanks from a string, and the SAS strip() function removes both leading and trailing blanks. Neither touches blanks inside the string, whereas compress() removes every blank.

You can see how each of these string manipulation functions works in the following SAS code:

data k;
    a = '     abc de fghi jkl         mnop     ';
    trim =  "*" || trim(a) || "*";
    comp =  "*" || compress(a) || "*";
    strip =  "*" || strip(a) || "*";
    put trim=;
    put comp=;
    put strip=;
run;

/* Output: */
trim=*     abc de fghi jkl         mnop*
comp=*abcdefghijklmnop*
strip=*abc de fghi jkl         mnop*

Hopefully this article has helped you understand how to use the SAS compress function in a data step to remove whitespace, blanks, and other characters from a string.

Last Update: March 20, 2024