Using the COMPARE Function in SAS to Compare Strings

The COMPARE function in SAS lets you compare two character values. With optional modifiers, you can ignore case and truncate the longer value to the length of the shorter value before making the comparison.

To demonstrate the COMPARE function, suppose you must verify analysis codes that begin with C450.

One complication is that some of the values may have the C in lowercase.

You need to match codes that begin with C450 and are followed by a period and, optionally, further digits, such as C450.100.

While this is a relatively simple task using typical DATA step programming, you can accomplish the comparison in a single statement using the COMPARE function.

Take a look at the following program:

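data test1;
input code $10.; /* sample values below are illustrative; the original data were not shown */
datalines;
D450.100
c450.100
C450
C550.100
;

data test;
set test1;
if compare(code,'C450','i:') eq 0 then Match = 'Yes';
else Match = 'No';
run;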
  • The first two arguments of the COMPARE function are the two character values you need to compare.
  • The optional third argument lets you specify modifiers.
  • The i modifier ignores case.
  • The colon (:) modifier truncates the longer string to the length of the shorter string before making the comparison.
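To see what each modifier contributes, the following sketch calls COMPARE on literal values with no modifier, with each modifier alone, and with both. The variable names are made up for illustration, and the results are written to the log with a PUT statement.

```sas
data _null_;
   /* No modifiers: case matters and the shorter value is padded with blanks */
   NoModifier = compare('c450.100', 'C450');
   /* i alone: case is ignored, but the values still differ from the 5th character on */
   IgnoreCase = compare('c450.100', 'C450', 'i');
   /* colon alone: the longer value is cut to 4 characters, but c and C still differ */
   Truncate = compare('c450.100', 'C450', ':');
   /* both: case is ignored and the longer value is truncated, so the values match */
   Both = compare('c450.100', 'C450', 'i:');
   put NoModifier= IgnoreCase= Truncate= Both=;
run;
```

Only Both should be 0; the other three calls should return nonzero values.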

COMPARE returns 0 if there is a match (after applying the modifiers) and a nonzero value if the two values differ.

The absolute value of the result is the position of the first character at which the two strings differ. Observe the compare value for observations 1 and 4: the value 1 for observation 1 means that the first character differs, whereas the value for observation 4 shows that the second character differs.

The sign of the value tells you which of the two values comes first in the collating sequence: it is negative if the first argument precedes the second and positive if it follows it.

In practice, you usually only need to know whether the function returns zero or not.
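As a small illustration of how the position and sign of the return value work, here is a sketch using literal strings. The variable names are arbitrary and chosen only for this example.

```sas
data _null_;
   /* Strings differ at position 3; C precedes D, so the result is negative */
   First = compare('ABC', 'ABD');    /* -3 */
   /* Same strings with the arguments swapped: same position, opposite sign */
   Second = compare('ABD', 'ABC');   /* 3 */
   /* Identical strings */
   Same = compare('ABC', 'ABC');     /* 0 */
   put First= Second= Same=;
run;
```

Swapping the arguments changes only the sign, which is why a simple test against zero is enough when all you need is a match or no match.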

Be cautious whenever you use the colon modifier. When SAS computes the shorter string length, it includes trailing blanks.

Here is an example:

data test2;
String1 = 'ABC   '; /* ABC followed by three trailing blanks */
String2 = 'ABCXYZ';
Compare1 = compare(String1,String2,':');
Compare2 = compare(trim(String1),String2,':');
run;
  • String1 is ABC followed by three trailing blanks. When you use the colon modifier to compare this value to String2, SAS sees the length of both strings as equal to 6, so nothing is truncated.
  • Using the TRIM function to remove the trailing blanks before comparing is a good practice.
  • For the value of Compare2, SAS truncates String2 to a length of 3 (the length of String1 after you strip off the trailing blanks) before making the comparison, so the result is 0.

If you are curious about why the value of Compare1 is -4: the two strings differ in the fourth character, where String1 has a blank and String2 has an X. Because a blank comes before an X in the collating sequence, the value is negative.
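The same pitfall can appear in the C450 scenario when the prefix is held in a variable instead of a literal. The sketch below assumes two hypothetical variables, Prefix and Code, both declared with a length of 10, and compares them with and without TRIM.

```sas
data _null_;
   length Prefix $10 Code $10;   /* hypothetical variables, both padded to 10 characters */
   Prefix = 'C450';
   Code = 'C450.100';
   /* Both values have length 10, so the colon modifier truncates nothing */
   Untrimmed = compare(Code, Prefix, 'i:');
   /* TRIM reduces Prefix to 4 characters, so Code is truncated to C450 */
   Trimmed = compare(Code, trim(Prefix), 'i:');
   put Untrimmed= Trimmed=;
run;
```

Untrimmed should be nonzero because the period in Code is compared with a blank in Prefix, while Trimmed should be 0.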


Subhro Kar is an Analyst with over five years of experience. As a programmer specializing in SAS (Statistical Analysis System), Subhro also offers tutorials and guides on how to approach the coding language. His website, 9to5sas, offers students and new programmers useful easy-to-grasp resources to help them understand the fundamentals of SAS. Through this website, he shares his passion for programming while giving back to up-and-coming programmers in the field. Subhro’s mission is to offer quality tips, tricks, and lessons that give SAS beginners the skills they need to succeed.