Using RETAIN in SAS to remember values

RETAIN in SAS is used to “remember” values from previous observations. Variables that do not come from SAS data sets are, by default, reset to missing at the start of each iteration of the DATA step.

A RETAIN statement tells SAS not to reset the listed variables to missing at the start of each iteration of the DATA step.

Basic Usage of Retain in SAS

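data Example; /* sample values are illustrative */
 input profit;
 datalines;
10
20
30
;

data example1; /* without RETAIN: cum_sum is reset to missing each iteration */
 set example;
 cum_sum=cum_sum + profit;
run;

data example1; /* with RETAIN: cum_sum carries over between iterations */
 set example;
 retain cum_sum 0;
 cum_sum=cum_sum + profit;
run;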
[Figure: Without RETAIN statement]
[Figure: With RETAIN statement]

Without the RETAIN statement, SAS resets cum_sum to missing at the start of each iteration, so a running total never accumulates.

By adding the RETAIN statement, the value of cum_sum is carried over to the next iteration.

Points to Remember

  • If you do not specify any variable names, then SAS retains the values of all of the variables created in an INPUT or assignment statement.
  • SAS sets the initial value of a variable to be retained to missing if you don’t specify an initial value.
  • It is also important to understand what RETAIN does and what it does not do.
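
The initial-value rule in the list above can be illustrated with a minimal sketch. The data set name retain_init, the variables with_init and no_init, and the sample profit values are all assumptions made for this illustration.

```sas
/* Illustrative sketch: all names and values here are made up */
data retain_init;
 input profit;
 /* with_init starts at 0; no_init has no initial value, so it starts as missing */
 retain with_init 0 no_init;
 with_init=with_init + profit; /* accumulates across observations */
 no_init=no_init + profit;     /* missing + number is missing, so it never accumulates */
 datalines;
10
20
30
;

proc print data=retain_init;
run;
```

With these values, with_init should hold a running total while no_init stays missing on every row, which is why an initial value (or an explicit assignment before the first addition) matters when a retained variable is used with the + operator.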

The following items do not need to be listed in a RETAIN statement because their values are implicitly retained in a DATA step.

  • Variables that are read with a SET, MERGE or UPDATE statement.
  • Variables whose value is assigned in a SUM statement
  • Variables that are created by the IN = option
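
As a minimal sketch of the sum statement case from the list above, the step below builds a running total without any RETAIN statement. The data set names profits and running_total, the variable total, and the sample values are assumptions made for illustration.

```sas
/* Illustrative data; the profit values are made up for this sketch */
data profits;
 input profit;
 datalines;
10
20
30
;

data running_total;
 set profits;
 /* Sum statement: total starts at 0 and is retained implicitly, */
 /* so no RETAIN statement is needed                             */
 total + profit;
run;

proc print data=running_total;
run;
```

The sum statement also skips missing values of profit instead of turning the total into missing, which is one reason it is often preferred over an explicit RETAIN plus assignment for simple accumulators.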

The RETAIN statement is not an executable statement; therefore, it can appear anywhere in the DATA step.

Retain in SAS with BY Groups

Suppose that, for each age, you want to check whether any BMI value recorded for that age is less than 18.5.

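proc sort data=sashelp.bmimen out=bmi;
 by age;
run;

data underweight;
 length underweight $3;
 set bmi;
 by age;
 retain underweight;

 /* Reset the flag at the start of each age */
 if first.age then
  underweight='No';

 /* Remember that at least one BMI for this age was below 18.5 */
 if bmi lt 18.5 then
  underweight='Yes';

 /* Write one observation per age */
 if last.age then
  output;
run;

proc print data=underweight(firstobs=215 obs=225);
run;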

This program uses the retained variable Underweight to “remember” whether any observation for the current age had a BMI of less than 18.5.

As each new age is processed, Underweight is set to No.

Then, if any BMI is less than 18.5, Underweight is set to Yes. Because this value is retained, it remains equal to Yes even if the BMI is greater than 18.5 on all subsequent observations for that age.

When SAS reaches the last observation for an age, an observation is written to the output data set.


Calculating Cumulative sum within BY Groups

The FIRST. variable and the RETAIN statement can be used together to calculate a cumulative sum within each BY group.

For example, suppose you would like to determine the cumulative sales within each month. The last observation within the BY group (month) contains the total sales for that month, and the calculation resets at the start of the next month.

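data example3;
 /* Generate 20 random month/sales pairs */
 do i=1 to 20;
  month=rand('integer', 1, 12);
  sales=rand('integer', 1, 100);
  output;
 end;
 drop i;
run;

proc sort data=example3;
 by month;
run;

data cum_sum;
 set example3;
 by month;
 retain cum_sum;

 /* Restart the total at the first observation of each month */
 if first.month then
  cum_sum=sales;
 else
  cum_sum=cum_sum + sales;
run;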

The RETAIN statement tells SAS to keep the value of cum_sum from one observation to the next within the BY group, while FIRST.month restarts the total at the beginning of each new month.

Generate Serial Number by Group

Another use of FIRST. and LAST. variables with the RETAIN statement is to generate sequential numbers within each BY group.

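data count;
 set example3;
 by month;
 retain count;

 /* Start at 1 for each new month, then add 1 per observation */
 if first.month then
  count=1;
 else
  count=count + 1;
run;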

In this case, we create a new variable, COUNT, and tell SAS to retain it. At the start of each BY group, FIRST.MONTH is true, so COUNT is initialized to 1. For every remaining observation in the group, SAS adds 1 to the retained value of COUNT.
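
The same numbering can also be sketched with a sum statement, which retains its variable implicitly. This is an alternative to the RETAIN approach described above, not a replacement for it; the output data set name count_alt is hypothetical, and the step assumes example3 already exists and is sorted by month.

```sas
/* Assumes example3 exists and is sorted by month; count_alt is a hypothetical name */
data count_alt;
 set example3;
 by month;

 /* Reset the counter at the start of each BY group */
 if first.month then count=0;

 /* Sum statement: count is retained implicitly and incremented by 1 */
 count + 1;
run;
```

Both versions number the observations 1, 2, 3, and so on within each month; the sum statement form simply avoids the explicit RETAIN and the ELSE branch.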
