Essential Guide to Using Arrays in SAS

An array in SAS is a temporary grouping of SAS variables, arranged in a particular order and identified by an array name.

Arrays exist only for the duration of the current DATA step and are referenced by the array name and a subscript.

One of the main reasons for using arrays in SAS is to reduce the number of statements required for processing variables.

Syntax

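ARRAY array-name{dimension} <$> <length> <elements> <(initial-values)>;
  • array-name specifies the name of the array.
  • dimension describes the number and arrangement of array elements.
  • elements lists the numeric or character variables to include in the array.
  • $, length, and initial-values are optional and are covered later in this guide.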

Array Dimensions

In the dimension, you need to specify the number and arrangement of elements included in the array. There are several ways to specify the dimension:

You can specify the number of array elements in a one-dimensional array. The array elements are variables you want to reference and process elsewhere in the DATA step.

array sales{4} qtr1 qtr2 qtr3 qtr4;

Instead of specifying the number of array elements in the array dimension, you can specify a range of values for the dimension while defining the array. For example, you can define the range for array Sales as follows:

array sales{96:99} totals96 totals97 totals98 totals99;
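As a minimal sketch of how a ranged dimension behaves, the step below loops from the lower bound to the upper bound instead of from 1. The data sets `yearly` and `adjusted`, the sample values, and the 10% adjustment are assumptions made up for illustration; LBOUND and HBOUND return the bounds of the dimension.

```sas
/* Hypothetical input: one row holding four yearly totals */
data yearly;
 input totals96 totals97 totals98 totals99;
 datalines;
100 200 300 400
;
run;

data adjusted;
 set yearly;
 array sales{96:99} totals96 totals97 totals98 totals99;
 /* Subscripts run from 96 to 99, not from 1 to 4 */
 do yr = lbound(sales) to hbound(sales);
  sales{yr} = sales{yr} * 1.10; /* illustrative 10% adjustment */
 end;
 drop yr;
run;
```

Here sales{96} refers to totals96 and sales{99} refers to totals99, which can make the code easier to read when the subscript has a natural meaning such as a year.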

An asterisk (*) can also be used to specify the dimension of an array. In this way, SAS determines the dimension of the array by counting the number of elements.

array sales{*} qtr1 qtr2 qtr3 qtr4;

Specifying Array Elements

When specifying the elements of an array, list each variable name you want to include in the array. Separate the elements with a space.

array sales{4} qtr1 qtr2 qtr3 qtr4;

As in the example below, array elements can also be specified as a variable list. Variables qtr1 through qtr4 are grouped into a one-dimensional array.

array sales{4} qtr1-qtr4;


Referencing Elements of an Array

A subscript is assigned to each array element when you define an array in a DATA step.

To reference an array element in the DATA step, specify the array name followed by a subscript enclosed in braces, brackets, or parentheses.

array-name{subscript}

The subscript can be an integer, a numeric variable, or a SAS expression. Its value must fall within the lower and upper bounds of the array's dimension.
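The three kinds of subscript can be sketched as follows. The variable names `n`, `first`, `second`, and `third`, and the initial values, are hypothetical and chosen only for this illustration.

```sas
data _null_;
 /* Creates qtr1-qtr4 with illustrative initial values */
 array sales{4} qtr1-qtr4 (10 20 30 40);
 n = 2;
 first  = sales{1};     /* integer subscript: qtr1 */
 second = sales{n};     /* variable subscript: qtr2 */
 third  = sales{n + 1}; /* expression subscript: qtr3 */
 put first= second= third=;
run;
```

This writes first=10 second=20 third=30 to the log. A subscript outside 1 to 4 would stop the step with an "array subscript out of range" error.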

The DIM function

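The DIM function returns the number of elements in an array. For a multidimensional array, DIM1, DIM2, and so on return the number of elements in the corresponding dimension.

DIM(array-name)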
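A short sketch of DIM in use, assuming two throwaway arrays defined only for this example. It also shows that for a ranged dimension DIM is the element count, not the upper bound.

```sas
data _null_;
 array sales{*} qtr1-qtr4;                 /* SAS counts four elements */
 array yr_sales{96:99} totals96-totals99;  /* ranged dimension */
 n_sales = dim(sales);
 n_years = dim(yr_sales);
 upper   = hbound(yr_sales);
 put n_sales= n_years= upper=;
run;
```

The log shows n_sales=4 n_years=4 upper=99. Looping with `do i = 1 to dim(array-name)` is therefore only safe when the lower bound is 1.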

Using arrays

Suppose each quarterly value in mylib.emp must be increased by 25%. Without arrays, this takes one assignment statement per variable:

data emp1;
set mylib.emp;
qtr1=qtr1*1.25;
qtr2=qtr2*1.25;
qtr3=qtr3*1.25;
qtr4=qtr4*1.25;
run;

Using an array, a single assignment statement inside a DO loop handles all four variables:

data emp1;
set mylib.emp;
array incr{4} qtr1-qtr4;
do i =1 to dim(incr);
incr{i} = incr{i}*1.25;
end;
drop i;
run;


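In the above example, the incr array can also be defined in any of the following ways. Braces, brackets, and parentheses are interchangeable around the dimension.

array incr[4] qtr1 qtr2 qtr3 qtr4;

array incr{*} qtr1 qtr2 qtr3 qtr4;

array incr(4) qtr1-qtr4;

array incr{*} _NUMERIC_;

The last form is equivalent only when qtr1 through qtr4 are the only numeric variables in the data set, because _NUMERIC_ includes every numeric variable defined at that point in the DATA step.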

Creating new variables with the ARRAY statement

If you do not specify the elements of the array, SAS automatically creates new variables.

The names of the new variables are formed by concatenating the array name with the numbers 1, 2, 3, and so on. For example, ARRAY pct{4}; creates pct1 through pct4.

To create an array of character variables, add a $ symbol after the dimension.

E.g. : ARRAY Months{5} $;

The default length is 8, but you can specify your preferred length.

E.g.: ARRAY Months{5} $ 20;

You can assign initial values to the elements of an array when you define it by placing the initial values after the elements, enclosed in parentheses and separated by blanks (enclose character values in quotation marks).

E.g.:

ARRAY Months{3} m1 m2 m3 (1 2 3);
ARRAY Months{3} $ m1 m2 m3 ('Jan' 'Feb' 'Mar');
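The naming rule and initial values can be combined, as in this small sketch. The array `Days` and its values are hypothetical additions for illustration; no elements are listed, so SAS names the variables itself.

```sas
data _null_;
 /* No elements listed: SAS creates character variables Months1-Months3 */
 array Months{3} $ 3 ('Jan' 'Feb' 'Mar');
 /* No elements listed: SAS creates numeric variables Days1-Days3 */
 array Days{3} (31 28 31);
 put Months1= Months2= Months3=;
 put Days1= Days2= Days3=;
run;
```

The log shows Months1=Jan Months2=Feb Months3=Mar followed by Days1=31 Days2=28 Days3=31, confirming the generated variable names.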

Creating variables with arrays

Continuing the employee data set example above, suppose you need to calculate each quarter's contribution as a share of each employee's annual total. You can use arrays as below.

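data emp1;
 set mylib.emp;
 array incr{4} qtr1-qtr4;
 array pct{4};   /* creates pct1-pct4 */

 /* Apply the 25% increase to every quarter first */
 do i=1 to dim(incr);
  incr{i}=incr{i}*1.25;
 end;

 /* Compute the annual total once, after all quarters are updated */
 Total=sum(of incr{*});

 /* Each quarter's share of the annual total */
 do i=1 to dim(incr);
  pct{i}=incr{i}/Total;
 end;
 drop i;
run;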


Here is another example. The goal is a new data set, called WeatherChange, that contains the variables of the Weather data set plus two more variables (Change1, Change2) that hold the month-to-month temperature differences for each city.

data weatherChange;
 set mylib.weather;
 array month{3} January February March;
 array change(2);

 do i=1 to dim(change);
  change(i)=month(i+1)-month(i);
 end;
 drop i;
run;


Temporary arrays

A temporary array only exists for the duration of the data step where it is defined. A temporary array is useful for storing constant values used in calculations.

No corresponding variables exist to identify the array elements in a temporary array.

The elements are defined with the keyword _TEMPORARY_. When _TEMPORARY_ is used, SAS creates a list of temporary data elements instead of variables.

The values do not appear in the output data set, but they are retained across DATA step iterations.

Temporary data elements can only be referenced through the array name and a subscript since they do not have names.

Explicit array bounds must be specified for temporary arrays, and the asterisk subscript cannot be used when defining a temporary array.

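data mon;
 set sashelp.holiday(where=(month ne 0 and month le 6));
 /* Lookup table of month abbreviations; not written to the output data set */
 array rate{6} $ _temporary_ ('Jan' 'Feb' 'Mar' 'Apr' 'May' 'Jun');
 /* Use the MONTH value directly as the subscript; no loop is needed */
 mon = rate{month};
run;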
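Temporary arrays are just as handy for numeric constants. The sketch below compares quarterly values against fixed targets; the data sets `quarters` and `vs_target`, the sample values, and the target figures are all assumptions invented for this example.

```sas
/* Hypothetical input: one row of quarterly values */
data quarters;
 input qtr1-qtr4;
 datalines;
90 120 80 150
;
run;

data vs_target;
 set quarters;
 array sales{4} qtr1-qtr4;
 /* Assumed targets, held only for the duration of the step */
 array target{4} _temporary_ (100 100 110 120);
 array diff{4};   /* creates diff1-diff4 */
 do i = 1 to dim(sales);
  diff{i} = sales{i} - target{i};
 end;
 drop i;
run;
```

The output data set contains qtr1-qtr4 and diff1-diff4 but no target variables, since the temporary elements are never written out.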

Multi-Dimensional arrays

Multidimensional arrays are useful when you group your data into a tabular arrangement with rows and columns.

You can think of the first dimension as rows and the second as columns. For more dimensions, add another comma and specify the size of that dimension.

An additional dimension goes in front of the row subscript. When you process the array with nested DO loops, the first dimension (rows) is the outermost loop and the columns are the inner loop. SAS fills the array row by row: every column of the first row is assigned before the second row begins.

do row = 1 to 2;
 /* code to process each row */
 do column = 1 to 5;
  /* code to process each element */
 end;
end;
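The fill order can be checked with a small sketch. The array `grid` and its values are hypothetical; the step simply prints each element with its row and column subscripts.

```sas
data _null_;
 /* Six initial values assigned to a 2-by-3 array */
 array grid{2,3} _temporary_ (1 2 3 4 5 6);
 do i = 1 to dim1(grid);      /* rows: outer loop */
  do j = 1 to dim2(grid);     /* columns: inner loop */
   value = grid{i,j};
   put i= j= value=;
  end;
 end;
run;
```

The log lists values 1, 2, 3 for row 1 and 4, 5, 6 for row 2, which shows that the rightmost subscript varies fastest as the array is filled.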

The example data set below records temperatures for two cities at five different times. The objective is to round all the temperature values using a multidimensional array.

data temps;
set mylib.temp_records;
   array temprg{2,5} c1t1-c1t5 c2t1-c2t5;
   do i=1 to dim1(temprg);
     do j=1 to dim2(temprg);
       temprg{i,j}=round(temprg{i,j});
     end;
   end;
drop i j;
run;


You can also read the article on Advance Array Processing Techniques, which has examples of some useful applications of arrays.

Read: Changing the Case of All Character Variables in a Data Set (Using Arrays)


Subhro Kar is an Analyst with over five years of experience. As a programmer specializing in SAS (Statistical Analysis System), Subhro also offers tutorials and guides on how to approach the coding language. His website, 9to5sas, offers students and new programmers useful easy-to-grasp resources to help them understand the fundamentals of SAS. Through this website, he shares his passion for programming while giving back to up-and-coming programmers in the field. Subhro’s mission is to offer quality tips, tricks, and lessons that give SAS beginners the skills they need to succeed.