Essential Guide to Using Arrays in SAS

An array in SAS is a temporary grouping of SAS variables, arranged in a particular order and identified by an array name.

Arrays exist only for the duration of the current DATA step and are referenced by the array name and a subscript.

One of the main reasons for using arrays in SAS is to reduce the number of statements that are required for processing variables.


ARRAY array-name{dimension} <$> <length> <elements> <(initial-values)>;

  • array-name specifies the name of the array.
  • dimension is the number and arrangement of array elements.
  • elements are the numeric or character variables to include in the array.

Items in angle brackets are optional.

Array Dimensions

In the dimension, you need to specify the number and arrangement of elements to be included in the array. There are several ways to specify the dimension:

In a one-dimensional array, you can simply specify the number of array elements. The array elements are the variables that you want to reference and process elsewhere in the DATA step.

array sales{4} qtr1 qtr2 qtr3 qtr4;

Instead of specifying the number of array elements in the array dimension, you can specify a range of values for the dimension while defining the array. For example, you can define the range for array Sales as follows:

array sales{96:99} totals96 totals97 totals98 totals99;
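
As a minimal sketch of how a bounded dimension is used, the step below loops over the range 96 to 99 so that the subscript matches the year. The totals are made-up sample values, and the variables yr and total are introduced only for this illustration.

```sas
/* Minimal sketch; the totals are made-up sample values. */
data _null_;
  totals96 = 100; totals97 = 110; totals98 = 125; totals99 = 140;
  array sales{96:99} totals96 totals97 totals98 totals99;

  /* The subscript is the two-digit year, not a position from 1 to 4 */
  do yr = 96 to 99;
    total = sales{yr};
    put yr= total=;
  end;
run;
```

With this definition, sales{96} refers to totals96 and sales{99} refers to totals99; a subscript such as sales{1} would be out of range.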

An asterisk (*) can also be used to specify the dimension of an array. In this case, SAS determines the dimension of the array by counting the number of elements.

array sales{*} qtr1 qtr2 qtr3 qtr4;

Specifying array Elements

When specifying the elements of an array, list each variable name that you want to include in the array, separating the names with a space.

array sales{4} qtr1 qtr2 qtr3 qtr4;

Array elements can also be specified as a variable list. In the example below, variables qtr1 through qtr4 are grouped into a one-dimensional array.

array sales{4} qtr1-qtr4;

Referencing Elements of an Array

A subscript is assigned to each array element when you define an array in a DATA step.

To reference an array element in the DATA step, specify the name of the array followed by a subscript value enclosed in braces, brackets, or parentheses.

array-name {subscript}

The subscript can be an integer, a variable, or a SAS expression. Its value must fall within the lower and upper bounds of the array's dimension.
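
The three kinds of subscript can be sketched as follows. This is an illustration only: the quarter values and the variables first, second, third, and i are assumptions made for the example.

```sas
/* Minimal sketch; the quarter values are made-up sample data. */
data _null_;
  qtr1 = 10; qtr2 = 20; qtr3 = 30; qtr4 = 40;
  array sales{4} qtr1-qtr4;

  first  = sales{1};      /* integer subscript */
  i = 2;
  second = sales{i};      /* variable subscript */
  third  = sales{i + 1};  /* expression subscript */

  put first= second= third=;  /* writes first=10 second=20 third=30 to the log */
run;
```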

The DIM function

The DIM function returns the number of elements in the array.

DIM(array-name)
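
A small sketch of what DIM returns when the array is defined with a range of bounds. The related LBOUND and HBOUND functions are shown for contrast; the variable names n, lo, and hi are chosen only for this example.

```sas
/* Minimal sketch: DIM counts elements, LBOUND/HBOUND return the bounds. */
data _null_;
  array sales{96:99} totals96 totals97 totals98 totals99;

  n  = dim(sales);     /* number of elements */
  lo = lbound(sales);  /* lower bound of the dimension */
  hi = hbound(sales);  /* upper bound of the dimension */

  put n= lo= hi=;      /* writes n=4 lo=96 hi=99 to the log */
run;
```

Because DIM returns a count rather than a bound, a loop written as do i = 1 to dim(sales) only suits arrays whose lower bound is 1; for other ranges, looping from lbound to hbound is the safer pattern.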

Using the arrays

Using multiple assignment statements

data emp1;
set mylib.emp;
qtr1 = qtr1*1.25; qtr2 = qtr2*1.25; qtr3 = qtr3*1.25; qtr4 = qtr4*1.25;
run;

Using an array

data emp1;
set mylib.emp;
array incr{4} qtr1-qtr4;
do i = 1 to dim(incr);
incr{i} = incr{i}*1.25;
end;
drop i;
run;

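In the above example, the incr array can also be defined in any of the following ways:

  • array incr[4] qtr1 qtr2 qtr3 qtr4;
  • array incr{*} qtr1 qtr2 qtr3 qtr4;
  • array incr(4) qtr1-qtr4;
  • array incr(4) _NUMERIC_; (valid only if qtr1 through qtr4 are the only numeric variables defined at that point in the DATA step)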

Creating new variables with the ARRAY statement

If you do not specify the elements of the array, SAS automatically creates new variables.

The names of the new variables are obtained by concatenation of the name of the array and numbers 1,2,3…

If you want to create an array of character variables, add a $ symbol after the dimension.

E.g. : ARRAY Months{5} $;

The default length is 8, but you can specify the length you prefer.

E.g.: ARRAY Months{5} $ 20;
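
To illustrate the naming rule, here is a minimal sketch in which neither ARRAY statement lists any elements. The array names pct and months and the assigned values are chosen only for the example.

```sas
/* Minimal sketch: ARRAY statements without elements create new variables. */
data _null_;
  array pct{3};          /* creates numeric variables pct1, pct2, pct3 */
  array months{3} $ 20;  /* creates character variables months1-months3, length 20 */

  pct{1} = 0.5;
  months{2} = 'February';

  /* The created variables can also be referenced by their own names */
  put pct1= months2=;    /* writes pct1=0.5 months2=February to the log */
run;
```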

You can assign initial values to the elements of an array when you define it by placing the values after the elements, enclosed in parentheses and separated by blanks (character values must be enclosed in quotation marks).

ARRAY Months{3} m1 m2 m3 (1 2 3);
ARRAY Months{3} $ m1 m2 m3 ('Jan' 'Feb' 'Mar');
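
As a sketch of how initial values behave, the step below defines one numeric and one character array with initial values and prints each pair. The array names num and mname and their element names are assumptions made for this example, chosen so that the two arrays can coexist in one step.

```sas
/* Minimal sketch: initial values for a numeric and a character array. */
data _null_;
  array num{3} n1 n2 n3 (1 2 3);
  array mname{3} $ 3 c1 c2 c3 ('Jan' 'Feb' 'Mar');

  do i = 1 to dim(num);
    number = num{i};
    label = mname{i};
    put number= label=;
  end;
run;
```

Variables that receive initial values in an ARRAY statement are automatically retained, so they keep their values from one iteration of the DATA step to the next unless the step reassigns them.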

Creating variables with arrays

In the Employee dataset example above, if you had to calculate each quarter's percentage contribution to the annual total for each employee, you could use arrays as shown below.

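data emp1;
 set mylib.emp;
 array incr{4} qtr1-qtr4;
 array pct{4};   /* creates pct1-pct4 */

 Total=SUM(OF incr{*});   /* annual total for the employee */
 do i=1 to dim(incr);
  pct{i}=incr{i}/Total*100;   /* quarter's share of the total, in percent */
 end;
 drop i;
run;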

Here is another example, where I want a new dataset called WeatherChange that contains the variables of the dataset Weather plus two more variables (Change1, Change2), which hold the differences in temperature between months for each city.

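data weatherChange;
 set weather;   /* input dataset named in the text; its library is not shown */
 array month{3} January February March;
 array change(2);   /* creates Change1 and Change2 */

 do i=1 to dim(change);
  change{i}=month{i+1}-month{i};   /* assumed: difference between consecutive months */
 end;
 drop i;
run;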

Temporary arrays

A temporary array only exists for the duration of the data step where it is defined. A temporary array is useful for storing constant values, which are used in calculations.

There are no corresponding variables to identify the array elements in a temporary array.

The elements are defined by the keyword _TEMPORARY_. When the keyword _TEMPORARY_ is used, a list of temporary data elements is created in the Program Data Vector.

The values are not written to the output data set but are retained across iterations of the DATA step.

Temporary data elements can only be referenced using the array elements since they do not have names.

Explicit array bounds must be specified for temporary arrays and the asterisk subscript cannot be used when defining a temporary array.

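data mon;
set input_ds(where=(month ne 0 and month le 6));   /* input_ds is a placeholder: the data set name is missing from the original */
array rate {6} $ _temporary_ ('Jan' 'Feb' 'Mar' 'Apr' 'May' 'Jun');
mon = rate{month};   /* look up the month name by month number */
run;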
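
The use of a temporary array for constants in a calculation can be sketched as below. This is an illustration under stated assumptions: the rates, the amounts, and the variable names amt1 through amt3 are invented for the example.

```sas
/* Minimal sketch: a temporary array holding constants for a calculation. */
data _null_;
  array rate{3} _temporary_ (0.05 0.08 0.10);  /* hypothetical rates, no variables created */
  array amt{3} amt1-amt3 (100 200 300);        /* hypothetical amounts */

  do i = 1 to dim(rate);
    amt{i} = amt{i} * (1 + rate{i});  /* apply the matching rate to each amount */
  end;

  put amt1= amt2= amt3=;
run;
```

If this step wrote an output data set, amt1 through amt3 would appear in it, but the rate elements would not, because they have no corresponding variables.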

Multi-Dimensional arrays

Multidimensional arrays are useful when you want to group your data into a tabular arrangement having rows and columns.

You can think of the first dimension as rows and the second as columns. For more dimensions, add another comma and then specify the size of that dimension.

An additional dimension goes in front of the row subscript. The first dimension (rows) corresponds to the outermost of the nested loops, and the columns to the inner loop. SAS fills the array row by row: all columns of the first row are assigned before the second row begins.

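do row = 1 to 2;
 do col = 1 to 5;   /* column count assumed from the 2 x 5 example that follows */
  /* code to process element {row, col} of the array */
 end;
 /* code to process the data for this row */
end;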
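
The fill order can be checked with a small sketch. The array name t, the element names a1-a3 and b1-b3, and the initial values are assumptions made only for this illustration.

```sas
/* Minimal sketch: a 2 x 3 array is filled row by row. */
data _null_;
  array t{2,3} a1-a3 b1-b3 (1 2 3 4 5 6);

  do row = 1 to dim1(t);    /* outer loop over rows */
    do col = 1 to dim2(t);  /* inner loop over columns */
      value = t{row, col};
      put row= col= value=;
    end;
  end;
run;
```

Here the first row holds a1 through a3 and the second row holds b1 through b3, so t{2,1} refers to b1 and has the value 4.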

In the below example dataset, there are temperatures recorded for 2 cities, for 5 different times. The objective is to round off all the temperature values using a multidimensional array.

data temps;
set mylib.temp_records;
   array temprg{2,5} c1t1-c1t5 c2t1-c2t5;
   do i=1 to dim1(temprg);
     do j=1 to dim2(temprg);
       temprg{i,j}=round(temprg{i,j});   /* round to the nearest integer */
     end;
   end;
drop i j;
run;

You can also read the article on Advance Array Processing Techniques, which has examples of some useful applications of arrays.
