How To Specify List Of Variables In SAS?


In SAS, variable lists let you refer to many variables at once in the KEEP and DROP statements, in the ARRAY statement, and, with the OF operator, in functions that otherwise take comma-separated arguments. You can also use variable lists in analytic procedures, for example in the VAR statement of PROC MEANS and the TABLES statement of PROC FREQ.

This article describes five different ways to specify a list of variables in SAS.

1. Using the _NUMERIC_, _CHARACTER_, and _ALL_ Keywords

Many SAS procedures use a VAR statement to specify the variables to be analyzed. The following keywords select variables by type, and they also work in functions (with the OF operator) and in arrays.

The _NUMERIC_ keyword specifies all numeric variables in a dataset.
The _CHARACTER_ keyword specifies all character variables in a dataset.
The _ALL_ keyword specifies all the variables in the dataset.

proc means data=sashelp.class nolabels;
    var _NUMERIC_;
run;
proc freq data=sashelp.class;
    tables _CHARACTER_;
run;
proc print data=sashelp.class;
    var _CHARACTER_;
run;
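
The same keywords also work in the KEEP= and DROP= data set options. The following is a minimal sketch; the output data set name class_numeric is a made-up name used only for illustration.

```sas
/* Keep only the numeric variables of SASHELP.CLASS.          */
/* class_numeric is a hypothetical name for the output table. */
data class_numeric;
    set sashelp.class(keep=_NUMERIC_);
run;

/* List the variables that were kept */
proc contents data=class_numeric;
run;
```

Because the list is resolved when the step is compiled, it picks up whatever numeric variables the input data set has at that time, so the code does not need to change if columns are added later.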

2. Select variables with Numeric Sequence

In a numbered range list or a numeric sequence, you can refer to variables that were created in any order, provided that their names have the same prefix.

A single dash (-) is used to specify consecutively numbered variables. For example, var1-var5 is the same as specifying var1, var2, var3, var4, and var5.

For example, suppose you have a dataset with variable names such as var1, var2, var3, and so on.

You want to include variables from var2 to var4.

data ex1;
    set inp1(keep=var2-var4);
run;
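
To see that a numbered range list selects by name rather than by position, here is a small sketch. The data sets demo_vars and demo_subset are hypothetical and exist only for this illustration.

```sas
/* Hypothetical data set: variables deliberately created out of numeric order */
data demo_vars;
    var3 = 3; var1 = 1; var5 = 5; var2 = 2; var4 = 4;
run;

/* The numbered range list selects var2, var3, and var4 by name */
data demo_subset;
    set demo_vars(keep=var2-var4);
run;

/* Check which variables ended up in the subset */
proc contents data=demo_subset varnum;
run;
```

The PROC CONTENTS listing should show only the three variables in the range, even though they were not created next to each other.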

3. Select contiguous Variables

A name range list uses a double dash (--) to select contiguous variables: every variable positioned between the two named variables, based on the order in which the variables appear in the file, regardless of the names of the variables.

Name range lists depend on the order of variable definition:

X--A refers to all variables between X and A inclusive, according to the order of the variables in the PDV (program data vector).

X-numeric-A refers to all numeric variables between X and A. X and A themselves are included only if they are numeric variables.

X-character-A refers to all character variables between X and A. X and A themselves are included only if they are character variables.

The following SAS program prints all variables in the SASHELP.HEART dataset, beginning with DeathCause and ending with chol_status, regardless of the name or type of each variable.

proc print data= sashelp.heart(obs=10);
var DeathCause--chol_status;
run;

In the following example, the name range list in the VAR statement selects all numeric variables between DeathCause and chol_status. DeathCause and chol_status themselves are not included because both are character variables.

proc print data= sashelp.heart(obs=10);
var DeathCause-NUMERIC-chol_status;
run;

In the following example, the name range list in the VAR statement selects all character variables between and including DeathCause and chol_status.

proc print data= sashelp.heart(obs=10);
var DeathCause-CHARACTER-chol_status;
run;
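
Because name range lists depend on variable position, it helps to check the order of the variables before writing one. A minimal sketch using the VARNUM option of PROC CONTENTS:

```sas
/* List the variables of SASHELP.HEART in their stored order, */
/* which is the order a double-dash name range list follows.  */
proc contents data=sashelp.heart varnum;
run;
```

The listing shows each variable's position and type, so you can see which variables fall between DeathCause and chol_status.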

4. Select variables with the same prefix with the Colon Operator

The colon operator (:) has many uses in SAS. One of its uses is to specify a list of variables that begin with a common prefix. Following a variable name prefix, a colon selects any variable whose name starts with that prefix.

The : (colon), also known as wildcard, can work with the DROP, KEEP, FORMAT, and LENGTH statements.
Below are some of the examples where you can use the : (colon) Operator.

  1. Data Name Wild Card: A colon following a data set name prefix selects all the data sets starting with that prefix. This feature can be used in SET and MERGE statements or to delete data sets beginning with that prefix. For example, the colon operator can combine several data sets into a single data set.
data ClassDataset;
set sashelp.cl:;
run;
NOTE: There were 19 observations read from the data set SASHELP.CLASS. 
NOTE: There were 19 observations read from the data set SASHELP.CLASSFIT.
NOTE: There were 486 observations read from the data set SASHELP.CLNMSG.
NOTE: The data set WORK.CLASSDATASET has 524 observations and 16 variables.

This DATA step reads all the data sets in the SASHELP library that begin with cl.  It is important to note that you cannot use DROP and KEEP statements to select variables while using a colon. Also, IN statement will not be possible while combining data sets with the colon.

Another use of the colon operator is to delete multiple data sets with PROC DATASETS.

proc datasets lib=work;
    delete temp_:;
run;
quit;
  2. Variable Name Wild Card: Like the data name wild card, a colon following a variable name prefix selects all the variables starting with that prefix. The advantage of this feature is that you can apply it to statements, procedures, and functions.

You can also see our guide on using Colon Operator for Comparing In SAS in the article – Comparison Operator In SAS – The =: Operator.

The below examples show various applications of the colon as variable name wild card.

The KEEP statement below keeps date and every variable whose name begins with con or ee. You can similarly apply the colon to the commonly used LENGTH, DROP, FORMAT, and INFORMAT statements.

data keep_vars;
    set sashelp.citimon(obs=5);
    keep date con: ee:;
run;
  3. Application of the Colon as a Variable Name Wild Card in Functions and Procedures: You can apply the variable name wild card to functions (e.g. SUM, MEAN, MAX, MIN) and procedures (e.g. PRINT, UNIVARIATE, FREQ) in the same way, as shown below.
data ex;
    set sashelp.citimon(obs=5);
    keep ee: sum_ee;
    sum_ee=sum(of ee:);
run;
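data ex;
    set sashelp.citimon;
    /* explicit array of every variable whose name starts with ee */
    array ee_vars {*} ee:;
    do i = 1 to dim(ee_vars);
        ee_vars[i] = ee_vars[i] - 1;
    end;
    drop i;
run;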

The above code creates an array of all variables beginning with ee and subtracts one from each of their values. This can be very helpful when you must process an unknown number of variables.

5. OF Operator

The OF operator enables you to specify SAS variable lists or SAS arrays as arguments to functions.

  • Name range lists – Function(of x-character-a)
  • Name prefix lists – Function(of x:)
  • Numbered range lists – Function(of x1-xn)
  • Arrays – Function(of array-name(*))
  • Special SAS name lists – Function(of _numeric_)

In the following example, arguments are passed in as numbered range lists, both with and without using the OF operator.

data ex;
    x1=30;
    x2=20;
    x3=10;
    T=sum(x1-x3);
    T2=sum(OF x1-x3);
run;
  1. The first SUM function returns T=20 (30-10), because without OF the dash is treated as subtraction of x3 from x1.
  2. The second SUM function returns T2=60. (30+20+10)
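
The OF operator works the same way with other functions that take a list of arguments. As a small sketch using the same values as above (the data set name ex_of and the variables hi, avg, and cnt are made up for this illustration):

```sas
data ex_of;
    x1=30;
    x2=20;
    x3=10;
    hi  = max(of x1-x3);   /* largest of the three values: 30     */
    avg = mean(of x1-x3);  /* (30+20+10)/3 = 20                   */
    cnt = n(of x1-x3);     /* number of nonmissing values: 3      */
    put hi= avg= cnt=;     /* write the results to the SAS log    */
run;
```

Leaving out OF in any of these calls would again turn x1-x3 into a subtraction and pass a single value to the function.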

Arrays and the OF operator

The OF operator can also pass all the elements of an array to a function or a CALL routine.

For example, the following program creates a numerical array named num and a character array named char. Then, the program finds the average marks of students in each row and puts that value into the variable named avg_marks.

The program also creates a variable named FullName that contains the concatenation of the character values for each row, in this case, fname and lname.

data inp1;
    input fname $ lname $ marks1 marks2 marks3 marks4 marks5;
    datalines;
Warren Ross  11 17 34 20 5
Jane Parr 10 15 23 19 10
Lily Rutherford 18 12 16 19 11
;
run;
data Arrays;
set inp1;
array num {*} _NUMERIC_;
array Char {*} _CHARACTER_;
avg_marks=mean(of num[*]);
length FullName $30;
call catx(' ', FullName, of char[*]);
run;

Variable List Short-Cuts in PROC SQL

In PROC SQL, you cannot use the dash in the SELECT clause. Because arithmetic expressions are allowed in the SELECT clause, SAS would interpret the dash as a minus sign and subtract the two variables. However, the single dash can be used in a data set option such as KEEP= in the FROM clause.

Unfortunately, there is no shortcut in PROC SQL for such a variable list. Attempts to use the double dash will result in a SAS error. 

In the paper – Mimicking the Data Step Dash and Double Dash in PROC SQL, you can refer to some alternatives to using dash and double dash in Proc SQL.
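
The FROM-clause approach mentioned above can be sketched as follows. The data set marks_demo and its values are hypothetical and serve only to make the example self-contained.

```sas
/* Hypothetical input data with consecutively numbered variables */
data marks_demo;
    input fname $ marks1-marks5;
    datalines;
Warren 11 17 34 20 5
Jane 10 15 23 19 10
;
run;

/* The numbered range list sits inside the KEEP= data set option, */
/* so PROC SQL never sees the dash as a minus sign.               */
proc sql;
    select *
    from marks_demo(keep=fname marks1-marks3);
quit;
```

Here SELECT * returns only the variables that KEEP= lets through, which gives the effect of a variable list without writing the dash in the SELECT clause itself.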

Wrapping it all up

In SAS, you can use dashes to refer to a list of variables without having to type the name of each variable. With dashes and double-dashes, SAS provides flexibility in what variables will be specified in the list. This shorthand method can be used within procedures, keep and drop statements, and arrays.

The general rules for the dash, double dash, and colon are:

Use a single dash to specify consecutively numbered variables.
Use a double dash to specify contiguous variables based on their order in the dataset.
Use the : (colon) operator to specify variables with a common prefix.

Thanks for reading!

Subhro

Subhro provides valuable and informative content on SAS, offering a comprehensive understanding of SAS concepts. We have been creating SAS tutorials since 2019, and 9to5sas has become one of the leading free SAS resources available on the internet.
