SAS Numeric functions are used to carry out tasks such as rounding numbers, computing dates from month-day-year values, summing and averaging the values of SAS variables, and many more.
For character functions in SAS, read the post “The Ultimate Guide To Character Functions In SAS”.
Before we begin with the numeric functions, it is important to understand the operator hierarchy in SAS.
Expressions are used in a number of ways, both within the DATA step and in PROC steps. Regardless of where an expression is used, its evaluation follows certain rules.
The table below groups the SAS operators according to their priority of evaluation.
Parentheses are used to group operands. Operators within parentheses are evaluated first.
Evaluation starts with Group 1, proceeds to Group 2, and continues down to Group 7.
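As a small illustration of these evaluation rules, the sketch below compares a few expressions with and without parentheses. The variable names a, b, c, and d are hypothetical and exist only for this example.

```sas
data _null_;
    a = 2 + 3 * 4;     /* multiplication is evaluated before addition: 14 */
    b = (2 + 3) * 4;   /* parentheses are evaluated first: 20 */
    c = 2 ** 3 ** 2;   /* exponentiation is evaluated right to left: 2 ** 9 = 512 */
    d = 10 - 4 - 3;    /* subtraction is evaluated left to right: 3 */
    put a= b= c= d=;
run;
```

When in doubt about the order of evaluation, adding parentheses makes the intent explicit.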
Logical and Comparison Operators in Assignment Statements
Logical and comparison operators can be combined to eliminate multiple IF-ELSE statements.
For example, suppose we have to assign each person to a group from 1 to 4, based on whether the person is male or female and whether the weight is greater than 90 or not. Using IF-ELSE conditions, we would write the logic as below.
if sex = 'M' and weight > 90 then group=1;
else if sex = 'M' and weight le 90 then group=2;
else if sex = 'F' and weight > 90 then group=3;
else if sex = 'F' and weight le 90 then group=4;
Or, we could rewrite the comparisons as below using logical and comparison operators. Each parenthesized comparison evaluates to 1 (true) or 0 (false), so only one term contributes to the sum.
data class;
    set sashelp.class;
    group = 1*(Sex = 'M' and weight > 90)
          + 2*(Sex = 'M' and weight le 90)
          + 3*(Sex = 'F' and weight > 90)
          + 4*(Sex = 'F' and weight le 90);
run;
An expression with a compound inequality very often contains a variable between two values, which form the upper and lower limits of a range.
In the DATA step, this compound expression is effectively interpreted as two distinct inequalities joined with an AND.
if (weight > 80) and (weight <= 100);
The above condition can be rewritten as
if 80 lt weight le 100;
The value of weight must satisfy both conditions (greater than 80 and less than or equal to 100) for the overall expression to evaluate as true.
Misplacing the parentheses completely changes the way the expression is evaluated.
if (80 lt weight) le 100;
The inequality inside the parentheses is evaluated to True or False (1 or 0), and the result is then compared to 100. This expression is true for all values of weight, because both 0 and 1 are less than 100.
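The difference between the two forms can be checked with a short DATA step. This is a minimal sketch; the test weights 52, 85, and 120 and the variable names correct and misplaced are chosen only for illustration.

```sas
data _null_;
    do weight = 52, 85, 120;
        /* interpreted as (80 lt weight) and (weight le 100) */
        correct   = (80 lt weight le 100);
        /* the 0/1 result of the inner comparison is compared to 100 */
        misplaced = ((80 lt weight) le 100);
        put weight= correct= misplaced=;
    end;
run;
```

The log should show correct=1 only for weight=85, while misplaced=1 for all three weights.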
In the macro language compound inequalities are not evaluated the same way as in the DATA step.
Even when the resolved value of the macro variable suggests that the result should be false, the expression evaluates to TRUE.
%if 80 lt &weight le 100 %then %put Weight is between 80 and 100;
This happens because the macro language does not break the expression into two distinct inequalities. Instead, it evaluates the expression as if there were parentheses around the first comparison.
%if (80 lt &weight) le 100 %then %put Weight is between 80 and 100;
The expression is evaluated left to right, and (80 lt &weight) will be either TRUE or FALSE (1 or 0). Either way, because both 0 and 1 are less than 100, the overall expression will be TRUE regardless of the value of &weight.
Assuming weight = 52:
    80 < 52 evaluates to 0 (False)
    0 <= 100 evaluates to 1 (True)
Assuming weight = 85:
    80 < 85 evaluates to 1 (True)
    1 <= 100 evaluates to 1 (True)
Hence, the expression always evaluates to True.
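One way to get the intended range check in the macro language is to write the two inequalities explicitly and join them with AND. The sketch below assumes a hypothetical macro named check_weight with an integer weight parameter; it is an illustration, not part of the original example.

```sas
%macro check_weight(weight);
    /* write each inequality separately so that both are actually tested */
    %if (80 lt &weight) and (&weight le 100) %then
        %put Weight &weight is between 80 and 100;
    %else
        %put Weight &weight is outside the range;
%mend check_weight;

%check_weight(52)
%check_weight(85)
```

With these calls, the first should report that 52 is outside the range and the second that 85 is between 80 and 100.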
Numeric Expressions and Boolean Transformations
Sometimes, you may need to transform numeric values to Boolean (0 or 1) values.
The double negation (NOT) is used to perform the transformation of numeric values to 0 or 1. Since negation is a Boolean operator, it converts the original value to either a zero or a one.
SAS treats 0 and missing as false and every other value as true, so missing values also map to 0.
x = ^age;
y = ^^age;
In the above example, X is 0 and Y is 1 whenever age is non-missing and non-zero. If age is missing (or 0), the single negation returns 1 and the double negation returns 0.
You can filter missing or non-missing values by writing either of the conditions below.
if age = .;   /* keep only missing ages */
if age ^= .;  /* keep only non-missing ages */
However, nearly the same result can be achieved by using negation operators as below. Note that ^age is also true when age is 0, so the two approaches differ only for zero values.
if ^age;   /* keep only missing (or zero) ages */
if ^^age;  /* keep only non-missing, non-zero ages */
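The behavior of single and double negation for missing, zero, and ordinary values can be checked with a short sketch. The test values and the variable names single and double are chosen only for illustration.

```sas
data _null_;
    do age = ., 0, 15;
        single = ^age;    /* 1 when age is missing or 0, otherwise 0 */
        double = ^^age;   /* 0 when age is missing or 0, otherwise 1 */
        put age= single= double=;
    end;
run;
```

The log should show single=1 double=0 for the missing and zero values, and single=0 double=1 for age=15.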
SAS Numeric functions for replacing missing values to 0
The COALESCE function returns the first non-missing value among its arguments. Supplying 0 as the last argument therefore replaces a missing value with 0.
If you want to convert all missing values to 0 and all other values to 1 (including 0) you can use the negation of the MISSING function.
data class;
    set sashelp.class;
    if age=12 then age=.;
    z=coalesce(age, 0);
    m=^missing(age);
run;
SAS Numeric functions to Round and Truncate Numeric Values
Two of the most useful SAS numeric functions in this category are ROUND and INT.
ROUND is used to round numbers to the nearest integer or to other place values. If a place value is not provided, a default value of 1 is used and the argument is rounded to the nearest integer.
data values;
    x1=round(331.456,.01);
    x2=round(331.456);
    x3=round(331.456,10);
    x4=round(331.456,50);
    put x1= x2= x3= x4=;
run;
x1=331.46 x2=331 x3=330 x4=350
INT, FLOOR, and CEIL function
The INT function returns the integer part of a numeric value. For a positive argument, it returns the same result as the FLOOR function, and for a negative argument, it returns the same result as the CEIL function.
The FLOOR function returns the largest integer that is less than or equal to its argument, and the CEIL function returns the smallest integer that is greater than or equal to its argument.
data values;
    x1=int(23.45);
    x2=int(-23.45);
    x3=ceil(23.45);
    x4=floor(-23.45);
    x5=floor(23.45);
    put x1= x2= x3= x4= x5=;
run;
x1=23 x2=-23 x3=24 x4=-24 x5=23
SAS Numeric functions that work with the Missing Values
A period represents a missing numeric value in SAS whereas a character missing value is represented by a blank.
You can use the MISSING function to determine missing values for both numeric and character variables. This function returns true (1) if its argument is a missing value; otherwise it returns false (0).
data values;
    input @1 var1 3. @5 var2 3.;
    x1=missing(var1);
    x2=missing(var2);
    datalines;
127 988
195
;
run;
proc print;
run;
Setting Character and Numeric Values to Missing
With the CALL MISSING routine, you can set one or more character and/or numeric variables to missing. If you are using a variable list such as A1-A10, you need to precede the list with the keyword OF.
data class;
    set sashelp.class;
    call missing(name,age);
run;
Alternatively, you can use call missing(of _all_); to set all variables to missing.
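The use of the OF keyword with a variable list can be sketched as follows. The variables a1-a3 and name are hypothetical and are created inside the step so that the example runs on its own.

```sas
data _null_;
    array a{3} a1-a3 (1 2 3);   /* hypothetical numeric variables */
    name = 'Alfred';            /* hypothetical character variable */
    put 'Before: ' a1= a2= a3= name=;
    call missing(of a1-a3);     /* OF is required for a variable list */
    call missing(name);         /* character variables are set to blank */
    put 'After:  ' a1= a2= a3= name=;
run;
```

After the calls, the numeric variables should print as a period and the character variable as a blank.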
SAS Numeric functions for Descriptive Statistics
There are several SAS numeric functions that you can use to compute statistical results such as means, standard deviation and many other statistical calculations.
The N function
N function returns the number of non-missing numeric values among its arguments.
data UsingN;
    x1=n(10, 0, ., 20, 50, .);
    x2=n(10, 0);
    x3=n(of x1-x2);
    put x1= x2= x3=;
run;
Each call returns the number of non-missing values among its arguments, and the results are stored in x1, x2, and x3.
x1=4 x2=2 x3=2
The NMISS function returns the number of missing values in the list. NMISS requires its arguments to be numeric, whereas CMISS works with both numeric and character values.
data UsingNmiss;
    x1=nmiss(10, 0, ., 20, 50, .);
    x2=nmiss(10, 0);
    x3=nmiss(of x1-x2);
    put x1= x2= x3=;
run;
The output produced is as below.
x1=2 x2=0 x3=0
MEAN function computes the mean of its arguments.
data UsingMean;
    x1=mean(2, ., ., 6);
    put x1=;
run;
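The MEAN function ignores missing arguments, which is the main difference from computing the average by hand. The following sketch contrasts the two approaches; the variable names a, b, c, mean_fn, and manual are chosen only for illustration.

```sas
data _null_;
    a = 2; b = .; c = 6;
    mean_fn = mean(a, b, c);     /* missing value is ignored: (2 + 6) / 2 = 4 */
    manual  = (a + b + c) / 3;   /* a missing operand makes the whole result missing */
    put mean_fn= manual=;
run;
```

The log should show mean_fn=4 and manual=. along with a note that missing values were generated.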
MAX and MIN functions
MAX and MIN are two SAS numeric functions that return the largest and smallest values of their arguments.
libname files '/folders/myfolders';
data files.scores;
    input ID $ Score1-Score10;
    if n(of Score1-Score10) ge 7 then Score=mean(of Score1-Score10);
    MaxScore=max(of Score1-Score10);
    MinScore=min(of Score1-Score10);
    datalines;
A001 4 1 3 9 1 2 3 5 . 3
A002 3 5 4 2 . . . 2 4 .
A003 9 8 7 6 5 4 3 2 1 5
;
run;
proc print data=files.scores;
run;
In the above example, the mean score is calculated only if the number of non-missing values is at least 7. The number of non-missing values for A002 is 6, therefore the mean is not calculated for this ID.
A001 has 9 non-missing values so the mean is computed by adding up the nine values and dividing by 9.
In the example code below, the numbers of non-missing and missing values are calculated for each ID.
data scores(keep=ID NonMissingScores_N MissingScores_NMISS);
    set files.scores;
    NonMissingScores_N=n(of Score1-Score10);
    MissingScores_NMISS=nmiss(of Score1-Score10);
run;
LARGEST function extracts the nth largest value, given a list of variables.
The first argument of the LARGEST function tells SAS which value you want—1 gives you the largest value, 2 gives you the second largest, and so on.
To find the second-largest score from the above example, you could use the LARGEST function as below.
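A minimal sketch of such a call is shown here. To keep it self-contained, it types in the A001 scores directly through an array instead of reading files.scores, and the variable name SecondLargest is an assumption made for this example.

```sas
data _null_;
    /* scores for ID A001 from the example above, including one missing value */
    array Score{10} Score1-Score10 (4 1 3 9 1 2 3 5 . 3);
    SecondLargest = largest(2, of Score1-Score10);   /* 9 is the largest, 5 is next */
    put SecondLargest=;
run;
```

The missing score is ignored, so the log should show SecondLargest=5.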
SMALLEST functions return the smallest value in the list.
Note that duplicate values in the list are not collapsed: each occurrence counts as a separate value when the values are ranked.
For example, if the values in the list are (4, 1, 3, 9, 1, 2, 3, 5), smallest(2, <list>) would return 1 and not 2.
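The effect of duplicates can be seen by asking for the first three smallest values of that list. This is a sketch only; the array v1-v8 and the loop variable k are hypothetical names.

```sas
data _null_;
    array v{8} v1-v8 (4 1 3 9 1 2 3 5);
    do k = 1 to 3;
        /* each duplicate occupies its own rank */
        kth_smallest = smallest(k, of v1-v8);
        put k= kth_smallest=;
    end;
run;
```

The log should show 1 for k=1 and k=2, and 2 only for k=3.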
Similarly, the second largest from the below example would return 789.
Function for calculating the SUM of observations
You could calculate the sum of several variables using Score1+Score2+Score3. However, there is a more convenient way of computing the sum in SAS.
The SUM function calculates the sum of its arguments.
sumScore = sum(of score1-score3);
The SUM function has the added advantage of ignoring missing values. For example, if score2 is missing, the + operator returns a missing value, whereas the SUM function returns the sum of score1 and score3.
In addition to this, you could set a value that will be returned in case all values are missing. In the below example, 0 will be returned if all the scores are missing.
sumScore = sum(0,of score1-score3);
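The three behaviors described above can be compared side by side. The sketch below uses made-up score values and hypothetical result names (plusTotal, sumTotal, allMissing, withZero).

```sas
data _null_;
    score1 = 10; score2 = .; score3 = 5;
    plusTotal = score1 + score2 + score3;   /* missing, because score2 is missing */
    sumTotal  = sum(of score1-score3);      /* 15, the missing value is ignored */

    call missing(of score1-score3);         /* now every score is missing */
    allMissing = sum(of score1-score3);     /* missing, nothing to add */
    withZero   = sum(0, of score1-score3);  /* 0, thanks to the extra argument */
    put plusTotal= sumTotal= allMissing= withZero=;
run;
```

The log should show plusTotal=. sumTotal=15 allMissing=. withZero=0.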
SAS numeric functions for performing mathematical operations
There are mathematical functions that you can use in SAS to perform operations such as finding the absolute values, square root, log and so on.
The ABS function returns the absolute value of its numeric argument. In other words, it removes the negative sign from the value.
As the name suggests, the SQRT function returns the square root of its argument.
LOG function takes the natural logarithm of its argument. You can use the LOG10 function to return a base 10 log.
EXP function raises e (the base of natural logarithms) to the value provided in its argument.
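A short sketch of these functions with easy-to-check arguments follows. The variable names are chosen only for this illustration.

```sas
data _null_;
    abs_val = abs(-5);      /* 5 */
    root    = sqrt(16);     /* 4 */
    nat_log = log(1);       /* 0, since e ** 0 = 1 */
    log_10  = log10(100);   /* 2, since 10 ** 2 = 100 */
    exp_val = exp(0);       /* 1 */
    put abs_val= root= nat_log= log_10= exp_val=;
run;
```

Note that SQRT and LOG return a missing value, with a note in the log, when the argument is outside their domain (for example, a negative number).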
CONSTANT function returns values of commonly used mathematical constants such as pi and e.
You can find the list of all valid constants on the SAS documentation website.
Another important use of this function is to find the largest integer that can be stored exactly in a numeric variable of a given length in bytes.
Integer = constant('exactint',3);
The function returns 8192, which means that for a numeric variable of length 3, the largest integer you can represent without losing accuracy is 8,192.
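To see how the exact-integer limit grows with the variable length, the same call can be placed in a loop. This is a sketch under the assumption that lengths 3 through 8 are valid on your host; the names len and max_exact are hypothetical.

```sas
data _null_;
    do len = 3 to 8;
        /* largest integer stored exactly in a numeric variable of this length */
        max_exact = constant('exactint', len);
        put len= max_exact= comma24.;
    end;
run;
```

The values depend on the operating environment, so it is worth running this once before shortening numeric variables with a LENGTH statement.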
SAS numeric functions for Generating Random Numbers
Random-number functions generate streams of random numbers starting from an initial point called the seed.
The RANUNI function generates random numbers between 0 and 1.
First, you need to provide a seed number to be used to generate the first number in the random sequence.
If the seed is 0 or a negative number, SAS uses the computer’s clock to supply the seed.
If you choose any positive integer, that number is used as a seed.
SAS recommends that you choose a seed of at least 7 digits. If you supply the seed, the program generates the same sequence of random numbers every time you run the program.
With a 0 seed, the sequence is different every time you run the program.
To generate random integers in the range from 1 to 10, you could use:
RandomInteger = int(ranuni(0)*10 + 1);
data random_numbers;
    call streaminit(3);
    do i=1 to 10;
        simple_random=ranuni(2);
        uniform_dist=rand('uniform');
        normal_dist=rand('normal');
        x=rand("Integer", 1, 10);
        /* For generating random integer between 1 to 5 */
        random_int=int(ranuni(0) * 5 + 1);
        /* For generating random integer between 100 to 500 */
        random_int2=int(ranuni(0) * 401 + 100);
        output;
    end;
    drop i;
run;
The STREAMINIT subroutine sets the random number seed for the RAND function in the DATA step. This seed value controls the sequence of random numbers. If you do not call it, SAS uses the system clock to generate the initial seed.
UNIFORM argument specifies that the numbers generated are uniformly distributed.
You can read more about random numbers distribution from the SAS documentation website.
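The effect of a fixed seed can be checked by generating the same stream twice and comparing the results. The sketch below is illustrative only: the data set names run1 and run2 and the seed 12345 are arbitrary choices.

```sas
/* first run with a fixed seed */
data run1;
    call streaminit(12345);
    do i = 1 to 5;
        u = rand('uniform');
        output;
    end;
run;

/* second run with the same seed */
data run2;
    call streaminit(12345);
    do i = 1 to 5;
        u = rand('uniform');
        output;
    end;
run;

/* with identical seeds, PROC COMPARE should report no unequal values */
proc compare base=run1 compare=run2;
run;
```

Removing the CALL STREAMINIT statements, or using different seeds in the two steps, should make the comparison report differences.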