SAS Numeric functions are used to carry out tasks such as rounding numbers, computing dates from month-day-year values, summing and averaging the values of SAS variables, and many more.
For character functions in SAS, read the “The Ultimate Guide To Character Functions In SAS” post.
Before we begin with the numerical functions, it is essential to understand the Operator hierarchy structure in SAS.
Expressions are used in several ways within the DATA and PROC steps. It is important to remember that the expression evaluation will follow specific rules regardless of their use.
The table below contains groups of SAS operators according to their evaluation priority.
Parentheses are used to group operands. Operators within parentheses are evaluated first.
Evaluation starts from Group 1, proceeds to Group 2, and continues down to Group 7.
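As a small illustration of these priority rules, the sketch below relies only on the standard SAS ordering in which exponentiation is evaluated before multiplication, and multiplication before addition. The variable names a and b are made up for the example.

```sas
data _null_;
  /* 4**2 = 16 first, then 3*16 = 48, then 2+48 = 50 */
  a = 2 + 3 * 4 ** 2;
  /* Parentheses are evaluated first: (2+3) = 5, then 5*16 = 80 */
  b = (2 + 3) * 4 ** 2;
  put a= b=;   /* writes a=50 b=80 to the log */
run;
```

When in doubt about the evaluation order, adding parentheses makes the intent explicit and costs nothing.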
Logical and Comparison Operators in Assignment Statements
Logical and comparison operators can be combined to eliminate multiple IF ELSE statements.
For example, if we have to assign a person to a group from 1 to 4, based on whether the person is male or female and whether the weight is greater than 90 or not, then using IF-ELSE conditions, we have to write the logic as below.
if sex = 'M' and weight > 90 then group=1; else if sex = 'M' and weight le 90 then group=2; else if sex = 'F' and weight > 90 then group=3; else if sex = 'F' and weight le 90 then group=4;
Or, we could rewrite the comparisons as below using logical and comparison operators. Each comparison in parentheses evaluates to 1 (true) or 0 (false), so only the matching term contributes to the sum.
data class; set sashelp.class; group= 1*(Sex = 'M' and weight > 90) + 2*(Sex = 'M' and weight le 90)+ 3*(Sex = 'F' and weight > 90) + 4*(Sex = 'F' and weight le 90); run;
An expression with a compound inequality very often contains a variable between two values, which form a range’s upper and lower limits.
This compound expression is effectively interpreted as two distinct inequalities with an AND.
if (weight > 80) and (weight <= 100);
The above condition can be rewritten as
if 80 lt weight le 100;
The weight value must satisfy both conditions (greater than 80 and less than or equal to 100) for the overall expression to evaluate as true.
Misplacing the parentheses changes the way that the expression is evaluated.
The inequality inside the parentheses is evaluated to true or false (1 or 0), and the result is compared to 100. This expression will be true for all values of weight (both 0 and 1 are less than 100).
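The effect of misplaced parentheses can be sketched as follows. The exact placement shown, (80 lt weight) le 100, is an assumption inferred from the description above, and the variable names in_range and misplaced are hypothetical.

```sas
data _null_;
  do weight = 52, 85, 150;
    /* Compound inequality: true only when 80 < weight <= 100 */
    in_range  = (80 lt weight le 100);
    /* Misplaced parentheses: (80 lt weight) is 0 or 1, which is always le 100 */
    misplaced = ((80 lt weight) le 100);
    put weight= in_range= misplaced=;
  end;
run;
```

For these three weights, in_range is 1 only for 85, while misplaced is 1 every time.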
In the macro language, compound inequalities are not evaluated the same way as in the DATA step.
Even when the resolved value of the macro variable suggests that the result should be false, the expression below evaluates as TRUE.
%if 80 lt &weight le 100 %then %put between 80 and 100;
This happens because expression in macro language is not broken into two distinct inequalities. Instead, it is evaluated as if there are parentheses around the first portion of the comparison.
%if (80 lt &weight) le 100 %then %put between 80 and 100;
The expression is evaluated left to right, and (80 lt &weight) will be either TRUE or FALSE (1 or 0). Since both 1 and 0 are less than or equal to 100, the overall expression will be TRUE for any value of &weight.
assuming weight = 52
80 < 52 evaluates to 0 or False
0 < 100 evaluates to 1 or True
assuming weight = 85
80 < 85 evaluates to 1 or True
1 < 100 evaluates to 1 or True
Hence, the expression always evaluates to True.
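One way to get the intended range check in the macro language is to write the two inequalities explicitly and join them with AND. The macro name check_weight below is hypothetical, and the sketch assumes integer weights, since %IF uses integer arithmetic for its comparisons.

```sas
%macro check_weight(weight);
  /* Two explicit comparisons joined by AND behave like the DATA step range check */
  %if (80 lt &weight) and (&weight le 100) %then
    %put &weight is between 80 and 100;
  %else
    %put &weight is outside the range;
%mend check_weight;

%check_weight(52)
%check_weight(85)
```

With this form, 52 is reported as outside the range and 85 as inside it.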
Numeric Expressions and Boolean Transformations
Sometimes, you may need to transform numeric values to Boolean (0 or 1) values.
The double negation (NOT) is used to transform numeric values to 0 or 1. Since negation is a Boolean operator, it converts the original value to either a zero or a one.
SAS treats 0 and missing as false, and every other value as true. As a result, missing values also map to 0 under double negation.
x= ^age; y=^^age;
In the above example, X is 0 and Y is 1 whenever age is non-missing and nonzero. If age is missing, the single negation returns 1 and the double negation returns 0.
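The three possible cases (missing, zero, and any other value) can be checked with a short loop. This is a minimal sketch; the test values are arbitrary.

```sas
data _null_;
  do age = ., 0, 14;
    x = ^age;    /* 1 when age is missing or 0, otherwise 0 */
    y = ^^age;   /* 0 when age is missing or 0, otherwise 1 */
    put age= x= y=;
  end;
run;
```

The log shows x=1 y=0 for the missing and zero ages, and x=0 y=1 for age 14.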
You can filter missing values or non-missing values by writing a condition as
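if age = .; /* To keep only missing ages */
if age ^= .; /* To keep only non-missing ages */
A similar result can be achieved using negation operators as below. Note that negation also treats an age of 0 as false, so the two forms differ when zero is a valid value.
if ^age; /* To keep only missing (or zero) ages */
if ^^age; /* To keep only non-missing, nonzero ages */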
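SAS Numeric functions for replacing missing values with 0
If you want to replace missing values with 0 and leave all other values unchanged, you can use the COALESCE function. If you instead want to convert all missing values to 0 and all other values (including 0) to 1, you can use the negation of the MISSING function.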
data class; set sashelp.class; if age=12 then age=.; z=coalesce(age, 0); m=^missing(age); run;
SAS Numeric functions to Round and Truncate Numeric Values
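Two of the most useful SAS numeric functions in this category are the ROUND and INT functions. ROUND rounds its first argument to the nearest multiple of its second argument, or to the nearest integer when the second argument is omitted.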
data values; x1=round(331.456,.01); x2=round(331.456); x3=round(331.456,10); x4=round(331.456,50); put x1= x2= x3= x4=; run;
x1=331.46 x2=331 x3=330 x4=350
INT, FLOOR, and CEIL functions
The INT function returns the integer part of a numeric value. For a positive argument, it returns the same result as the FLOOR function, and for a negative argument, it returns the same result as the CEIL function.
The FLOOR and CEIL functions return the largest integer less than or equal to the argument and the smallest integer greater than or equal to the argument, respectively.
data values; x1=int(23.45); x2=int(-23.45); x3=ceil(23.45); x4=floor(-23.45); x5=floor(23.45); put x1= x2= x3= x4= x5=; run;
x1=23 x2=-23 x3=24 x4=-24 x5=23
SAS Numeric functions that work with the Missing Values
A period represents a missing numeric value in SAS, whereas a character missing value is represented by a blank.
You can use the MISSING function to check for missing values in both numeric and character variables. It returns 1 (true) if its argument is missing and 0 (false) otherwise.
data values;
input @1 var1 3. @5 var2 3.;
x1=missing(var1);
x2=missing(var2);
datalines;
127 988
195
;
run;
proc print; run;
For more information, see our guide on Working with Missing Values in SAS
Setting Character and Numeric Values to Missing
With the CALL routine – CALL MISSING, you can set one or more character and/or numeric variables to missing. If you use a variable list such as A1–A10, you must precede the list with the keyword OF.
data class; set sashelp.class; call missing(name,age); run;
Or you can use call missing(of _all_); to set all variables to missing.
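The OF keyword with a numbered variable list can be sketched as below. The variables a1 to a10 and their initial values are made up for the illustration.

```sas
data _null_;
  /* Ten numeric variables with illustrative starting values */
  array a[10] a1-a10 (1 2 3 4 5 6 7 8 9 10);
  /* A variable list must be preceded by OF */
  call missing(of a1-a10);
  put a1= a10=;   /* both are now missing */
run;
```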
SAS Numeric functions for Descriptive Statistics
There are several SAS numeric functions that you can use to compute statistical results such as means, standard deviation and many other statistical calculations.
The N function
N function returns the number of non-missing numeric values among its arguments.
data UsingN; x1=n(10, 0, ., 20, 50, .); x2=n(10, 0); x3=n(of x1-x2); put x1= x2= x3=; run;
Each call returns the number of non-missing values among its arguments. For x3, the arguments are the variables x1 and x2, both of which are non-missing.
x1=4 x2=2 x3=2
The NMISS function returns the number of missing values in the list. NMISS requires its arguments to be numeric, whereas CMISS works for both numeric and character values.
data UsingNmiss; x1=nmiss(10, 0, ., 20, 50, .); x2=nmiss(10, 0); x3=nmiss(of x1-x2); put x1= x2= x3=; run;
The output produced is as below.
x1=2 x2=0 x3=0
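To illustrate the difference between NMISS and CMISS, the following sketch counts missing values across a mix of character and numeric variables. The variable names and values are hypothetical.

```sas
data _null_;
  name   = ' ';   /* missing character value */
  age    = .;     /* missing numeric value */
  height = 60;
  /* CMISS accepts both character and numeric arguments */
  n_missing = cmiss(name, age, height);
  put n_missing=;   /* writes n_missing=2 */
run;
```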
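The MEAN function computes the mean of its non-missing arguments. In the example below, the two missing values are ignored, so the result is (2 + 6) / 2 = 4.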
data UsingMean; x1=mean(2, ., ., 6); put x1=; run;
MAX and MIN functions
MAX and MIN are the two SAS numeric functions that return the largest and smallest values of their arguments, respectively.
libname files '/folders/myfolders';
data files.scores;
input ID $ Score1-Score10;
/* Compute the mean only when at least 7 scores are non-missing */
if n(of Score1-Score10) ge 7 then Score=mean(of Score1-Score10);
MaxScore=max(of Score1-Score10);
MinScore=min(of Score1-Score10);
datalines;
A001 4 1 3 9 1 2 3 5 . 3
A002 3 5 4 2 . . . 2 4 .
A003 9 8 7 6 5 4 3 2 1 5
;
run;
proc print data=files.scores; run;
The above example calculates the mean score only if the number of non-missing values is at least 7. Since the number of non-missing values for A002 is 6, the mean is not calculated for this ID.
A001 has nine non-missing values, so the mean is computed by adding the nine values and dividing by 9.
In the example code below, the counts of non-missing and missing values are calculated for the respective IDs.
data scores(keep=ID NonMissingScores_N MissingScores_NMISS); set files.scores; NonMissingScores_N=n(of Score1-Score10); MissingScores_NMISS=nmiss(of Score1-Score10); run;
LARGEST function extracts the nth largest value, given a list of variables.
The first argument of the LARGEST function tells SAS which value you want—1 gives you the largest value, 2 gives you the second largest, and so on.
To find the second-largest score from the above example, you could use the LARGEST function as below.
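A minimal sketch of such a call, reusing the score data from the earlier example as in-stream data so that it runs on its own, might look like this. The data set name second_best and the variable name SecondHighest are hypothetical.

```sas
data second_best;
  input ID $ Score1-Score10;
  /* First argument 2 requests the second-largest non-missing value */
  SecondHighest = largest(2, of Score1-Score10);
  datalines;
A001 4 1 3 9 1 2 3 5 . 3
A002 3 5 4 2 . . . 2 4 .
A003 9 8 7 6 5 4 3 2 1 5
;
run;

proc print data=second_best;
  var ID SecondHighest;
run;
```

For these rows, the second-largest scores are 5, 4, and 8.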
Similarly, the SMALLEST function returns the nth smallest value in the list.
Note that if there are duplicate values in the list, each duplicate is counted separately when the values are ranked, so the LARGEST or SMALLEST function can return the same value for consecutive ranks.
For example, if the values in the list are (4, 1, 3, 9, 1, 2, 3, 5). smallest(2,<list>) would return 1, not 2.
Similarly, the second largest from the below example would return 789.
Function for calculating the SUM of observations
You could calculate the sum of observations using Score1+Score2+Score3. However, there is a more efficient way of computing the sum of observations in SAS.
SUM function calculates the sum of observations provided in its argument.
sumScore = sum(of score1-score3);
The sum function has the added advantage of ignoring the missing values.
For example, if score2 is missing, then using the + operator would result in a missing value, whereas the SUM function calculates and returns the sum of score1 and score3.
In addition, you could set a value that will be returned in case all values are missing. In the below example, 0 will be returned if all the scores are missing.
sumScore = sum(0,of score1-score3);
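The difference between the + operator, the SUM function, and SUM with a leading 0 can be seen side by side in the sketch below. The score values and the result variable names are made up for the illustration.

```sas
data _null_;
  score1 = 10; score2 = .; score3 = 5;
  plus_total = score1 + score2 + score3;    /* missing, because score2 is missing */
  sum_total  = sum(of score1-score3);       /* 15, the missing value is ignored */

  call missing(of score1-score3);           /* now every score is missing */
  all_missing   = sum(of score1-score3);    /* missing */
  with_fallback = sum(0, of score1-score3); /* 0 */
  put plus_total= sum_total= all_missing= with_fallback=;
run;
```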
SAS numeric functions for performing mathematical operations
You can use mathematical functions in SAS to perform operations such as finding the absolute values, square root, log and so on.
The ABS function returns the absolute value of a number. In other words, it removes the negative sign from the value.
As the name suggests, the SQRT function returns the square root of its argument.
The LOG function returns the natural logarithm of its argument. You can use the LOG10 function to return a base 10 log.
EXP function raises e (the base of natural logarithms) to the value provided in its argument.
CONSTANT function returns values of commonly used mathematical constants such as pi and e.
You can find the list of all valid constants on the SAS Documentation website.
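The functions described above can be tried together in one step. The input values in this sketch are chosen only so that the results are easy to check by hand.

```sas
data _null_;
  abs_val = abs(-7.5);       /* 7.5 */
  root    = sqrt(16);        /* 4 */
  nat_log = log(1);          /* 0, since e**0 = 1 */
  log_10  = log10(100);      /* 2 */
  e_power = exp(0);          /* 1 */
  pi      = constant('pi');  /* the constant pi */
  e       = constant('e');   /* the base of natural logarithms */
  put abs_val= root= nat_log= log_10= e_power= pi= e=;
run;
```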
Another important use of this function is to find the largest integer that can be stored exactly, without losing information, in a numeric variable of a given length in bytes.
Integer = constant('exactint',3);
The function returns 8192, which means that for a numeric variable with a length of 3 bytes, the largest integer you can represent without losing accuracy is 8,192.
SAS numeric functions for Generating Random Numbers
Random-number functions generate streams of random numbers starting from an initial point called the seed.
The RANUNI function generates random numbers ranging between 0 and 1.
First, you need to provide a seed number to be used to generate the first number in the random sequence.
If the seed is 0 or a negative number, SAS uses the computer’s clock to supply the seed.
If you choose any positive integer, that number is used as a seed.
SAS recommends that you choose a seed of at least seven digits. If you supply the seed, the program generates the same sequence of random numbers every time you run it.
With a 0 seed, the program’s sequence is different every time you run.
To generate random integers in the range from 1 to 10, you could use:
RandomInteger = int(ranuni(0)*10 + 1);
data random_numbers; call streaminit(3); do i=1 to 10; simple_random=ranuni(2); uniform_dist=rand('uniform'); normal_dist=rand('normal'); x=rand("Integer", 1, 10); /* Random integer from 1 to 5 */ random_int=int(ranuni(0) * 5 + 1); /* Random integer from 100 to 500 (401 possible values) */ random_int2=int(ranuni(0) * 401 + 100); output; end; drop i; run;
The CALL STREAMINIT routine sets the random number seed for the RAND function in the DATA step. This seed value controls the sequence of random numbers. If you do not call it, SAS derives the seed from the system clock.
UNIFORM argument specifies that the numbers generated are uniformly distributed.
You can read more about random numbers distribution from the SAS documentation website.