In the example dataset – SASHELP.HEART, you want to create a new variable – Activity_status and assign one of three values of (‘High’, ‘MEDIUM’, ‘LOW’), at random to each observation. You can achieve these using 2 methods. Method 1: Using an array data heart(drop=id _1-_3 keep=status sex bp_status chol_status activity_status); set sashelp.heart(obs=10); array _{3} $…
This program will change the case for all of the character variables in a SAS data set. The key here is using the _CHARACTER_ keyword in the ARRAY statement. This will create an array of all the character variables in the dataset. Once the array is created, you can apply any character functions on all…
Removing dashes and parentheses from phone numbers is one of the common data cleaning activities. This program uses the compress function to remove unwanted characters from a phone number. data phone_number; input Phone $ 1-20; Phone_number=compress(Phone, , 'kd'); datalines; (908)235-4490 (201) 555-77 99 ; title 'Removing dashes and parentheses from phone numbers'; proc print data=phone_number;…
Count missing and Non-missing values for each variable – In SAS, we often need to get the count of missing and non-missing values in a SAS dataset. The code used in this example uses PROC FORMAT to create the format for character and numeric variables to be either “non-missing” or “missing” and then use that…
Have you ever needed to know if a given variable exists is SAS data set? Below is a macro which will check for the existence of a variable in a dataset and will return a value of 0 if the variable does not exist and the column number if the variable exists. The macro has…
The macro is dynamic to generate multiple Excel files containing the frequency results for each unique BY-group. (e.g., Type). In this SAS example, PROC SQL SELECT code is embedded inside a macro. The SELECT query processes the column (passed in the macro parameter) then create a macro variable with the number of unique (distinct) value…
The FINFO functions return the value of a file specified. You can get six attributes named ‘info items’ through the use of the FINFO function. To read multiple within a directory, the DOPEN, DNUM and DREAD functions can be used. The DOPEN function is similar to the FOPEN function and it returns a directory identifier…
Rolling Data also known as Moving average is a time-based calculations to get an insight of trends for a defined period of time. If time frame for moving average is 12 months, on each month the data which is 13 month old be dropped and the new months data will be added. In the below…