A SAS format is an instruction that specifies how the value of a variable is printed or displayed; a SAS informat specifies how raw data is read.
What is a SAS format?
A SAS format tells SAS how to print a variable. The format and the informat for a variable may or may not be the same.
There are many built-in SAS formats and informats, and you can also create your own using the FORMAT procedure (PROC FORMAT).
SAS formats and informats fall into three main types: character, numeric, and date.
|Type||Informat Name||What it Does|
|Character||$w.||Reads in character data of length w.|
|Numeric||w.d||Reads in numeric data of length w with d decimal places.|
|Date||MMDDYYw.||Reads in date data in the form mm-dd-yy.|
For a complete list of built-in formats, refer to SAS 9.4 Formats and Informats: Reference.
Every variable in SAS will have a format – whether you assign one or you let SAS assign one automatically.
SAS informats are declared when you are reading in data or creating a new variable in a data step, whereas the format statement can be used in either a data step or a proc step:
- A FORMAT statement in a DATA step permanently stores the format with the variable.
- A FORMAT statement in a PROC step assigns the format only temporarily, for the PROC step in which it is used.
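The difference between the two cases can be sketched as follows. This is a minimal illustration, not a definitive recipe; the data set name `work.sales` and the variable `amount` are hypothetical.

```sas
/* Hypothetical data set and variable names, for illustration only */
data work.sales;
    amount = 1145.32;
    format amount dollar10.2;   /* stored with the data set */
run;

/* Uses the stored DOLLAR10.2 format */
proc print data=work.sales;
run;

/* COMMA10.2 applies only while this PROC PRINT step runs */
proc print data=work.sales;
    format amount comma10.2;
run;
```

After the second PROC PRINT finishes, the data set still carries the DOLLAR10.2 format assigned in the DATA step.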
FORMAT variable-name <$>format-name<w>.<d>;
- $ → indicates a character format; its absence indicates a numeric format.
- w → specifies the format width, which is the number of columns.
- d → specifies the number of decimal places in numeric formats.
A SAS format always contains a period (.) as part of its name. Default values are used if you omit the w and the d values from the format.
The d value that you specify with a SAS format indicates the number of decimal places. Formats do not truncate or change the internal data values.
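A small sketch of this point, assuming a hypothetical variable `x`: the format limits what is written, while later calculations still use the full stored value.

```sas
data _null_;
    x = 3.14159;
    put x 5.2;        /* writes 3.14 */
    y = x * 2;        /* uses the full stored value of x */
    put y 8.5;        /* writes 6.28318 */
run;
```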
For example, in DOLLAR10.2, the w value of 10 specifies a maximum of 10 columns for the value, and the d value of 2 specifies that two of these columns are for the decimal part of the value. This leaves eight columns for all the remaining characters in the value.
The remaining columns include the decimal point, the remaining numeric value, a minus sign if the value is negative, the dollar sign, and commas, if any.
If the format width is too narrow to represent a value, SAS tries to squeeze the value into the space available. If an adequate width is not specified, SAS prints asterisks. In the following example, the result is x=**.
x=123; put x= 2.;
Character formats truncate values on the right. Numeric formats sometimes revert to the BESTw.d format.
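Right truncation by a character format can be seen in a short sketch; the variable `name` and its value are made up for illustration.

```sas
data _null_;
    name = 'Formats';
    put name $3.;     /* writes For */
run;
```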
If incompatible SAS formats are used, such as using a numeric format to write character values, SAS first attempts to use a comparable format of the other type. If this attempt fails, an error message that describes the problem appears in the SAS log.
When the value of d is greater than fifteen, the precision of the decimal value after the 15th decimal place might not be accurate.
Ways to Specify SAS Formats
You can use formats in the following ways:
- In a PUT statement
- With the PUT, PUTC, or PUTN functions
- With the %SYSFUNC macro function
- In a FORMAT statement in a DATA step or a PROC step
- In an ATTRIB statement in a DATA step or a PROC step.
A PUT statement with a format after the variable name uses that format to write data values in a DATA step.
For example, the PUT statement with the DOLLARw.d format can be used to write the numeric value for AMOUNT as a dollar amount:
amount=1145.32; put amount dollar10.2;
The DOLLARw.d format in the PUT statement produces the result $1,145.32.
The PUT function applies a format to a value and returns the result as a character value, so it can be used to convert a numeric variable to a character variable.
For example, the following statement converts the value of a numeric variable into a two-character hexadecimal representation.
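A minimal sketch of such a statement, assuming a hypothetical numeric variable `num` that holds 10 (the text names only the result variable CHAR and the value 0A) and the built-in HEX2. format:

```sas
data _null_;
    num = 10;                 /* assumed input value */
    char = put(num, hex2.);   /* character result of length 2 */
    put char=;                /* writes char=0A */
run;
```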
The PUT function returns the value 0A, which is assigned to the variable CHAR.
The %SYSFUNC (or %QSYSFUNC) macro function executes SAS functions and applies the format to the result of the function outside of the DATA step.
For example, the following program writes a numeric value passed as a macro parameter as a dollar amount.
%macro test(amount);
  %put %sysfunc(putn(&amount,dollar10.2));
%mend test;
%test(1154.23);
SAS formats can be used with the FORMAT statement. Using a FORMAT statement permanently associates a format with a variable. SAS uses the format to write the values of the variable that you specify.
For example, the following statement in a DATA step associates the COMMAw.d numeric format with the variables SALES1 through SALES3:
format sales1-sales3 comma10.2;
Note: If you assign formats with a FORMAT statement before a PUT statement, all leading blanks are trimmed.
Formats that are associated with variables by using a FORMAT statement behave like formats that are used with a colon (:) modifier in a subsequent PUT statement.
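The behavior described in the note can be sketched like this, assuming a hypothetical variable `sales1`. The first PUT relies on the associated format, the second names the format explicitly:

```sas
data _null_;
    sales1 = 1234.5;
    format sales1 comma10.2;
    put sales1;              /* associated format, leading blanks trimmed */
    put sales1 comma10.2;    /* formatted output, right-aligned in 10 columns */
run;
```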
The ATTRIB statement can also be used to specify a format for one or more variables.
For example, the following ATTRIB statement permanently applies the COMMAw.d format to the variables SALES1 through SALES3:
attrib sales1-sales3 format=comma10.2;
Permanent and Temporary SAS Formats
When you specify a format in a PUT statement, SAS uses the format to write data values during the DATA step but does not permanently associate the format with that variable.
To permanently apply a format to a variable, you can use a FORMAT statement or an ATTRIB statement in a DATA step.
Using a FORMAT statement or an ATTRIB statement in a PROC step associates a format with a variable for that PROC step, as well as for any output data sets that the procedure creates that contain formatted variables.
SAS date formats
Date formats can be used wherever formats are supported in the SAS language, and the methods for applying them are the same as discussed earlier.
Built-in SAS Date formats
The table below shows how the data value corresponding with August 15, 2019 (SAS date 21776) would appear when the format is applied.
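As a quick check of the date value mentioned above, this sketch writes SAS date 21776 with a few commonly used built-in date formats. The selection is illustrative, not a complete list.

```sas
data _null_;
    d = 21776;               /* August 15, 2019 */
    put d date9.;            /* writes 15AUG2019 */
    put d mmddyy10.;         /* writes 08/15/2019 */
    put d yymmdd10.;         /* writes 2019-08-15 */
run;
```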
Built-in SAS time formats
The table below shows how the time value corresponding with 15 seconds after 1:14 PM (SAS time 47655) would appear when the format is applied.
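The same kind of check works for the time value. This sketch applies three built-in time formats to SAS time 47655, again as an illustrative selection.

```sas
data _null_;
    t = 47655;               /* 13 hours, 14 minutes, 15 seconds after midnight */
    put t time8.;            /* writes 13:14:15 */
    put t hhmm5.;            /* writes 13:14 */
    put t timeampm11.;       /* writes 1:14:15 PM */
run;
```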
Built-in SAS DateTime formats
The table below shows how the DateTime value corresponding with 15 seconds after 1:14 PM on December 31, 2018 (SAS DateTime 1,861,881,255) would appear when the format is applied.
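For the DateTime value, a minimal sketch with two built-in formats. DATETIME19. is used here because it is wide enough to show a four-digit year.

```sas
data _null_;
    dt = 1861881255;         /* December 31, 2018, 1:14:15 PM */
    put dt datetime19.;      /* writes 31DEC2018:13:14:15 */
    put dt dtdate9.;         /* date part only: 31DEC2018 */
run;
```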
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas: $1,000,000
You can remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, by applying the COMMA11. informat to this value.
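This can be tried directly with the INPUT function. The variable name `amount` in this sketch is hypothetical.

```sas
data _null_;
    amount = input('$1,000,000', comma11.);   /* strips $ and commas */
    put amount=;                              /* writes amount=1000000 */
run;
```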
- $ indicates a character informat; its absence indicates a numeric informat.
- informat names the informat. The informat is a SAS informat or a user-defined informat that was previously defined with the INVALUE statement in PROC FORMAT.
- w specifies the informat width, which for most informats is the number of columns in the input data.
- d specifies an optional decimal scaling factor in the numeric informats. SAS divides the input data by 10 to the power of d.
Unless you explicitly define a variable first, SAS uses the informat to determine whether the variable is numeric or character. SAS also uses the informat to determine the length of character variables.
Even though SAS can read up to 32 digits when you specify some numeric informats, numbers with more than 15 significant digits might lose precision due to the limitations of the eight-byte floating-point representation used by most computers.
Informats always contain a period (.) as a part of the name. If you omit the w and the d values from the informat, SAS uses default values. If the data contain decimal points, SAS ignores the d value and reads the number of decimal places that are actually in the input data.
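The effect of the d value can be sketched with the w.d informat, using hypothetical variables `a` and `b`. The first input has no decimal point, so it is divided by 10 to the power of 2. The second already contains a decimal point, so d is ignored.

```sas
data _null_;
    a = input('1234', 6.2);    /* no decimal point: 1234 / 10**2 */
    b = input('12.34', 6.2);   /* decimal point present: d is ignored */
    put a= b=;                 /* writes a=12.34 b=12.34 */
run;
```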
If the informat width is too narrow to read all the columns in the input data, you might get unexpected results.
This problem frequently occurs with the date and time informats. To avoid it, adjust the width of the informat to include any blanks or special characters between the day, month, year, or time.
When a problem occurs with an informat, SAS writes a note to the SAS log and assigns a missing value to the variable. Problems usually occur if you use an incompatible informat, such as a numeric informat to read character data, or if you specify the width of a date and time informat that causes SAS to read a special character in the last column.
You can use informats in the following ways:
- in an INPUT statement
- with the INPUT, INPUTC, and INPUTN functions
- in an INFORMAT statement in a DATA step or a PROC step
- in an ATTRIB statement in a DATA step or a PROC step.
The INPUT statement with an informat after a variable name is the simplest way to read values into a variable. For example, the following INPUT statement uses two informats:
data test;
  format dt mmddyy8.;
  input @1 id $3. @4 price comma10.3 @14 dt mmddyy10.;
  datalines;
ID1 $1250.03 02/14/2020
;
run;
The $w. character informat reads values into the variable id, and the COMMAw.d numeric informat reads values into the variable price. The MMDDYY10. informat reads the date into the variable dt, which the FORMAT statement then displays using MMDDYY8.
The INPUT function is used to convert a SAS character expression using a specified informat. The informat determines whether the resulting value is numeric or character.
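A minimal sketch of such a conversion, using the variable names TempCharacter and TemperatureNumber from the discussion here. The informat width of 4 is an assumption.

```sas
data _null_;
    TempCharacter = '98.6';
    TemperatureNumber = input(TempCharacter, 4.);   /* assumed width of 4 */
    put TemperatureNumber=;    /* writes TemperatureNumber=98.6 */
run;
```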
In the above example, the INPUT function is used in combination with the w.d informat that converts the character value of TempCharacter to a numeric value and assigns the numeric value 98.6 to TemperatureNumber.
You can use the PUT function with a SAS format to convert numeric values to character values. For a complete discussion of the INPUT function, visit https://9to5sas.com/variable-conversions-in-sas/
The INFORMAT statement applies an informat to a variable. SAS uses the specified informat in any subsequent INPUT statement to read values into the variable.
For example, in the following statements, the INFORMAT statement associates the $3. informat with the variable id, COMMA10.3 with the variable price, and MMDDYY10. with the variable dt.
data test;
  informat id $3. price comma10.3 dt mmddyy10.;
  format dt mmddyy8.;
  input id price dt;
  datalines;
ID1 $1250.03 02/14/2020
;
run;
An informat that is associated with an INFORMAT statement behaves like an informat that you specify with a colon (:) format modifier in an INPUT statement.
Therefore, SAS uses modified list input to read the variable, so the w value in an informat does not determine column positions or input field widths in an external file.
Blanks that are embedded in the input data are treated as delimiters unless you change the DLM= or DLMSTR= option in an INFILE statement.
For character informats, the w value specifies the length of the character variable. For numeric informats, the w value is ignored.
If the INPUT statement is coded for another style of input, such as formatted input or column input, that style of input is not used when you use the INFORMAT statement.
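For comparison, here is a sketch of the same kind of step written with the colon (:) modifier directly in the INPUT statement, which should read the data the same way as the INFORMAT statement version. The data set and values mirror the earlier example.

```sas
data test;
    format dt mmddyy8.;
    /* Colon modifier: modified list input, blanks act as delimiters */
    input id :$3. price :comma10.3 dt :mmddyy10.;
    datalines;
ID1 $1250.03 02/14/2020
;
run;
```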
The ATTRIB statement can also associate an informat, as well as other attributes, with one or more variables. For example, in the following statements, the ATTRIB statement associates the DATEw. informat with the variables Birthdate and Interview:
attrib Birthdate Interview informat=date9.;
input @63 Birthdate Interview;
If the informat is applied by using the INFORMAT= option in the ATTRIB statement, it behaves like an informat that you specify with a colon (:) format modifier in an INPUT statement.
In this case, SAS uses modified list input and reads the variables in the same way as it does for the INFORMAT statement.
SAS does not permanently apply the informat to the variable unless an INFORMAT statement or an ATTRIB statement is used.
Sometimes we may have formatted text that represents a date and/or time information, and we wish to store this information in a SAS data set as date, time, or datetime values.
SAS DateTime Informats
Most of the informats listed above provide some degree of flexibility in the exact formatting of the text they read.
For example, the TIME. informat will work with colons separating the hours, minutes, and seconds, but will also work with periods, slashes, dashes, and other characters.
It also works whether “PM” is in uppercase or lowercase.
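A sketch for trying this out, with hypothetical variables `t1` and `t2`. If the behavior described above holds, both values should be identical.

```sas
data _null_;
    t1 = input('1:14:15 PM', time10.);
    t2 = input('1:14:15 pm', time10.);
    put t1= t2=;    /* compare the two values in the log */
run;
```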
For even more flexibility, SAS also includes a set of three “ANY” informats (ANYDTDTEw., ANYDTDTMw., and ANYDTTMEw.) that will correctly interpret a wide variety of formatted date and time information.
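As an illustration, this sketch reads two differently formatted text dates with ANYDTDTEw., one of the “ANY” informats. The input strings and variable names are made up, and the result for ambiguous inputs can depend on system options such as DATESTYLE=.

```sas
data _null_;
    d1 = input('15AUG2019', anydtdte9.);
    d2 = input('2019-08-15', anydtdte10.);
    put d1= date9. d2= date9.;   /* display both results as dates */
run;
```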