Column input in SAS is a method for reading raw data files that do not have spaces or other delimiters between all the values, or that do not use periods for missing numeric data. Each value is read from a fixed range of columns in the input record.
Advantages of Column input in SAS
Column input has the following advantages over list input:
- Spaces are not required between values
- Missing values can be left blank
- Character data can have embedded spaces
- You can skip unwanted variables.
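As a rough illustration of these points, the sketch below reads fixed-column records in which the age sits directly against the next field, one age is left blank, one name contains a space, and columns 13-15 are skipped. The data set name, variable names, column positions, and data values are all made up for this example.

```sas
data scores;
  /* Hypothetical layout: name in columns 1-10, age in 11-12, score in 16-18. */
  /* Columns 13-15 hold a code that is not wanted, so no variable reads them. */
  input name $ 1-10 age 11-12 score 16-18;
  datalines;
Sue Ellen 14XYZ 87
Joseph      ABC 92
Mitchel   13DEF100
;
run;

proc print data=scores;
run;
```

In the resulting data set, age should be missing for Joseph because columns 11-12 are blank on that record, and the three-character codes in columns 13-15 are never read.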
Syntax of Column Input in SAS
INPUT variable <$> start-column <- end-column> <.decimals> <@ | @@>;
- $ – indicates that the variable has character values. The $ is not required if the variable was previously defined as character.
- start-column – specifies the first column of the input record that contains the value to read.
- end-column – specifies the last column of the input record that contains the value to read. If the value occupies only one column, omit end-column.
- .decimals – specifies the power of 10 by which to divide the value. An explicit decimal point in the input value overrides the .decimals specification, so .decimals is ignored for that value.
- @ – holds the input record for the execution of the next INPUT statement within the same iteration of the DATA step. This line-hold specifier is called trailing @.
- @@ – holds the input record for the execution of the next INPUT statement across iterations of the DATA step. This line-hold specifier is called double trailing @.
The trailing @ and @@ must be the last item in the INPUT statement.
The trailing @ prevents the next INPUT statement from automatically releasing the current input record and reading the next record into the input buffer. It is useful when you need to read from a record multiple times.
The double trailing @ is useful when each input line contains values for several observations.
Example 1: Reading Input Records with Column Input
```sas
data Students;
  input name $ 1-18 age 25-27 weight 30-32;
  datalines;
Joseph                  11   32
Mitchel                 13   29
Sue Ellen               14   27
;
run;
```
Example 2: Read Input Records Using Decimals
The .decimals argument inserts a decimal point into input values that do not already contain one.
If a value already contains an explicit decimal point, it is read unchanged. A value without a decimal point is divided by 10 raised to the .decimals power. With .2 in the step below, 400 is read as 4.00 and 9 is read as 0.09, while 26.77 and 3.2 are left as they are.
```sas
data product;
  input cost 1-5 .2;
  datalines;
26.77
21.00
400
3.2
12.56
9
;
run;
```
Example 3: Reading two types of Input data
This is an example that reads a file that contains two types of input data records and creates a SAS data set from these records.
One type of data record contains information about a particular college course. The second type of record contains information about the students enrolled in the course.
You will need two INPUT statements to read the two records and to assign the values to different variables that use different formats.
Records that contain course information begin with the word Course in column 1, and records that contain student information begin with the word Student in column 1, as shown below:
```
----+----1----+----2----+
Course  HISTORY  Watson
Student Williams 0459
Student Flores   5423
Course  MATHS    Sen
Student Lee      7085
```
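```sas
data students(drop=type);
  /* Keep the course values across iterations so each student row carries them. */
  retain Course Professor;
  /* Read the record type and hold the record with a trailing @. */
  input type $7. @;
  if type='Course' then input course $ professor $;
  else if type='Student' then do;
    /* Student is read from columns 8-17, so the names must stay within them. */
    input Student $10. Id;
    output students;
  end;
  datalines;
Course  HISTORY  Watson
Student Williams 0459
Student Flores   5423
Course  MATHS    Sen
Student Lee      7085
;
run;
```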
Example 4: Holding a Record across Iterations of the DATA Step
This example shows how to create multiple observations for each input data record. Each record contains several NAME and AGE values.
The DATA step reads a NAME value and an AGE value, writes an observation, and then reads another set of NAME and AGE values to output, and so on until all the input values in the record are processed.
```sas
data test;
  input name $ age @@;
  datalines;
Joseph 13 Mitchel 12 Sue 15
Stephen 10 Marc 22 Lily 17
;
run;
```
When to use Column Input in SAS?
If each variable's values always occupy the same columns in every data line, you can use column input, as long as all the values are character or standard numeric.
Standard numeric data contains only numerals, decimal points, plus and minus signs, and E representing scientific notation.
Numbers with embedded commas, and dates, are not standard numeric data, so column input cannot read them.
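When a fixed-column file does contain such values, one common alternative is to read the standard fields with column input and the non-standard fields with formatted input in the same INPUT statement. The sketch below is only an illustration: the data set, variable names, column positions, and data values are hypothetical, and it assumes the amounts fit the COMMA9. informat and the dates are written as mm/dd/yyyy.

```sas
data sales;
  /* region is standard character data, so column input is enough. */
  /* amount has embedded commas and saledate is a date, so both are */
  /* read with informats at assumed positions (columns 10 and 20).  */
  input region $ 1-8
        @10 amount comma9.
        @20 saledate mmddyy10.;
  format amount comma9. saledate date9.;
  datalines;
East     1,234,567 01/15/2024
West        45,300 12/03/2023
;
run;
```

The @10 and @20 column pointers keep the fixed-position style of column input while letting the informats strip the commas and convert the dates to SAS date values.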