Base SAS Programming - Fundamentals

This post is a comprehensive guide to SAS programming. It covers the basics of the SAS language, including data management, data manipulation, and various data analysis tasks.

Additionally, we will introduce the key components of SAS, such as SAS programs, SAS libraries, SAS files, SAS datasets, and SAS outputs. Lastly, the post discusses the properties and types of SAS files and datasets, along with their attributes and purposes.

What is SAS Programming?

SAS programming is the use of the SAS language for data management, data manipulation, and data analysis. A SAS program consists of lines of SAS code that instruct the SAS system how to read, view, manipulate, or analyze data and how to produce output.

What is the prerequisite for learning SAS Programming?

The only prerequisite for learning SAS programming is a basic knowledge of statistics and SQL, which will help you understand SAS more effectively.

Components of SAS

The key components of SAS are the following:

SAS PROGRAMS
SAS LIBRARIES
SAS FILES
SAS DATASETS
SAS OUTPUTS

SAS Programs

SAS programs are widely used for accessing, analyzing, managing, or presenting data. A SAS program is a sequence of DATA steps and PROC (procedure) steps.

Data Steps

The primary method for creating a SAS dataset is using the data step. The data step begins with the DATA statement and contains other programming statements. You can use programming statements to manipulate the existing SAS dataset or create a new dataset from raw data files.
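As a minimal sketch of a DATA step, the example below builds a new data set from the SASHELP.CLASS sample table. The output name work.class_metric and the variable height_cm are made up for illustration.

```sas
/* Create a new data set from an existing one.               */
/* work.class_metric and height_cm are illustrative names.   */
data work.class_metric;
    set sashelp.class;            /* read each observation in turn     */
    height_cm = height * 2.54;    /* HEIGHT in SASHELP.CLASS is inches */
run;
```

The DATA statement names the data set to create, SET names the input, and the assignment statement adds a new variable to every observation.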

Proc Steps

PROC steps are pre-defined procedures in SAS used to analyze and process data in a SAS data set. They are also used for data munging, that is, transforming data from one “raw” form into another format. PROC steps can be used to:

Create new SAS data sets.
List, sort, and provide summaries of data.
Generate statistical results.
Generate plots and charts.
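A few of the tasks listed above can be sketched with three common procedures. The data set name work.class_sorted is an assumption chosen for this example.

```sas
/* Sort a copy of the sample data by age. */
proc sort data=sashelp.class out=work.class_sorted;
    by age;
run;

/* List the first five observations of the sorted copy. */
proc print data=work.class_sorted (obs=5);
run;

/* Summarize two numeric variables. */
proc means data=work.class_sorted mean min max;
    var height weight;
run;
```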

SAS Libraries

Libraries are collections of files that are accessible by the SAS system. A library can be a physical or logical collection of files. SAS datasets are stored in SAS Libraries. There are 2 types of SAS Libraries.

Temporary Libraries
Permanent Libraries

Temporary Libraries

Temporary libraries in SAS are volatile, meaning any SAS files or datasets stored in the temporary library will be available only for the current SAS session and removed once the SAS session is closed. By default, SAS creates files in a temporary library known as WORK.

Permanent Libraries

A permanent SAS library refers to files stored on disk or another persistent storage medium. These files or datasets are not deleted when the SAS session terminates. You can work with the files in a permanent SAS library by specifying the libref as the first part of a two-level SAS filename.
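The difference between the two kinds of library can be sketched as follows. The libref MYLIB and the folder path are placeholders, so substitute a folder you can write to.

```sas
/* Assign the libref MYLIB to a folder. The path is a placeholder. */
libname mylib "/path/to/your/folder";

/* Two-level name (libref.dataset): the file remains after the session ends. */
data mylib.class_copy;
    set sashelp.class;
run;

/* One-level name: by default the data set goes to WORK and is removed */
/* when the session ends.                                              */
data class_temp;
    set sashelp.class;
run;
```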

Predefined SAS Libraries

By default, SAS has several libraries that are listed below.

Sashelp

SASHELP is a Read-only permanent library that contains sample data and other files that control how SAS works.

Sasuser

SASUSER is a permanent library that contains SAS files, including the Profile catalog that stores your personal settings. You might not have Write access to the Sasuser directory when using SAS Studio or SAS University Edition. To verify whether you have Write access, run the code below.

proc options option=rsasuser;
run;

If PROC OPTIONS reports NORSASUSER, the Sasuser folder is writable. If it reports RSASUSER, the Sasuser folder is read-only.

Work

WORK is a temporary library for files that do not need to be saved from session to session. You can also define additional libraries. When you define a library, you indicate the location of your SAS files to SAS. After you define a library, you can manage SAS files within it.
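One simple way to see which files a library currently holds is PROC CONTENTS with the _ALL_ keyword. This is only a sketch, and the output depends on what your session has created so far.

```sas
/* List the members of the WORK library.                     */
/* NODS suppresses the detailed listing for each data set.   */
proc contents data=work._all_ nods;
run;
```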

SAS FILES

The SAS System creates and uses various structured files called SAS files. These files are stored in the SAS data libraries. Learn more about SAS Libraries in the article Working with SAS libraries.

Types of SAS Files

SAS files stored in SAS data libraries are referred to as library members. Each member has a member type. The SAS System uses unique file extensions to differentiate between SAS files and external Windows files in a folder.

The extensions and member types are listed in the table below.

File Extension	Member Type	Description
.sas	.sas	SAS program
.lst	.lst	Procedure output
.log	.log	SAS log file
.sas7bdat	DATA	SAS data file
.sas7bndx	INDEX	Data file index; not treated by the SAS System as a separate file
.sas7bcat	CATALOG	SAS catalog
.sas7bpgm	PROGRAM	Stored program (DATA step)
.sas7bvew	VIEW	SAS data view
.sas7bacs	ACCESS	Access descriptor file
.sas7baud	AUDIT	Audit file
.sas7bfdb	FDB	Consolidation database
.sas7bmdb	MDDB	Multi-dimensional database
.sas7bods	SASODS	Output delivery system file
.sas7bdmd	DMDB	Data mining database
.sas7bitm	ITEMSTOR	Item store file
.sas7butl	UTILITY	Utility file
.sas7bput	PUTILITY	Permanent utility file
.sas7bbak	BACKUP	Backup file

SAS Datasets

A data set is one of the types of SAS files. Data sets must have a name. Valid data set names can be 1 to 32 characters long and must begin with a letter or an underscore.

A SAS data set has two parts:

Descriptor portion
Data portion

Descriptor portion

A SAS data set consists of two parts: a descriptor portion and a data portion. A SAS data set can also point to indexes, which enable SAS to locate rows in the data set more efficiently.

Once the data is in the form of a SAS data set, there is no need to specify the attributes of the data set or the variables in your program statements. SAS can obtain the information directly from the data set.

The descriptor portion of a SAS data set contains information about the data set, including the following:

Name of the data set
Date and time that the data set was created
Number of observations
Number of variables
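The items listed above can also be queried directly. As a sketch, the DICTIONARY.TABLES view available in PROC SQL exposes them as columns. Library and member names are stored in uppercase there, so the WHERE clause uses uppercase literals.

```sas
/* Read descriptor information for SASHELP.AIR from DICTIONARY.TABLES. */
proc sql;
    select memname,     /* name of the data set        */
           crdate,      /* creation date and time      */
           nobs,        /* number of observations      */
           nvar         /* number of variables         */
        from dictionary.tables
        where libname = 'SASHELP' and memname = 'AIR';
quit;
```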

Data Portion

The data portion is the collection of data values, arranged as a table of rows and columns.

Observation (Rows) in SAS

Observations (also called rows) in a SAS data set are collections of data values that usually relate to a single object.

SAS Variables (Columns)

Variables (also called columns) in a data set are collections of values that describe a particular characteristic. Each variable in a data set has a series of attributes.

Variable Attributes

The descriptor portion also contains the attributes of each variable in the SAS data set: its name, type, length, format, informat, and label.

You can list the attribute information in the descriptor portion of a SAS data set, such as SASHELP.AIR, with the CONTENTS procedure.
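One way to produce such a listing for SASHELP.AIR is PROC CONTENTS. The exact layout of the report depends on your SAS version and output destination.

```sas
/* Print the descriptor portion of SASHELP.AIR, including */
/* each variable's type, length, format, and label.       */
proc contents data=sashelp.air;
run;
```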

Variable Names

A variable name conforms to the SAS naming conventions and follows the same rules as SAS data set names.

Rules for Variable Names

They can be 1 to 32 characters long.
They must begin with a letter (A-Z, either uppercase or lowercase) or an underscore (_).
They can continue with any combination of numbers, letters, or underscores.

VALIDVARNAME= System Option

The VALIDVARNAME system option is set to V7 (letters of the Latin alphabet, numerals, or underscores) by default.

If you would like to use characters other than the valid ones, you must specify VALIDVARNAME=ANY.

If the name includes a percent sign (%) or an ampersand (&), enclose the name literal in single quotation marks so that the macro facility does not try to resolve it. Names that contain spaces must also be written as name literals.

Example: '% of profit'n = percent;

VALIDVARNAME specifies the rules for valid SAS variable names that can be created and processed during a SAS session.

Syntax, VALIDVARNAME=

VALIDVARNAME= V7|UPCASE|ANY

V7 specifies that variable names must follow these rules:

SAS variable names can be up to 32 characters long.
The first character must be a letter of the Latin alphabet (A – Z, either uppercase or lowercase) or an underscore (_). Subsequent characters can be letters of the Latin alphabet, numerals, or underscores.
Trailing blanks are ignored. The variable name alignment is left-justified.
A variable name cannot contain blanks or special characters except for an underscore.
A variable name can contain mixed-case letters. SAS stores and writes the variable name in the same case used in the first reference to the variable. However, when SAS processes a variable name, SAS internally converts it to uppercase. Therefore, you cannot use the same variable name with a different combination of uppercase and lowercase letters to represent different variables. For example, cat, Cat, and CAT all represent the same variable.
You cannot give variables the names of special SAS automatic variables (such as _N_ and _ERROR_) or variable list names (such as _NUMERIC_, _CHARACTER_, and _ALL_).

UPCASE specifies that the variable name follows the same rules as V7, except that the variable name is uppercase, as in earlier versions of SAS.

ANY specifies that SAS variable names must follow these rules:

The name can begin with or contain any characters, including blanks, national, special, and multi-byte characters.
The name can be up to 32 bytes long.
The name cannot contain any null bytes.
Leading blanks are preserved, but trailing blanks are ignored.
The name must contain at least one character. A name with all blanks is not permitted.
A variable name can contain mixed-case letters. SAS stores and writes the variable name in the same case used in the first reference to the variable. However, when SAS processes a variable name, SAS internally converts it to uppercase. Therefore, you cannot use the same variable name with a different combination of uppercase and lowercase letters to represent different variables. For example, cat, Cat, and CAT all represent the same variable.
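The effect of VALIDVARNAME=ANY can be sketched with a name literal. The data set work.profit and the variables revenue and cost are invented for this example.

```sas
/* Allow variable names that break the V7 rules. */
options validvarname=any;

data work.profit;
    revenue = 200;
    cost    = 150;
    /* Name literal: quoted string followed by the letter n. */
    '% of profit'n = (revenue - cost) / revenue * 100;
run;

proc print data=work.profit;
run;

/* Switch back to V7 naming rules. */
options validvarname=v7;
```

Single quotation marks are used around the name because it contains a percent sign.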

SAS Variable Types

Character Variables
Numeric Variables

Character variables can contain any value while numeric variables can contain only numeric values.

A missing value is represented by a blank in character variables and by a period in numeric variables.
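As a small sketch of both kinds of missing value, the step below assigns them explicitly and tests them with the MISSING function. All data set and variable names are illustrative.

```sas
data work.missing_demo;
    length name $ 8;
    name  = ' ';    /* missing character value: a blank */
    score = .;      /* missing numeric value: a period  */

    /* MISSING returns 1 when its argument is missing, otherwise 0. */
    name_is_missing  = missing(name);
    score_is_missing = missing(score);
run;

proc print data=work.missing_demo;
run;
```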

Length of SAS Variable

Character variables can be up to 32,767 bytes long, and numeric variables have a default length of 8 bytes.
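Lengths can be set explicitly with the LENGTH statement before a variable is first used. The following is a sketch with made-up variable names.

```sas
data work.length_demo;
    /* Declare character lengths before the first assignment. */
    length city $ 20 code $ 3;
    city   = 'Bengaluru';
    code   = 'BLR';
    amount = 1234.5;    /* numeric variable, default length of 8 bytes */
run;

/* The Len column of the report shows 20, 3, and 8. */
proc contents data=work.length_demo;
run;
```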

Format of SAS Variable

A format is a variable attribute that controls how data values are written or displayed.

SAS has character, numeric, and date and time formats. You can also create and store your own formats. Select the format that writes values in the form you need.

For example, to display the value 1234 as $1,234.00 in a report, you can use the DOLLAR9.2 format.

In a format such as DOLLARw.d, w is the maximum total width of the value to be written, and d is the number of decimal places.

For example, to display the value 5678 as 5,678.00 in the output, you can use the COMMA8.2 format, which specifies a width of 8, including two decimal places.
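Both formats mentioned above can be tried in a DATA _NULL_ step that writes to the log. The variable names price and units are illustrative.

```sas
data _null_;
    price = 1234;
    units = 5678;
    put price dollar9.2;    /* writes $1,234.00 to the log */
    put units comma8.2;     /* writes 5,678.00 to the log  */
run;
```

The width counts every character written, including the dollar sign, comma, and decimal point, which is why $1,234.00 needs a width of 9.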

Informat in SAS

Informats read data values in certain forms and convert them into standard SAS values. They determine how data values are read into a SAS data set.

You must use informats to read numeric values that contain letters or other special characters.

For example, the numeric value $12,345.00 contains two special characters, a dollar sign ($) and a comma (,).

In this case, you can use an informat to read the value while removing the dollar sign and comma and then store the resulting value as a standard numeric value.
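As a sketch, the COMMAw.d informat can read this value through the INPUT function. The variable names raw and amount are illustrative.

```sas
data _null_;
    raw = '$12,345.00';                /* 10 characters of text           */
    amount = input(raw, comma10.);     /* strips the $ and the comma      */
    put amount=;                       /* writes amount=12345 to the log  */
run;
```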

Read: Ultimate Guide to SAS Formats and Informats

SAS Variable Labels

A label is a variable attribute consisting of descriptive text up to 256 characters long.
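Labels are assigned with the LABEL statement. In this sketch, the data set name work.class_labeled and the label text are assumptions.

```sas
data work.class_labeled;
    set sashelp.class;
    label height = 'Height in inches'
          weight = 'Weight in pounds';
run;

/* The LABEL option tells PROC PRINT to show labels as column headings. */
proc print data=work.class_labeled (obs=5) label;
run;
```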

SAS Indexes

An index is a separate file that you can create for a SAS data file to provide direct access to specific observations.

The index file has the same name as its data file and a member type of INDEX. Indexes can provide faster access to specific observations, particularly when you have a large data set.

The purpose of SAS indexes is to optimise WHERE expressions and facilitate BY-group processing.
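One way to create a simple index is the INDEX CREATE statement in PROC DATASETS, sketched below on a copy of SASHELP.CLASS. The name work.class_idx is made up, and on a table this small SAS may decide not to use the index at all.

```sas
/* Work on a copy so the sample library is left untouched. */
data work.class_idx;
    set sashelp.class;
run;

proc datasets library=work nolist;
    modify class_idx;
    index create name;      /* simple index on the variable NAME */
quit;

/* A WHERE expression on the indexed variable is a candidate for index use. */
proc print data=work.class_idx;
    where name = 'Alice';
run;
```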

Extended attributes

Extended attributes are user-defined metadata that you attach to a data set or to a variable. For example, you can store a description of a variable or the formula used to produce the variable value or save a URL that specifies information about your data set.

The XATTR ADD statement, used in PROC DATASETS after a MODIFY statement, adds extended attributes to variables or data sets.

To store name-value pairs with a data set:

XATTR ADD DS attribute_name = attribute_value;

To store name-value pairs with a variable:

XATTR ADD VAR variable_name(attribute_name=attribute_value);
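Put together, the two statements might be used as in the sketch below. It assumes a SAS release that supports extended attributes (SAS 9.4 or later), and the attribute names source and unit are invented for illustration.

```sas
/* Work on a copy so the sample library is left untouched. */
data work.class_meta;
    set sashelp.class;
run;

proc datasets library=work nolist;
    modify class_meta;
    /* Data set level attribute. */
    xattr add ds source="Copied from SASHELP.CLASS";
    /* Variable level attributes. */
    xattr add var height (unit="inches");
    xattr add var weight (unit="pounds");
quit;

/* PROC CONTENTS reports the extended attributes with the other metadata. */
proc contents data=work.class_meta;
run;
```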

SAS Outputs

SAS output is the result of executing SAS programs. Most SAS procedures and some DATA step applications produce output.

There are three types of SAS output:

SAS Log file
SAS Procedure output file
SAS console log file

SAS Log file

The SAS log file is generated when a SAS program is executed. It contains information about the processing of SAS statements. As each program step executes, SAS writes notes and any applicable error or warning messages to the log.
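Your own programs can also write to the log. As a sketch, the PUTLOG statement below writes one line for each student older than 15 and a final count.

```sas
data _null_;
    set sashelp.class end=last;
    /* name= and age= write the variable name followed by its value. */
    if age > 15 then putlog 'NOTE: ' name= age=;
    /* _N_ is the automatic variable that counts DATA step iterations. */
    if last then putlog 'NOTE: read ' _n_ 'observations.';
run;
```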

SAS Procedure Output File

SAS sends the output to the procedure output file whenever a SAS program executes a PROC step. SAS procedure output is handled by the Output Delivery System (ODS).
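As a minimal sketch of ODS in use, the statements below route procedure output to an HTML file. The file path is a placeholder, and some environments (for example SAS Studio) may need a different location or the PATH= option.

```sas
/* Open the HTML destination. The path is a placeholder. */
ods html file="/path/to/report.html";

proc print data=sashelp.class (obs=5);
run;

/* Close the destination so the file is written completely. */
ods html close;
```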

SAS Console Log File

The console log is used when an error, warning or note has to be written to the SAS log, but the log is not available.

Conclusion

In conclusion, SAS programming is a powerful tool for data management, manipulation, and analysis. This article covered the basics of SAS, including the key components of SAS, such as SAS programs, SAS libraries, SAS files, SAS datasets, and SAS outputs.

We also discussed the prerequisites for learning SAS programming and the different types of SAS files. With this knowledge, you can get started on your journey to becoming a proficient SAS programmer.

Subhro
