A Comprehensive Guide To PROC SQL In SAS (15 + Examples)

PROC SQL is a SAS Procedure that combines the functionality of DATA and PROC steps into a single step.

PROC SQL can sort data, create summaries of data, subsetting, join (merge), concatenate datasets, create new or calculated variables, print the results, create a new table, or view all in a single step.

PROC SQL in SAS can be used to retrieve, update, and report on information from SAS data sets or other database products.

The SELECT Statement and its Clauses

The purpose of a SELECT the statement is to retrieve data from underlying tables. Although it supports multiple clauses, the SELECT statement has only one clause which is required to be specified – FROM clause. All the remaining clauses are optional and can be used when needed.

To prevent syntax errors from occurring when using the SELECT statement, the clauses must be specified in the correct order as below.

PROC SQL options;
 SELECT column(s)
  FROM table-name | view-name
   WHERE expression
    GROUP BY column(s)
     HAVING expression
      ORDER BY column(s);

Example

proc sql;
 select Origin, Make, Model, DriveTrain, MPG_CITY, count(Model) as ModelCount 
  from sashelp.cars where Drivetrain='All' group by Make having MPG_CITY > 20 
  order by origin, make;
quit;

Ordering outputs by column

The PROC SQL can impose order in a table using the ORDER BY clause. It orders the query results according to the values in one or more selected columns specified after the FROM clause.

Rows of data can be ordered in ascending (default) or descending (DESC) for each column specified.

proc sql;
 select * from sashelp.class order by sex, 3, weight desc;
quit;

A comprehensive guide to PROC SQL in SAS (15 + Examples)

In the above example, the output is first arranged by sex, then within sex by height(3 specifies the column ordinal position), and within height in descending order by weight.

Grouping data using summary functions

The GROUP BY clause is used to aggregate and order data groups using a designated column of the same value.

proc sql;
 select min(low) as Lowest, max(high) as highest, stock from sashelp.stocks 
  group by stock;
quit;

Grouping data and sorting

Without the ORDER BY clause, the SQL procedure automatically sorts the results from a GROUP BY query in the same order as specified in the GROUP BY clause.

No additional processing occurs to satisfy the request when the ORDER BY and GROUP BY are specified for the same column.

The Group BY clause first sorts the result in the grouping column and then aggregates the query rows by the same grouping column.

If the columns specified in the ORDER BY and GROUP BY are not the same, additional processing may be required, which involves remerging summary statistics with the original data.

proc sql;
 select min(low) as Lowest, max(high) as highest, stock from sashelp.stocks 
  group by stock order by lowest;
quit;

Subsetting groups with the having clause

Using the combination of GROUP BY and HAVING clause, aggregated data can be filtered out from one group instead of one row at a time.

Suppose you would like to find only those product grouping with an average price greater than 100 from the price data table.

A HAVING clause restricts the number of selected groups and is always performed after the GROUP BY clause.

proc sql;
 select productname, avg(price) from sashelp.pricedata group by productname 
  having avg > 100;
quit;

Finding duplicate and Unique values

To select only one of several identical values in a table, SAS supports and processes the DISTINCT keyword and the UNIQUE constraint.

proc sql;
 select distinct make from sashelp.cars;
 run;

To find the count of unique values, you can use the distinct and count function together as below.

proc sql;
 select count(make), count(distinct make) from sashelp.cars;
 run;

The main difference between UNIQUE and DISTINCT in PROC SQL is that Unique ensures that all the values in a column are different.

In contrast, Distinct removes all the duplicate records when retrieving the records from a table. However, Unique can also retrieve unique records from a table.

proc sql;
 create table subjects(subjid int unique , subject_name varchar(255), 
  subject_age int);
quit;

Since the Unique constraint is defined for Subjid, we cannot insert the same value for subjid.

proc sql;
 insert into subjects values (8, "Subject1", 37);
quit;

proc sql;
 insert into subjects values (8, "Subject1", 37);
quit;

ERROR:Add/Update failed for data set WORK.SUBJECTS because data value(s) do not comply
with integrity constraint UN0001.

Counting the number of missing Values

The <a class="rank-math-link" href="https://9to5sas.com/sas-numeric-functions/#NMISS_function">NMISS</a> function of PROC SQL in SAS is used to find the count of missing values in a column. The advantage of this function is that it works both for the character and numeric data types.

select count(*) as count, nmiss(agechddiag) as MissingCount from sashelp.heart;
quit;

Character Operations and Functions

Character operations are used with character data. You can also use most data step character functions within PROC SQL.

The below example uses the || – Contanation operator to join two rows, Make and Model.

proc sql outobs=5;
 select Make || "," || Model as MakeAndModel from sashelp.cars;
quit;

Finding the length of string using Proc SQL

The LENGTH function is used to find the length of the character string in Proc SQL. It returns the number equal to the number of characters in the argument.

proc sql outobs=5;
 select model, length(Model) as Modellength from sashelp.cars;
quit;

Finding the occurrence of a pattern using the Index

To find the occurrence of a pattern, the <a class="rank-math-link" href="https://9to5sas.com/character-function-in-sas/#INDEX">INDEX</a> function is used. The character string passed in the argument is searched from left to right for the first occurrence of the specified value.

If the string is found, the column position of the first character is returned. Otherwise, a value of 0 is returned.

In the below example, we can find cars with a manual transmission by writing the following code. Also, note that the search is case-sensitive.

proc sql outobs=5;
 select make, model from sashelp.cars where index(model, 'manual') > 0;
quit;

Extracting information from a string

Sometimes we may need to extract specific information from a string. We can use the <a class="rank-math-link" href="https://9to5sas.com/character-function-in-sas/#SUBSTR">SUBSTR</a> function with a character column by specifying a starting position and the number of characters to extract.

We can extract the first and last names from the name column in the following example by writing the following code.

proc sql outobs=5;
 select name,team, 
 substr(name,index(name,',')+1,length(name)) as FirstName,
 substr(name,1,index(name,',')-1) as LastName 
  from sashelp.baseball;
quit;

For extracting the first name, the INDEX function is used with SUBSTR to find the position of “,” in the Name column.

The value returned is then passed as the second argument to the SUBSTR function specifying the character’s starting position to be extracted.

Similarly, for extracting the last name, the INDEX function is used in the last argument of the SUBSTR function, which will return the ending position of “,”.

Phonetic matching – Sounds like operator (=*).

With sounds like operator, we can find names that sound alike or have spelling variations. This is performed using the Soundex algorithm.

proc sql;
 select name,team,substrn(name,index(name,',')+1,length(name)) as 
 FirstName  from sashelp.baseball
  where calculated firstname =* ' Andre'
;
quit;

Finding the first non-missing value

The COALESCE function is used to find the first non-missing value in a column or list. It scans the argument from left to right and returns the first non-missing or null value. If all the values are missing, the result is also missing.

data custDetails;
 input custid Mobile Landline;
 datalines;
001 9800231356 9800231356
002 8800230356 .
003 . 9733231250
;
run;

proc sql;
 select custid, coalesce(mobile, landline) format=best32. as Phone from 
  custDetails;
quit;

In the above example, we are instructing SAS to show the Landline Number of the customer; if the Mobile number column is missing, show the Mobile number.

Using the COALESCE function, you can process these data types with a single function call instead of writing multiple IF-THEN/ELSE statements.

Producing a ROW number

A unique feature of PROC SQL in SAS is the ability to obtain the row count using the MONOTONIC() function, which is similar to the _n_ automatic variable in a data step.

proc sql outobs=5;
 select monotonic() as Row_Numbers, * from sashelp.class;
 run;

A row number can also be produced with the documented SQL procedure NUMBER. Unlike the MONOTONIC(), the NUMBER option does not create a new column in the table.

proc sql outobs=5 number;
 select * from sashelp.class;
 run;

Replicate a data set without data

Using PROC SQL in SAS, it is very easy to create a new empty data set while keeping all the structure of the original data set.

proc sql;
 create table class1 like sashelp.class;
quit;

The Takeaway:

So, this was our side on Proc SQL in SAS. We hope that you must have found it useful. Moreover, if you have other suggestions, mention them in the comment section below. We would take those lists in our further blog post. Thanks for reading!

If you liked this article, you might also want to read Everything You Need to Know About SQL SET Operators and Intermediate Proc SQL Tutorials with Examples.

Do you have any tips to add? Let us know in the comments.

Please subscribe to our mailing list for weekly updates. You can also find us on Instagram and Facebook.

A comprehensive guide to PROC SQL in SAS (15 + Examples)

The SELECT Statement and its Clauses

Ordering outputs by column

Grouping data using summary functions

Grouping data and sorting

Subsetting groups with the having clause

Finding duplicate and Unique values

Counting the number of missing Values

Character Operations and Functions

Finding the length of string using Proc SQL

Finding the occurrence of a pattern using the Index

Extracting information from a string

Phonetic matching – Sounds like operator (=*).

Finding the first non-missing value

Producing a ROW number

Replicate a data set without data

Subhro

Leave a ReplyCancel reply

The SELECT Statement and its Clauses

Ordering outputs by column

Grouping data using summary functions

Grouping data and sorting

Subsetting groups with the having clause

Finding duplicate and Unique values

Counting the number of missing Values

Character Operations and Functions

Finding the length of string using Proc SQL

Finding the occurrence of a pattern using the Index

Extracting information from a string

Phonetic matching – Sounds like operator (=*).

Finding the first non-missing value

Producing a ROW number

Replicate a data set without data

Subhro

Leave a ReplyCancel reply

You Might Also Like

Descriptive Statistics in SAS with Examples

SAS Loops Explained

Modifying SAS Dataset using SAS Modify Statement