Intermediate Proc SQL Tutorials with Examples

This article explains the data summarization by rows and columns and how you can use Predicates in PROC SQL to perform comparisons between two conditions or expressions.

Summarizing Data

The SQL procedure is a wonderful tool for summarizing (or aggregating) data. It provides several useful summary (or aggregate) functions to help perform calculations, descriptive statistics, and other aggregating operations in a SELECT statement or HAVING clause.

Proc SQL COUNT

The below example uses the COUNT function with the (*) argument to produce a total number of rows, whether data is missing or not.

The asterisk (*) is specified as the argument to the COUNT function to count all rows in the PURCHASES table.

proc sql;
select count(*) as row_count
from sashelp.shoes;
quit;

Unlike the COUNT(*) function syntax that counts all rows, whether data is missing or not, the following example uses the COUNT function with the (column-name) argument to produce a total number of non-missing rows based on the products column.

proc sql;
select count(product) as non_missing_row_count
from sashelp.shoes;
quit;

Proc SQL outobs and inobs options

You could use the PROC SQL options as an alternative to FIRSTOBS= and OBS=.

INOBS= is used to find out how many rows are to be processed.

OUTOBS= to decide how many rows to include in the report or output dataset.

proc sql inobs = 3 ;
  create table class_2 as
  select *
  from sashelp.class
  ;
quit ;
Intermediate Proc SQL Tutorials with Examples
It is critical to have a solid understanding of the timing involved in the selection, such as the code:
proc sql outobs = 3 ;
  create table class_1 as
  select *
  from sashelp.class
  where sex = 'M'
  ;
quit ;

proc sql inobs = 3 ;
  create table class_2 as
  select *
  from sashelp.class
  where sex = 'M'
  ;
quit ;

generates the following in the LOG:

Intermediate Proc SQL Tutorials with Examples
Intermediate Proc SQL Tutorials with Examples

This means that the INOBS= or OUTOBS= options are used to choose rows for the output dataset after the WHERE clause has chosen rows from an internal table.

Summarizing Data Down Rows

The SQL procedure can be used to produce a single aggregate value by summarizing data down rows (or observations).

Suppose you want to know the average sales for all Sport Shoes in the SASHELP.SHOES table containing a variety of products.

The following query computes the average product cost and produces a single aggregate value using the AVG function.

proc sql;
select avg(sales) as average_sales format=dollar10.2 from sashelp.shoes where
upcase(product) in ("SPORT SHOE");
quit;
Intermediate Proc SQL Tutorials with Examples

Summarizing data across columns

When a computation is needed on two or more columns in a row, the SQL procedure can be used to summarize data across columns.

Suppose you want to know the remaining inventory after-sales for each product. 

proc sql outobs=5;
	select product,sales,inventory, 
        (inventory-sales) as remaining_inventory format=dollar10. 
from sashelp.shoes;
quit;
Intermediate Proc SQL Tutorials with Examples

Predicates

Predicates are used in PROC SQL to perform direct comparisons between two conditions or expressions. Six predicates will be looked at:

  1. BETWEEN
  2. IN
  3. IS NULL
  4. IS MISSING
  5. LIKE
  6. EXISTS

Selecting a range of values

The IN predicate creates an OR condition between each value and returns a Boolean value of True if a column value equals one or more values in the expression list.

proc sql outobs=5;
select *
from sashelp.shoes
where upcase(product) = 'SANDAL';
quit;
Intermediate Proc SQL Tutorials with Examples

Testing for null or missing values

The IS NULL predicate selects one or more rows by evaluating whether a column amount is missing or null.

The IS MISSING predicate performs identically to the IS NULL predicate by selecting one or more rows containing a missing value (null).

data shoes;
set sashelp.shoes;
if _n_ in (4, 7, 8, 9) then
call missing(subsidiary);
run;

proc sql;
select * from shoes where subsidiary is null;
quit;
Intermediate Proc SQL Tutorials with Examples

The only difference is that specifying IS NULL is the ANSI standard way of expressing the predicate, and IS MISSING is commonly used in SAS.

See our guide on Working With Missing Values In SAS for more information on missing values in SAS.

Finding patterns in a string (pattern matching % and _)

Constructing specific search patterns in string expressions can be achieved with the LIKE Predicate.

The % (per cent sign) acts as a wildcard character representing any number of characters, including any combination of upper or lower case characters.

To find patterns in the product name (PRODUCT) containing the characters, ”Men’ in the first three positions followed by any number of characters is specified with the following WHERE clause.

proc sql outobs=5;
select *
from sashelp.shoes
where product like 'Men%';
quit;
Intermediate Proc SQL Tutorials with Examples

The wildcard character “%” precedes and follows the search word to select all products whose name contains the word “DRESS” in its name.

proc sql outobs=5;
select *
from sashelp.shoes
where upcase(product) like '%DRESS%';
quit;

The resulting output contains product types such as “Men’s Dress” and any other products containing the word “Dress”.

Intermediate Proc SQL Tutorials with Examples

LIKE predicate is used to check a column for the existence of trailing blanks.

% (Percent ) – The wildcard character % followed by a blank space is specified as the search argument.

_ (Underscore) –  The wildcard _ is used to search for a specific number of characters, and using the LIKE predicate, the underscore (_) provides a way to pattern match character-by-character.

Thus, a single underscore (_) in a specific position acts as a wildcard placement holder for that position only.

Two consecutive underscores (__) act as a wildcard placement holder for those two positions.

Three consecutive underscores act as a wildcard placement holder for those three positions. And so forth.

In the following example, the first position used to search product type contains the character “S”, and the following six positions (represented with six underscores) act as a placeholder for any value.

proc sql outobs=5;
select *
from sashelp.shoes
where upcase(product) like 'S______';
quit;
Intermediate Proc SQL Tutorials with Examples

The following example illustrates a pattern search of the product name (PRODUCT) where the first position is represented as a wildcard; the second position contains the lowercase character “o”, followed by any combination of uppercase or lowercase characters.

proc sql outobs=5;
select *
from sashelp.shoes
where product like '_o%';
quit;
Intermediate Proc SQL Tutorials with Examples

Testing for the existence of a value

The EXISTS predicate is used to test for the existence of a set of values.

In the following example, a subquery is used to check for the existence of customers in the CUSTOMERS table with purchases from the PURCHASES table.

proc sql;
select custnum, custname, custcity from customers c where exists
(select * from purchases p where c.custnum=p.custnum);
quit;
Intermediate Proc SQL Tutorials with Examples

The EXISTS condition is an operator whose right operand is a subquery. The result of an EXISTS condition is true if the subquery resolves to at least one row.

The result of a NOT EXISTS condition is true if the subquery evaluates to zero rows.

Download the entire code from this post here.

The Takeaway:

So, this was our side on some practical use of PROC SQL. We hope that you must have found it useful.

Moreover, if you have other suggestions, mention them in the comment section below. We would take those lists in our further blog post.

Thanks for reading!

If you liked this article, you might also want to read A Comprehensive Guide To PROC SQL In SAS (15 + Examples) and Everything You Need To Know About SQL SET Operators.

Do you have any tips to add? Let us know in the comments.

Please subscribe to our mailing list for weekly updates. You can also find us on Instagram and Facebook.

Every week we'll send you SAS tips and in-depth tutorials

JOIN OUR COMMUNITY OF SAS Programmers!

Subhro Kar is an Analyst with over five years of experience. As a programmer specializing in SAS (Statistical Analysis System), Subhro also offers tutorials and guides on how to approach the coding language. His website, 9to5sas, offers students and new programmers useful easy-to-grasp resources to help them understand the fundamentals of SAS. Through this website, he shares his passion for programming while giving back to up-and-coming programmers in the field. Subhro’s mission is to offer quality tips, tricks, and lessons that give SAS beginners the skills they need to succeed.