7 PROC SQL Options You Should Use to Debug Queries

PROC SQL is a powerful tool, but it can be challenging to debug queries. This blog post will discuss some Proc SQL options to debug queries.

1. INOBS= and OUTOBS= Options

INOBS – The INOBS= option limits the number of rows that PROC SQL processes. This option can be handy when developing a new query and testing on a limited number of records. 

proc sql inobs=5;
	select * from sashelp.heart;

The SAS log displays a warning when you use the INOBS option.

WARNING: Only 5 records were read from SASHELP.HEART due to INOBS= option.

OUTOBS – The OUTOBS= option restricts the number of rows PROC SQL displays or writes to a table. For example, if you specify OUTOBS=10 and insert values into a table using a query, then PROC SQL inserts 10 rows into the resulting table. OUTOBS= is similar to the SAS data set option OBS=.

proc sql outobs=6;
	select * from sashelp.heart;

The SAS log displays a warning when you use the OUTOBS option.

WARNING: Statement terminated early due to OUTOBS=6 option.

INOBS and OUTOBS might not appear to differ in a simple query. However, sometimes it is essential to choose the correct option. For example, the average of a column with INOBS=10 returns only 10 values.

2. Limiting Iterations with the LOOPS= Option

In PROC SQL, the LOOPS= option specifies how many iterations will be performed inside the inner loop. The limit can be set to prevent excessive computer usage. For example, joining three large tables without meeting the join-matching conditions could result in a sizeable internal table that would be inefficient to process.

proc sql loops=5;
	select p.prodnum, p.prodname, m.manunum, m.manuname, m.manucity from 
		sql2.products p, sql2.manufacturers m where p.manunum=m.manunum;
quit;

The SAS log displays a warning when you use the Loops= option.

WARNING: PROC SQL statement interrupted by LOOPS=5 option.

3. Checking Syntax with the NOEXEC Option and the VALIDATE Statement

NOEXEC – By using the NOEXEC option or the VALIDATE statement, you can verify the syntax of a PROC SQL step without executing it.

The NOEXEC option can only be used once in a PROC SQL statement. This option will ensure that the syntax of all queries in that PROC SQL step is checked for accuracy without executing them.

proc sql noexec;
	select p.prodnum, p.prodname, m.manunum, m.manuname, m.manucity from 
		sql2.products p, sql2.manufacturers m where p.manunum=m.manunum;
quit;

The SAS log displays the following message if the query is valid and all referenced columns and tables exist.

NOTE: Statement not executed due to NOEXEC option.

SAS displays the standard error messages in the log if there are any errors in the query.

When you invoke the NOEXEC option, SAS checks the syntax of all queries in that PROC SQL step for accuracy but does not execute them.

VALIDATE – SELECT statements are checked for accuracy by the VALIDATE statement before being executed. A message is written to the SAS log if the syntax is correct. An error message is displayed if the syntax is incorrect.

proc sql;
validate
select prodnum, p.prodname, m.manunum, m.manuname, m.manucity from 
		sql2.products p, sql2.manufacturers m where p.manunum=m.manunum;
quit;

The SAS log displays the following message if the query is valid.

NOTE: PROC SQL statement has valid syntax.

SAS displays standard error messages in the log if there are errors in the query.

VALIDATE affects only the SELECT statement immediately following it, whereas the NOEXEC option affects the entire SELECT statement. If you are working with a PROC SQL query containing multiple SELECT statements, specify the VALIDATE keyword before each SELECT statement you want to check.

4. SELECT * with the FEEDBACK Option

The FEEDBACK option expands a SELECT * statement into all the selected variables, including any macro variables. This can be very useful to show the variable order when joining multiple tables. The SAS log displays the results of the expanded list of columns you have selected. 

proc sql feedback;
            select * from sashelp.class;

Expanded SELECT * Statement

proc sql feedback option

5. Reset option

The RESET statement adds, drops, or changes one or more PROC SQL options without restarting the procedure. This example shows the Print option is turned on and then off.

proc sql noprint outobs=5;
	select * from sashelp.class;
	reset print;
	select * from sashelp.class;
quit;
7 PROC SQL Options You Should Use to Debug Queries

6. Timing PROC SQL with the STIMER Option

There are multiple ways to accomplish certain operations. For example, a join often operates like a subquery. The query that runs fastest is generally more efficient based on factors such as readability and maintenance. A SAS system option, STIMER shows you an entire procedure’s cumulative time.

PROC SQL STIMER shows how fast individual statements run in a PROC SQL step. You can optimize your query this way.

The example below compares the execution times of two queries. Both queries list the product names and manufacturer numbers from the products table with a manufacturer number of 500. The first query does this with a join; the second with a subquery. Comparing the run times of two queries shows the STIMER results from the SAS log.

proc sql stimer;
	select prodnum, p.prodname, m.manunum, m.manuname, m.manucity from 
		sql2.products p, sql2.manufacturers m where p.manunum=m.manunum and m.manunum=500;
	Select * from sql2.products where manunum in
(Select manunum from sql2.manufacturers where manunum=500);
Quit;

Log results:

7 PROC SQL Options You Should Use to Debug Queries
Comparing Run Times of Two Queries

A join query takes 0.01 seconds, while a subquery takes 0.05 seconds. A join runs faster than an equivalent subquery, regardless of many factors.

7. PROC SQL _METHOD Option

PROC SQL includes a powerful method option called METHOD. Since its implementation, many SAS ® SQL users have expressed very favourable comments for the value-added information it provides on the SAS Log. Furthermore, the METHOD option is well worth exploring due to the benefits of understanding the processes involved during specific PROC SQL operations, query evaluation, algorithm selection and use by the optimizer, or testing and debugging operations.

MSGLEVEL=I

By specifying MSGLEVEL= in an Options statement, users can control how much information the SAS System logs. MSGLEVEL= supports two possible values: N (default) for standard notes, warnings, and error messages; and I for details about sort, merge, and index processing.

Join algorithms

  • Nested Loop – The optimizer selects this algorithm when Sort-Merge, Index, and Hash are eliminated.
  • Sort-Merge – Used when the Index and Hash algorithms have been eliminated from consideration by the optimizer.
  • Index – The optimiser selects a user-defined index when performance will be enhanced.
  • Hash – Used by the optimizer when the smaller of the two tables can fit into the available memory

_METHOD

The _METHOD option returns several codes to the log. You can use these options to understand the choices made by the SQL optimizer, the index used, which joins to use and any sorting performed.

The SQL optimiser performs query operations using one of four join algorithms. The four algorithms include:

Code Description
SQXCRTA Create table as Select
SQZSLCT Select statement or clause
SQXISL Step loop join (Cartesian).
SQXIM Merge join operation.
SQXINDX Index join operation
SQXIHSH Hash join operation.
SQXSORT Sort operation.
SQXSRC Source rows from table.
SQXFIL Rows filtration.
SQXSUMG Summary stats (aggregates) with GROUP BY clause.
SQXSUMN Summary stats with no GROUP BY clause.
The _METHOD Option and Code Descriptions

Note: You need to specify the MSGLEVEL=I options to activate this setting.

These options can be used to debug any issues, for example, detect if the correct join strategy is being used, are my indexes being used in the join.

As part of the two-way equijoin, the _METHOD option is specified to show the processing hierarchy.

options msglevel=i;

proc sql _method;
	select p.prodnum, p.prodname, m.manunum, m.manuname, m.manucity from 
		sql2.products p, sql2.manufacturers m where p.manunum=m.manunum and 
		p.manunum=500;
quit;

The SQL optimizer selected a hash join algorithm based on the join query.

7 PROC SQL Options You Should Use to Debug Queries

8. METHOD AND _TREE OPTIONS WITH MSGLEVEL=I

The _TREE option provides a graphical representation of how the query optimizer executed the query. This option shows the query plan based on several factors (estimated CPU and I/O). There may be more efficient plans. Nevertheless, TREE can help debug.

options msglevel=i;
proc sql _method _tree;
	select p.prodnum, p.prodname, m.manunum, m.manuname, m.manucity from 
		sql2.products p, sql2.manufacturers m where p.manunum=m.manunum and 
		p.manunum=500;
quit;
7 PROC SQL Options You Should Use to Debug Queries

Every week we'll send you SAS tips and in-depth tutorials

JOIN OUR COMMUNITY OF SAS Programmers!

Subhro Kar is an Analyst with over five years of experience. As a programmer specializing in SAS (Statistical Analysis System), Subhro also offers tutorials and guides on how to approach the coding language. His website, 9to5sas, offers students and new programmers useful easy-to-grasp resources to help them understand the fundamentals of SAS. Through this website, he shares his passion for programming while giving back to up-and-coming programmers in the field. Subhro’s mission is to offer quality tips, tricks, and lessons that give SAS beginners the skills they need to succeed.