8 Ways to count the number of observations in a SAS dataset

8 Ways to count the number of observations in a SAS dataset and pass it into a macro variable

SAS data sets consist of a descriptor portion and a data portion that contains the data values. The descriptor portion of a SAS data set holds the detailed information about the data set. This information includes:

  • Dataset name and its member type
  • Creation time of the dataset
  • The number of observations
  • The number of variables
  • The engine type

We often need to count the number of observations in a SAS dataset and pass it to a macro variable. Reading the descriptor portion is one of the quickest and most efficient ways to determine the number of observations in a SAS data set.

In this post, we will see various methods to count the number of rows (records) in a SAS dataset.

1. Using PROC SQL

proc sql;
	select count(*) into :cnt from sashelp.class;
quit;

%put &cnt.;

The PROC SQL method is not considered efficient because it does not use the metadata information of the SAS dataset. Instead, it reads through each record (row) of the dataset, which costs processing time. However, it is simple to understand and write, and it works well for smaller datasets.
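One thing the counting query can do that the metadata-based methods cannot is count a subset of rows. The sketch below is a minimal illustration under a few assumptions: the macro variable names cnt_all and cnt_f are made up for this example, and the TRIMMED keyword (available in SAS 9.3 and later) is used to store the value without leading blanks.

```sas
proc sql noprint;
	/* Total number of rows; TRIMMED strips leading and trailing blanks */
	select count(*) into :cnt_all trimmed from sashelp.class;

	/* Number of rows that satisfy a condition */
	select count(*) into :cnt_f trimmed from sashelp.class
	where sex = 'F';
quit;

%put All rows = &cnt_all., rows with sex F = &cnt_f.;
```

NOPRINT only suppresses the printed result. The macro variables are still created.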

2. Using the END= option

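data _null_;
	set sashelp.class end=eof;
	/* Sum statement: count is retained and incremented for every observation */
	count+1;
	/* CALL SYMPUTX stores the numeric value without leading blanks */
	if eof then
		call symputx("nobs", count);
run;
%put &nobs;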

This method is also not efficient. It reads the entire data set and increments a counter, and the value of the counter on the last observation is the number of rows. The END= option names a temporary variable (eof here) that is set to 1 when the SET statement reads the last observation and is 0 otherwise.

3. Using the NOBS= option in a Data Step

data _null_;
	set sashelp.class nobs=obs;
	call symputx('nobs', obs);
run;

%put &nobs.;

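NOBS= is an option of the SET statement. It names a temporary variable (obs here) that holds the total number of observations in the input dataset. The value is assigned at compile time from the descriptor portion, so no rows need to be counted. Note that this step still reads every observation and calls SYMPUTX each time, and that the CALL SYMPUTX never runs if the dataset is empty.

CALL SYMPUTX is a DATA step call routine that assigns a value produced in a DATA step to a macro variable, removing leading and trailing blanks from the value.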
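The difference between CALL SYMPUT and CALL SYMPUTX is easiest to see side by side. This is a small sketch, not part of the original methods. The variable n and the macro variable names padded and stripped are made up for the illustration.

```sas
data _null_;
	n = 19;
	/* CALL SYMPUT converts the number to character first, which leaves
	   leading blanks and writes a conversion note to the log */
	call symput('padded', n);
	/* CALL SYMPUTX strips leading and trailing blanks */
	call symputx('stripped', n);
run;

/* The brackets make any blanks visible in the log */
%put [&padded.] [&stripped.];
```

For counts that are later used in titles, file names or %IF comparisons, the stripped value is usually the one you want.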

4. Using IF 0 and STOP statement

data _NULL_;
	if 0 then
		set sashelp.class nobs=n;
	call symputx('totobs', n);
	stop;
run;
%put no. of observations = &totobs;

During the compilation phase, the DATA step reads the descriptor portion of the data set named in the SET statement, and the NOBS= variable n is assigned the observation count at that point. During execution, a SET statement normally reads the observations from the input data set in sequential order. Because the condition in if 0 then set is always false, the SET statement never executes and no observations are read.

The IF-THEN statement is therefore there only so that the compiler sees the SET statement, pulls the header information of the data set, and sets up the PDV.

If 0 Then Set can also be coded as If (1=2) Then Set.

Because the SET statement never executes, the step never reaches end-of-file on its own. The STOP statement ends the step explicitly after the first iteration.
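A practical benefit of this pattern is that it also works for an empty dataset, because the CALL SYMPUTX does not depend on any observation being read. The sketch below assumes a scratch dataset named work.empty, created here only for the demonstration.

```sas
/* Create a copy of SASHELP.CLASS with the same variables but no rows */
data work.empty;
	set sashelp.class(obs=0);
run;

data _null_;
	/* Never executed, but the compiler still fills n from the descriptor */
	if 0 then
		set work.empty nobs=n;
	call symputx('empty_obs', n);
	stop;
run;
%put Observations in WORK.EMPTY = &empty_obs.;
```

The count for work.empty should come back as 0. A step that executes the SET statement before the CALL SYMPUTX would stop at end-of-file before the macro variable is created.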

5. Proc SQL Dictionary Method

proc sql noprint;
 select nobs into :totobs separated by ' ' from dictionary.tables
 where libname='SASHELP' and memname='CARS';
quit;
%put total records = &totobs.;

The metadata information of a dataset can be accessed through DICTIONARY.TABLES in PROC SQL. It is an efficient method because it does not read each row of the dataset to determine the count.

LIBNAME refers to the name of the library in which the data is stored, and MEMNAME refers to the SAS table (dataset). Both values are stored in uppercase, so the comparison values must be uppercase too. The separated by ' ' clause is used here to remove the leading blanks from the numeric value.
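Because the library and member names must match in uppercase, it helps to wrap them in %UPCASE when they come from macro variables. This is a minimal sketch. The macro variables lib and ds are hypothetical names introduced for the example, and TRIMMED (SAS 9.3 and later) is used as an alternative way to drop the leading blanks.

```sas
%let lib = sashelp;
%let ds  = cars;

proc sql noprint;
	/* Double quotes so the macro calls resolve inside the string literals */
	select nobs into :totobs trimmed
	from dictionary.tables
	where libname = "%upcase(&lib.)" and memname = "%upcase(&ds.)";
quit;
%put total records in &lib..&ds. = &totobs.;
```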

6. Using the Library table – SASHELP.VTABLE

data _null_;
	set sashelp.vtable;
	where libname="SASHELP" and memname="CLASS";
	Num_obs=Nobs-Delobs;
	put "Nobs - Delobs: num_obs = " num_obs;
	call symputx('obs_cnt',num_obs);
run;
%put &obs_cnt;

SAS library tables contain metadata about your data sets. You can get this information from SASHELP.VTABLE by specifying the library name and the name of the data set.

The two variables you need from the table are Nobs, which contains the total number of observations including those marked for deletion, and Delobs, which contains only the number of observations marked for deletion.

This method has an advantage over the methods described so far. If you (or other people) modify a data set in place, observations that have been marked for deletion are still counted by the NOBS= option of the SET statement. Subtracting Delobs from Nobs gives the number of observations that actually remain.
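The effect of deleted observations can be reproduced with a scratch copy of the data. The sketch below assumes a hypothetical dataset work.class_copy and deletes rows in place with PROC SQL, which marks them for deletion rather than physically removing them.

```sas
/* Work on a copy so SASHELP.CLASS is left untouched */
data work.class_copy;
	set sashelp.class;
run;

/* DELETE marks the rows as deleted in place */
proc sql;
	delete from work.class_copy where sex = 'M';
quit;

/* NOBS= still reports the physical count, deleted rows included */
data _null_;
	if 0 then
		set work.class_copy nobs=n;
	call symputx('physical', n);
	stop;
run;

/* Nobs - Delobs gives the rows that remain */
data _null_;
	set sashelp.vtable;
	where libname = "WORK" and memname = "CLASS_COPY";
	call symputx('remaining', nobs - delobs);
run;
%put physical = &physical. remaining = &remaining.;
```

The two values in the log should differ by the number of rows that were deleted.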

7. Using PROC SQL automatic variable – SQLOBS

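proc sql;
	select * from sashelp.class;
quit;
%put &sqlobs.;

PROC SQL automatically creates the SQLOBS macro variable when it runs an SQL statement. SQLOBS holds the number of rows processed by the last PROC SQL statement executed. The NOPRINT option is left out here because, when NOPRINT is used and the query creates no output table or macro variable list, SQLOBS is set to 1 instead of the row count. PROC SQL is also ended with QUIT rather than RUN.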
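SQLOBS is most useful when you want the number of rows a query produced, not the size of a whole dataset. The following sketch writes the query result to a table so that nothing is printed. The table name work.class_f and the macro variable rows_f are hypothetical.

```sas
proc sql noprint;
	/* When an output table is created, SQLOBS holds its row count */
	create table work.class_f as
	select * from sashelp.class
	where sex = 'F';
quit;

/* Copy the value right away: the next SQL statement overwrites SQLOBS */
%let rows_f = &sqlobs.;
%put Rows written to WORK.CLASS_F = &rows_f.;
```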

8. Using Data Access Functions

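%macro totobs(mydata);
    /* Keep the helper variables local so they do not leak into the caller */
    %local mydataID NOBS RC;
    /* Open the dataset in input mode and read NOBS from its descriptor */
    %let mydataID=%sysfunc(OPEN(&mydata.,IN));
    %let NOBS=%sysfunc(ATTRN(&mydataID,NOBS));
    %let RC=%sysfunc(CLOSE(&mydataID));
    &NOBS
%mend;
%put %totobs(sashelp.cars);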
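The macro above returns the count as text, so it can be used anywhere a number is expected. A slightly more defensive variant is sketched below. It is an illustration, not the original author's macro: the name count_obs is hypothetical, it checks whether OPEN succeeded, and it reads the NLOBS attribute, which counts only observations that are not marked for deletion.

```sas
%macro count_obs(mydata);
    %local dsid n rc;
    /* Default to missing if the dataset cannot be opened */
    %let n = .;
    %let dsid = %sysfunc(open(&mydata., i));
    %if &dsid. > 0 %then %do;
        /* NLOBS = logical observations (deleted ones excluded) */
        %let n  = %sysfunc(attrn(&dsid., nlobs));
        %let rc = %sysfunc(close(&dsid.));
    %end;
    %else %put WARNING: could not open &mydata.: %sysfunc(sysmsg());
    &n.
%mend count_obs;

%put Logical observations = %count_obs(sashelp.cars);
```

OPEN returns 0 when the dataset cannot be opened, and ATTRN can return -1 when the engine cannot determine the count (for example for some views), so callers may want to check for those values.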

Subhro
