Count of missing and nonmissing values for each variable in a SAS data set

Count missing and Non-missing values for each variable – In SAS, we often need to get the count of missing and non-missing values in a SAS dataset. The code in this example uses PROC FORMAT to create the format for character and numeric variables to be either “non-missing” or “missing” and then use that format with PROC FREQ.

We have used SASHELP.HEART dataset as an example.

We then get a compact table showing us the number of missing and non-missing for the variable type.

Below are the steps to help you understand the working of this macro.

Step 1. Create a user-defined format for the group missing and non-missing values.

proc format;
value $missfmt ' '='Missing Values' other='Non-Missing';
value missfmt .='Missing Values' other='Non-Missing';
run;

Step 2.  Now, let’s create a macro variable with the count of variables in the dataset.

%let dsid=%sysfunc(open(SASHELP.HEART));   /*Open the dataset*/
%let cntvar=%sysfunc(attrn(&dsid, nvars)); /* The value of cntvar is 17*/

Step 3. Get the list of all variables with their type. I have used the PROC CONTENTS procedure to get the variable names and their type in a SAS dataset.

proc contents data=SASHELP.HEART varnum out=var(keep=name type) noprint;
run;

Below is the results of the output dataset generated using the PROC CONTENT Procedure.

Results of PROC CONTENTS
Results of PROC CONTENTS

Step 4. Now, we can create macro variables for each value of the SAS dataset.

data _null_;
		set var;
		suffix=put(_n_, 5.);
		call symputx(cats('Name', suffix), Name);
		call symputx(cats('Type', suffix), Type);
run;

Step 5. The next step is to find the count of missing and non-missing values using the PROC FREQ procedure.

For simplicity, we have taken only one variable from the dataset.

Notice the /missing keyword. This tells SAS to generate the frequency count of missing values as well.

PROC FREQ

Step 6. The next step is to apply the user-defined format to group the missing and non-missing values.

%macro test;
	proc freq data=SASHELP.HEART;
		tables AgeAtDeath /missing nopercent nocum nofreq nopercent 
			out=out1(drop=percent rename=(AgeAtDeath=value));
		format AgeAtDeath 
		%if type=2 %then
			%do;
				$missfmt. %end;
		%else
			%do;
				missfmt. %end;
		;
	run;
%mend;
%test;
Missing values

Step 7. Enclose the above steps inside a macro, and a do loop to get the variables’ data. The results will be as below.

PROC FREQ

Step 8. Now, add the variable name as a value in each dataset to combine and create a single dataset.

data out&i; 
set out&i;
length varname $32.; 
varname="&&name&i";

Step 9. The below step combines all the datasets using the set statement.

data final;
set %do i=1 %to &cntvar;
out&i 
%end;
run;
Combined dataset

The combined dataset looks as below.

Step 10. Now, reshape the combined dataset using PROC TRANSPOSE.

proc transpose data=final out=combine(drop=_:);
		by varname;
		id value;
		var count;
	run;
sas count missing values for each variable

here is the final code…

data hearts;
	set sashelp.heart;
run;

proc format;
value $missfmt ' '='Missing Values' other='Non-Missing';
value missfmt .='Missing Values' other='Non-Missing';
run;

%macro varCounts (dsn=);
%let dsid=%sysfunc(open(&dsn));
%let cntvar=%sysfunc(attrn(&dsid, nvars));
%put &dsid;
%put &cntvar;
proc contents data=&dsn varnum out=var(keep=name type) noprint;
run;

proc print data=var;
data _null_;
set var;
suffix=put(_n_, 5.);
call symputx(cats('Name', suffix), Name);
call symputx(cats('Type', suffix), Type);
run;

%do i=1 %to &cntvar;
proc freq data=&dsn noprint;
tables &&name&i /missing nopercent nocum nofreq nopercent
out=out&i(drop=percent rename=(&&name&i=value));
format &&name&i
%if &&type&i=2 %then
%do;
$missfmt. %end;
%else
%do;
missfmt. %end;
;
run;

data out&i;
set out&i;
varname="&&name&i";
%if &&type&i=1 %then
%do;
value1=put(value, missfmt.);
%end;
%else %if &&type&i=2 %then
%do;
value1=put(value, $missfmt.);
%end;
drop value;
rename value1=value;
run;
%end;
data final;
set %do i=1 %to &cntvar;
out&i %end;
;
run;

proc transpose data=final out=combine(drop=_:);
by varname;
id value;
var count;
run;

proc print data=combine;
%mend;
%varCounts(dsn=hearts);

Download this SAS file.

Every week we'll send you SAS tips and in-depth tutorials

JOIN OUR COMMUNITY OF SAS Programmers!

Subhro

Subhro provides valuable and informative content on SAS, offering a comprehensive understanding of SAS concepts. We have been creating SAS tutorials since 2019, and 9to5sas has become one of the leading free SAS resources available on the internet.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

This Post Has 4 Comments

  1. sasuser

    Very Helpful!

    1. Subhro

      Thank you!!

  2. sasser

    Hi there, is there any way to prevent the ‘varname’ from being truncated?

    1. Subhro Kar

      yes you need to define the length for varname to maximum value you think is possible.