Proc Tabulate

Learn how to create Summary Reports using PROC Tabulate

  • Post author:
  • Post category:SAS/STATS
  • Post comments:1 Comment
  • Reading time:14 mins read

PROC TABULATE in SAS is used to display descriptive statistics in table format. PROC TABULATE has many statements that define how this procedure will summarize the data.

PROC MEANS is used to summarize numerical data, and PROC FREQ is used to obtain summaries for categorical data.

The PROC TABULATE procedure is a substitute for PROC MEANS and PROC FREQ.

The PROC TABULATE procedure must have a CLASS statement to specify any categorical variables to summarize. Any numerical variables must be included in the VAR statement of PROC TABULATE.

Proc Report and Proc Tabulate are both used to create summary tables. Although, Proc Report can create detail tables with COMPUTE Blocks which cannot be done using PROC Tabulate.

TABULATE produces three dimensions of reports: PAGE, ROW and COLUMN dimensions, using a TABLE statement along with a CLASS and VAR statement, which is used for calculating a summary function.

In PROC REPORT, everything is done with the COLUMN and DEFINE statements, along with COMPUTE blocks.

With PROC REPORT, you can have custom lines between groups, for example, “TOTAL SALES of SUV cars in USA Region”, whereas TABULATE Procedure does not have this feature.

For more details on which procedure to use in which scenarios, you can read the article on PROC TABULATE versus PROC REPORT.

The three statements are necessary for every Proc Tabulate step.

CLASS The CLASS statement is used to form groups within rows and columns.
VAR The VAR statements include the numeric variables which need to be summarized.
TABLE The TABLE statements contain the table definition.

The TABLE statement is the most important statement for the PROC TABULATE step. It has dimensions and definitions within those dimensions.

The tables generated by TABULATE can have up to three dimensions to their definition: page, row, and column.

These dimensions always appear in page, row, and column order.

PAGE Pages define how the individual pages are formed.
ROW It is the most used dimension and defines the table’s rows within each page.
COLUMN Columns are always present and define the columns within rows and pages.

There has to be at least one column dimension, and you cannot have a page dimension without having both row and column dimensions.

The general syntax of the TABLE statement is

table page, row, column;

To build the individual page with row and column dimensions, you must use a combination of options or elements. The three types of options are:

SINGULAR It is used when a single element is needed.
CONCATENATED Using this option, you can join or concatenate multiple elements using a space.
NESTED One element is nested within another to form a hierarchy using an asterisk.

Singular Elements

A singular element consists of a single variable. A basic example of a singular table is below.

title "Single Column Table Example";
proc tabulate data=sashelp.class; 
class sex; 
var weight;
table sex,weight;  
run; 
Proc Tabulate Single Column
Proc Tabulate Single Column Example

The analysis variable, Weight, is specified in the VAR statement and a single column.
Since there is no statistic specified, the default statistic of SUM is displayed.

Concatenated Elements

Using the concatenated statement, you can combine multiple elements within columns and/or rows. A concatenated definition is formed when two or more space-separated elements are included in the same dimension.

<span class="token keyword">title</span> <span class="token string">"Concatenated Elements Example";</span>proc tabulate <span class="token keyword">data=</span>sashelp<span class="token punctuation">.</span>cars <span class="token format"><span class="token keyword">format=dollar8.</span>;</span> <span class="token keyword">class</span> Origin Type<span class="token punctuation">;</span> <span class="token keyword">var</span> MSRP INVOICE<span class="token punctuation">;</span> <span class="token keyword">table</span> Origin Type<span class="token punctuation">,</span> MSRP INVOICE<span class="token punctuation">;</span> <span class="token step keyword">run;</span>
Concatenated Elements in Proc Tabulate
Concatenated Elements in Proc Tabulate

The above example has two classification variables(Origin and Type). The label associated with each analysis variable is used in the column header by default.
The analysis and classification variables can be used in page, row, or column dimensions.

Nested Elements

Nested definitions allow us to create tables within tables. The nested elements can be classification variables, analysis variables, statistics, options, and modifiers.

These are designated as nested elements through the use of the asterisk.

title "Nested Elements Example";
proc tabulate data=sashelp.cars; 
class ORIGIN TYPE; 
var MSRP; 
table ORIGIN,TYPE*MSRP*(N max min);
run;
Nested Elements in Proc Tabulate
Nested Elements in Proc Tabulate

In the above example. the row dimension(ORIGIN) is singular, and the column dimension (TYPE) has the analysis variable MSRP nested within a classification variable TYPE.

Three space-separated statistics are concatenated into a group with parenthesis and then the group is nested under the variable TYPE.

Combination of Elements

TABLE statement can contain a combination of nested and concatenated elements.
These will include not only variables and statistics but options as well.

proc tabulate data=sashelp.cars;
 class Origin Type;
 var MSRP;
 table Type='Type of cars' all='Total' origin all='Total', 
  msrp='Price in dollars'*(n max*f=dollar8. min*f=dollar8.
 mean) /box='Type of cars across regions';
 keylabel n='Number of cars' max="Maximum Price" min='Minimum Price' 
  mean='Average Price';
run;
Combination of Elements in Proc Tabulate
Combination of Elements in Proc Tabulate

The table definition has two concatenated elements(Type and Origin) in the row dimension.
The ALL keyword summarizes across the associated elements: Type and Origin.

Dollar Format(f=dollar8.) is used for Max and Min Statistics.

The BOX= option adds the text in the upper left corner of the table.

The KEYLABEL statement allows you to assign a text label to statistics and the keyword ALL.

Calculating Percentages Using PROC TABULATE

The PCTN and PCTSUM options request the calculation of percentages based on the denominator specified using angle brackets.

PCTN calculates the percentages based on counts (N), while PCTSUM calculates the percentages based on the total of an analysis variable.

Percentage based on counts

<span class="token step keyword">proc tabulate</span> <span class="token proc-args"><span class="token arg keyword">data=sashelp.cars;</span></span> <span class="token keyword">class</span> origin drivetrain<span class="token punctuation">;</span> <span class="token keyword">table</span> origin<span class="token punctuation">,</span>drivetrain <span class="token operator">*(</span>n pctn<span class="token operator"><</span>drivetrain<span class="token operator">>='%')</span> all<span class="token punctuation">;</span>run<span class="token punctuation">;</span>
Percentages Using PROC TABULATE
Percentages Using PROC TABULATE

Within each value of ORIGIN, the percentage of observations for each value of DRIVETRAIN is calculated.

Since PCTN is nested within ORIGIN, the denominator(DRIVETRAIN) is the total count for that ORIGIN value.

From the above table, we can interpret that 21.52% (out of 158 ) of All-Wheel drive is from Asia.

Percentage Generation Statistics

Instead of providing denominators in angular brackets, you can also use several percentage generation statistics. For these statistics, the denominator is predetermined.

The denominator can be based on a report, page or column.

Percentage applies to: Per cent Frequency Per cent Total
Report reppctn reppctsum
Page pagepctn pagepctsum
Column colpctn colpctsum
Row rowpctn rowpctsum
proc tabulate data=sashelp.cars(where=(upcase(type) in ('SUV','SEDAN','SPORTS')));
class origin drivetrain type;
tables origin*(type all)all,
drivetrain*(n colpctn rowpctn pctn<type all>="Type PCTN" pctn<drivetrain>="DriveTrain PCTN") all(n colpctn rowpctn pctn<type all>);
run;
Learn how to create Summary Reports using PROC Tabulate

Percentage based on the total analysis variable

Proc Format;
inValue status 
'Dead'  = 1
'Alive' = 0;
Run;
data heart;
set sashelp.heart;
Status_num= input( status, status. );
run;

proc tabulate data=heart;
class BP_Status sex;
var status_num;
table sex * (sum='Deaths' pctsum<bp_status>='% of row' pctsum<sex>='% of column' pctsum) all,bp_status*status_num;
run;
Percentage based on the total of analysis variable
Percentage based on the total of analysis variable

<bp_status>sums the values of Deaths for all occurrences of BP_STATUS within the same value of SEX. Thus, for SEX=”FEMALE”, the denominator is 529+ 248+82 = 859 and the row percentage is 529/859*100 =61.58.

<BP_STATUS>sums the frequency counts for all occurrences of SEX within the same value of BP_STATUS. Thus, for BP_STATUS=”HIGH”, the denominator is 529+602 =1131 and the Percentage is 529/1131*100 = 46.77%

The third use of PCTN has no denominator definition; therefore, all class variables (BP_STATUS and SEX)are included in the denominator definition.

Thus, for all cells, the denominator is 529 + 602 + 248 + 379 + 82 + 82 = 1922 and the percentage is 529/1922*100 = 27.52%

You can also read Tips for Generating Percentages Using the SAS TABULATE Procedure for more advanced usage of PROC Tabulate.

 

Every week we'll send you SAS tips and in-depth tutorials

JOIN OUR COMMUNITY OF SAS Programmers!

Subhro

Subhro provides valuable and informative content on SAS, offering a comprehensive understanding of SAS concepts. We have been creating SAS tutorials since 2019, and 9to5sas has become one of the leading free SAS resources available on the internet.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

This Post Has One Comment

  1. Ahmed

    I want tables for frequency meter px3