PROC TABULATE in SAS is used to display descriptive statistics in table format. PROC TABULATE has a number of statements that define how this procedure will summarize the data.
The PROC TABULATE procedure is a substitute for PROC MEANS and PROC FREQ.
Proc Report and Proc Tabulate are both used to create summary tables. Although, Proc Report can create detail tables with COMPUTE Blocks which cannot be done using PROC Tabulate.
In PROC REPORT, everything is done with the COLUMN and DEFINE statements, along with COMPUTE blocks.
With PROC REPORT, you can have custom lines between groups for example “TOTAL SALES of SUV cars in USA Region” whereas TABULATE Procedure does not have this feature.
For more details on which procedure to use on which scenarios, you can read the article on PROC TABULATE versus PROC REPORT.
The 3 statements are necessary for every Proc Tabulate step.
|CLASS||The CLASS statement is used to form groups within rows and columns.|
|VAR||The VAR statements include the numeric variables which need to be summarized.|
|TABLE||The TABLE statements contain the table definition.|
|PAGE||Pages defines how the individual pages are formed.|
|ROW||It is the most used dimension and it defines the rows of the table within each page.|
|COLUMN||Columns are always present and it define the columns within rows and pages.|
The general syntax of the TABLE statement is
table page, row, column;
|SINGULAR||It is used when a single element is needed.|
|CONCATENATED||Using this option you can join or concatenate multiple elements using a space.|
|NESTED||One element is nested within another to form a hierachy using an asterisk|
A singular element consists of a single variable. A basic example os singular table is below.
title "Single Column Table Example"; proc tabulate data=sashelp.class; class sex; var weight; table sex,weight; run;
The analysis variable, Weight, is specified in the VAR statement, and a single column.
Since there is no statistic specified, the default statistic of SUM is displayed.
You can combine multiple elements within columns and/or rows using the concatenated statement. A concatenated definition is formed when two or more space-separated elements are included in the same dimension.
title "Concatenated Elements Example"; proc tabulate data=sashelp.cars format=dollar8.; class Origin Type; var MSRP INVOICE; table Origin Type, MSRP INVOICE; run;
There are two classification variables(Origin and Type) in the above example. The label associated with each analysis variable is by default used in the column header.
The analysis and classification variables can be used in a page, row, or column dimensions.
These are designated as nested elements through the use of the asterisk.
title "Nested Elements Example"; proc tabulate data=sashelp.cars; class ORIGIN TYPE; var MSRP; table ORIGIN,TYPE*MSRP*(N max min); run;
Three space-separated statistics are concatenated into a group with parenthesis and then the group is nested under the variable TYPE.
Combination of Elements
TABLE statement can contain a combination of nested and concatenated elements.
These will include not only variables and statistics but options as well.
proc tabulate data=sashelp.cars; class Origin Type; var MSRP; table Type='Type of cars' all='Total' origin all='Total', msrp='Price in dollars'*(n max*f=dollar8. min*f=dollar8. mean) /box='Type of cars across regions'; keylabel n='Number of cars' max="Maximum Price" min='Minimum Price' mean='Average Price'; run;
The table definition has two concatenated elements(Type and Origin) in the row dimension.
The ALL keyword summarizes across the associated elements which are Type and Origin.
Dollar Format(f=dollar8.) is used for Max and Min Statistics.
The BOX= option adds the text in the upper left corner of the table.
The KEYLABEL statement allows you to assign a text label to statistics and to the keyword ALL
Calculating Percentages Using PROC TABULATE
The PCTN and PCTSUM options request the calculation of percentages based on the denominator specified using angle brackets.
PCTN calculates the percentages based on counts (N), while PCTSUM calculates the percentages based on the total of an analysis variable.
Percentage based on counts
proc tabulate data=sashelp.cars; class origin drivetrain; table origin,drivetrain *(n pctn<drivetrain>='%') all; run;
Within each value of ORIGIN, the percentage of observations for each value of DRIVETRAIN is calculated.
Since PCTN is nested within ORIGIN, the denominator(DRIVETRAIN) is the total count for that value of ORIGIN.
From the above table, we can interpret that, 21.52% (out of 158 ) of All-Wheel drive is from Asia
Percentage Generation Statistics
Instead of providing denominators in angular brackets, you can also use several percentage generation statistics. For these statistics, the denominator is predetermined.
The denominator can be based on report, page or column.
|Percentage applies to:||Per cent Frequency||Per cent Total|
proc tabulate data=sashelp.cars(where=(upcase(type) in ('SUV','SEDAN','SPORTS'))); class origin drivetrain type; tables origin*(type all)all, drivetrain*(n colpctn rowpctn pctn<type all>="Type PCTN" pctn<drivetrain>="DriveTrain PCTN") all(n colpctn rowpctn pctn<type all>); run;
Percentage based on the total of analysis variable
Proc Format; inValue status 'Dead' = 1 'Alive' = 0; Run; data heart; set sashelp.heart; Status_num= input( status, status. ); run; proc tabulate data=heart; class BP_Status sex; var status_num; table sex * (sum='Deaths' pctsum<bp_status>='% of row' pctsum<sex>='% of column' pctsum) all,bp_status*status_num; run;
<bp_status>sums the values of Deaths for all occurrences of BP_STATUS within the same value of SEX. Thus, for SEX=”FEMALE”, the denominator is 529+ 248+82 = 859 and the row percentage is 529/859*100 =61.58.
<BP_STATUS>sums the frequency counts for all occurrences of SEX within the same value of BP_STATUS. Thus, for BP_STATUS=”HIGH”, the denominator is 529+602 =1131 and the Percentage is 529/1131*100 = 46.77%
The third use of PCTN has no denominator definition and therefore all class variables (BP_STATUS and SEX)are included in the denominator definition.
Thus, for all cells, the denominator is 529 + 602 + 248 + 379 + 82 + 82 = 1922 and the percentage is 529/1922*100 = 27.52%
You can also read Tips for Generating Percentages Using the SAS TABULATE Procedure for more advanced usage of PROC Tabulate.