# One Way Anova in SAS

One-way ANOVA is a parametric test used to compare the means of three or more independent (unrelated) groups and to determine whether any of the differences between those means are statistically significant.

This article will provide a brief introduction to the one way ANOVA in SAS, including the assumptions of the test and when you should use this test.

## What is Anova and why do we need it?

When the objective is to test the mean of one group or to compare the means of two groups, a one-sample or two-sample t-test is used. When more than two groups must be compared simultaneously, analysis of variance (ANOVA) is used instead.

Pairwise t-tests could be used in the same scenario, but running a t-test for every pair of groups is laborious and time-consuming. A further drawback is that the overall probability of a Type I error (a false positive) grows as the number of tests increases, even though each individual test is run at the chosen level of significance.
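The inflation can be made concrete with a short calculation. As a simplifying assumption, treat the $m$ pairwise tests as independent, each run at level $\alpha$; the symbols $k$ (number of groups), $m$ and $\alpha_{FW}$ (family-wise error rate) are introduced here for illustration only.

```latex
\begin{aligned}
m &= \frac{k(k-1)}{2}, \qquad \alpha_{FW} = 1 - (1 - \alpha)^{m} \\
k = 3,\ \alpha = 0.05:\quad m &= 3, \qquad \alpha_{FW} = 1 - 0.95^{3} \approx 0.143
\end{aligned}
```

Under this assumption, three pairwise tests at the 5% level already carry roughly a 14% chance of at least one false positive, which is the problem a single ANOVA F test avoids.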

Ronald Fisher developed the analysis of variance approach in 1918 in order to obtain quicker findings.

ANOVA, or analysis of variance, compares the variation between the group means with the variation within the groups and uses the F-distribution to evaluate whether the population means are equal.
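In symbols, the F statistic is the ratio of the between-group mean square to the within-group mean square. The notation below ($k$ groups, $n_i$ observations in group $i$, $N$ observations in total, group means $\bar{X}_i$, grand mean $\bar{X}$) is an assumption made here for illustration and is not taken from the original text.

```latex
F = \frac{MS_{\text{between}}}{MS_{\text{within}}}
  = \frac{\sum_{i=1}^{k} n_i \left(\bar{X}_i - \bar{X}\right)^2 / (k - 1)}
         {\sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(X_{ij} - \bar{X}_i\right)^2 / (N - k)}
```

When the population means are equal and the ANOVA assumptions hold, this ratio follows an F-distribution with $k - 1$ and $N - k$ degrees of freedom, so a large value of $F$ is evidence against equal means.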

Analysis of variance is regarded as one of the most powerful techniques to perform when working with data that can be categorized into groups. Excel, SAS, SPSS, Minitab, and R programming can be used to perform variance analysis.

When the analysis of variance shows a difference among the group means, the next step is to identify which of the means is causing the difference. Several post hoc tests can be performed for this purpose.

The one-way and two-way analyses of variance are generally implemented in real-world data sets.

The term “one-way ANOVA” refers to a single independent variable (factor) with two or more levels, whereas “two-way ANOVA” refers to two independent variables, each with two or more levels or groups.

To carry out the analysis of variance, it is essential to check for the following assumptions:

- **Normality:** The population should follow a normal or approximately normal distribution.
- **Independence:** Samples selected from the population should be independent.
- **Homogeneity of variances:** The population variances of the groups should be equal.
- **Additivity:** The effects are additive, which means that each observation $X_{ij}$ is the sum of three components:

$$X_{ij} = \mu + \tau_i + \varepsilon_{ij}$$

where $\mu$ is the overall mean, $\tau_i$ is the treatment effect of group $i$, and $\varepsilon_{ij}$ is the random error.

The hypotheses for the analysis of variance are:

H_{0}: All group means are equal.

H_{1}: At least one group mean differs from the others.

## ANOVA with Repeated Measures

Repeated-measures ANOVA is similar to one-way analysis of variance, except that the same subjects are measured repeatedly under different conditions or at different points in time.

For example, a group of patients on medication has had their blood sugar level monitored over the previous six months. The blood sugar level is the dependent variable in this situation.

Repeated-measures ANOVA is used when the same subjects are observed repeatedly. It suits studies that must keep costs low, because no additional subjects are needed for each condition.

Also, because the sample is not divided into groups, the sample size stays constant. Most significantly, when data are gathered repeatedly from the same subjects over time, differences between subjects contribute less to the unexplained variation.

The assumptions that must hold when conducting an analysis of variance with repeated measures are:

- The dependent variable should be continuous, and the independent variable should be categorical.
- The variances of all the groups should be equal. For repeated measures this is usually checked as sphericity, meaning that the variances of the differences between every pair of conditions are equal.
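A repeated-measures analysis of this kind can be sketched in SAS with the `REPEATED` statement of `PROC GLM`. The data set `glucose`, its variables `patient` and `m1`-`m3`, and every value in it are hypothetical and made up for illustration; three monthly readings are used instead of six to keep the sketch short.

```sas
/* Hypothetical wide-format data: one row per patient,
   one column per monthly reading (all values are invented). */
data glucose;
    input patient m1 m2 m3;
    datalines;
1 150 142 138
2 160 155 149
3 145 147 140
4 170 161 158
5 155 150 151
;
run;

proc glm data=glucose;
    /* No between-subject effects; NOUNI suppresses the separate
       univariate analysis of each monthly column. */
    model m1 m2 m3 = / nouni;
    /* MONTH is the within-subject factor with 3 levels;
       PRINTE requests the sphericity test. */
    repeated month 3 / printe;
run;
quit;
```

The wide layout is a design choice: `REPEATED` expects one row per subject with one column per occasion, which is the opposite of the long layout used for the one-way example in this article.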

## One Way Anova test in SAS

To study an example of one-way ANOVA in SAS, let’s consider a data set containing the scores of three groups of students from different majors. The hypotheses for this study are:

H_{0}: There is no significant difference between the mean scores of the three groups of students.

H_{1}: At least one group mean differs from the others.

In the example below, *Score* is the dependent (target) variable and the student group is the independent variable.

*Level of significance = 0.05*

*Test statistic:* the F statistic of the one-way ANOVA.

One-way ANOVA can be conducted with the PROC ANOVA procedure. The first step in conducting an analysis of variance is to load or enter the data.

**Step 1:** Create the sample data

```sas
/* Wide format: each row holds one score for each of the three groups (f1-f3) */
data student_scores;
    input f1 f2 f3;
    datalines;
17 12 13
23 23 31
25 34 25
25 29 21
26 25 20
25 24 43
18 31 23
15 32 45
21 18 12
34 15 13
42 19 27
32 12 32
;
run;
```

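**Step 2:** Transform the data if required.

After creating the data set, we need to reshape it from wide format, where each observation holds one score per group, to long format, where each observation holds a single score and its group label.

```sas
data student_scores2;
    set student_scores;
    array f{3} f1-f3;
    do ind = 1 to 3;
        scores = f{ind};           /* score belonging to group ind */
        f_ind  = cats("f", ind);   /* group label: f1, f2 or f3 */
        output;                    /* write one row per score */
    end;
    keep scores f_ind;
run;

/* Sort by group label */
proc sort data=student_scores2;
    by f_ind;
run;
```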
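A quick way to confirm that the reshaping worked is to count the observations per group. This minimal sketch assumes the `student_scores2` data set from Step 2 already exists; since the input has 12 rows and three score columns, each of the groups `f1`, `f2` and `f3` should appear with a frequency of 12.

```sas
/* Count the observations in each group of the long-format data */
proc freq data=student_scores2;
    tables f_ind / nocum nopercent;
run;
```

Equal counts also confirm that the design is balanced.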

**Step 3:** Check the assumptions

Before carrying out the analysis, it is important to understand the nature of the data and to check the assumptions of the test. For ANOVA, we need to check the normality assumption.

The `PROC MEANS` procedure can be used to obtain the basic statistics of the data set. Normality can be checked visually by plotting a box plot.

```sas
/* Summary statistics per group */
proc means data=student_scores2 mean median min max q1 q3 maxdec=2 nonobs;
    class f_ind;
run;
```

```sas
/* One box plot of the scores per group */
proc sgplot data=student_scores2;
    vbox scores / category=f_ind;
run;
```

According to the summary statistics and the box plot, the scores in each group can be treated as approximately normal, since the median sits near the middle of each box and the whiskers are of roughly equal length. Since the most important assumption has been met, we can perform the analysis of variance in SAS.
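A visual check can be supplemented with formal tests. The sketch below is one possible approach rather than part of the original analysis: the `NORMAL` option of `PROC UNIVARIATE` requests normality tests for each group, and the `HOVTEST=LEVENE` option on the `MEANS` statement of `PROC GLM` requests Levene's test for homogeneity of variances. It assumes the `student_scores2` data set from Step 2.

```sas
/* Normality tests (including Shapiro-Wilk) for each group */
proc univariate data=student_scores2 normal;
    class f_ind;
    var scores;
run;

/* Levene's test for equal variances across the groups */
proc glm data=student_scores2;
    class f_ind;
    model scores = f_ind;
    means f_ind / hovtest=levene;
run;
quit;
```

In both tests a P-value above the chosen significance level means the assumption is not rejected. With only 12 scores per group, these tests have limited power, so they are best read alongside the box plot.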

The ANOVA for one independent variable with three groups is carried out in step 4. In its output, the error (residual) sum of squares is SS = 2618.833, with 33 degrees of freedom (36 observations minus 3 groups).

**Step 4:** Conduct the ANOVA test.

SAS provides the `PROC ANOVA` procedure for performing analysis of variance. Two statements are needed here: a `CLASS` statement, which names the categorical variable, and a `MODEL` statement, which specifies the structure of the model.

```sas
proc anova data=student_scores2;
    title "Example of one-way ANOVA";
    class f_ind;             /* grouping (independent) variable */
    model scores = f_ind;    /* dependent variable = group effect */
run;
quit;
```

The output begins with the “Class Level Information” table. It lists the variables that appear in the CLASS statement, their levels, and the number of observations in the data set.

The next set of outputs displays the ANOVA table, followed by some simple statistics and tests of effects.

The PROC ANOVA output shows the F test for the analysis of variance as well as the P-value. According to the decision rule, if the P-value is less than the pre-assigned alpha value, we reject H_{0} and conclude that there are significant differences between the means of the groups.

Here the P-value (0.731) is greater than the pre-assigned alpha level (α = 0.05). Hence, we fail to reject H_{0} and conclude that there is no evidence of a significant difference between the group means.
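The reported figures can be reproduced by hand as a check. The between-group sum of squares below is computed from the data listed in Step 1 (group totals 303, 274 and 305, with 12 scores each, giving a grand total of 882) and is not quoted in the article itself, so treat it as a worked assumption. The error sum of squares and its degrees of freedom are the values reported earlier.

```latex
\begin{aligned}
SS_{\text{between}} &= \frac{303^2 + 274^2 + 305^2}{12} - \frac{882^2}{36} \approx 50.167,
  & df_{\text{between}} &= 3 - 1 = 2 \\
SS_{\text{error}} &= 2618.833,
  & df_{\text{error}} &= 36 - 3 = 33 \\
F &= \frac{50.167 / 2}{2618.833 / 33} \approx \frac{25.083}{79.359} \approx 0.316
\end{aligned}
```

An F value well below 1 means the group means differ by less than the scatter within the groups would lead one to expect, which is consistent with the P-value of 0.731.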

When ODS Graphics is enabled and you fit a one-way analysis of variance model, the PROC ANOVA output includes a box plot of the dependent variable values within each classification level of the independent variable.
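As a minimal sketch, ODS Graphics can be switched on explicitly before the procedure runs. The `MEANS` statement with the `TUKEY` option is an optional addition, not part of the original example: it requests Tukey's pairwise comparisons, one of the post hoc tests that become relevant when the F test is significant. The data set `student_scores2` from Step 2 is assumed to exist.

```sas
ods graphics on;   /* enable graphical output, including the box plot */

proc anova data=student_scores2;
    class f_ind;
    model scores = f_ind;
    means f_ind / tukey;   /* optional post hoc pairwise comparisons */
run;
quit;

ods graphics off;
```

In this example the F test is not significant, so the Tukey comparisons are shown only to illustrate the syntax.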

## Conclusion

Analysis of variance is a powerful statistical technique for testing multiple means simultaneously instead of testing each pair individually. Its assumptions need to be fulfilled in order to carry out the analysis. It gives valid and faster results and is highly useful when the data can be divided into categories. ANOVA keeps the Type I error under control, and because it is a parametric test, it provides valid results only if the assumption of normality is fulfilled.
