# One-Way ANOVA in SAS

One-way ANOVA is a parametric test. It compares the means of two or more independent (unrelated) groups to determine whether there are any statistically significant differences between them. In practice it is most often used with three or more groups.

This article will provide a brief introduction to the one way ANOVA in SAS, including the test’s assumptions and when you should use this test.

## What is ANOVA and why do we need it?

In statistical analysis, a t-test (one-sample or two-sample) is used when the objective is to test the mean of one group or to compare the means of two groups. When the researcher needs to compare more than two groups simultaneously, the analysis of variance is used.

T-tests could be used to obtain findings for the same scenario by testing every pair of groups. Still, running a t-test on each pair is laborious and time-consuming. Another downside is that the overall probability of a Type I error (a false positive) increases as the number of tests increases, even though each individual test is run at the chosen level of significance.
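The size of this inflation can be approximated with a short calculation. The sketch below assumes, for simplicity, that the pairwise tests are independent (they are not exactly independent in practice, so the figures are only indicative). The symbols $k$ (number of groups), $m$ (number of pairwise tests) and $\alpha$ (per-test significance level) are introduced here for illustration.

$$
m = \frac{k(k-1)}{2}, \qquad P(\text{at least one false positive}) = 1 - (1 - \alpha)^m
$$

With $\alpha = 0.05$ and $k = 3$ groups, $m = 3$ and the overall error rate is $1 - 0.95^3 \approx 0.143$. With $k = 5$ groups, $m = 10$ and it rises to $1 - 0.95^{10} \approx 0.401$.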

Ronald Fisher developed the analysis of variance approach in 1918. It evaluates all groups in a single test instead of one test per pair.

The ANOVA, or analysis of variance, compares the variation between the group means with the variation within the groups and uses the F-distribution to evaluate whether the population means are equal.
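For a one-way layout, the F statistic can be written out as follows. The notation is an assumption made for this sketch: $k$ groups, $n_j$ observations in group $j$, $N$ observations in total, $\bar{X}_j$ the mean of group $j$ and $\bar{X}$ the overall mean.

$$
F = \frac{SS_{between}/(k-1)}{SS_{within}/(N-k)}, \qquad
SS_{between} = \sum_{j=1}^{k} n_j\,(\bar{X}_j - \bar{X})^2, \qquad
SS_{within} = \sum_{j=1}^{k}\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2
$$

Under the hypothesis of equal population means, $F$ follows an F-distribution with $k-1$ and $N-k$ degrees of freedom, so a large value of $F$ is evidence against equal means.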

Analysis of variance is regarded as one of the most powerful techniques to perform when working with data that you can categorize into groups. You can use Excel, SAS, SPSS, Minitab, and R programming to perform variance analysis.

When the analysis of variance indicates a difference between the group means, the next step is to determine which of the means is causing the difference. For this, numerous post-hoc tests can be performed.

The one-way and two-way analyses of variance are the forms most commonly applied to real-world data sets.

The term “one-way ANOVA” refers to a single independent variable (factor) with two or more levels, whereas “two-way ANOVA” refers to two independent variables, each with two or more levels or groups.

To carry out the analysis of variance, it is essential to check for the following assumptions:

• Normality: The population from which each group is sampled should be normally distributed, or close to it.
• Independence: The samples selected from the population should be independent.
• Homogeneity of variances: The population variances of the groups must be equal.
• Additivity: The effects are additive, which means that each observation $X_{ij}$ is the sum of three components: $X_{ij} = \mu + t_j + E_{ij}$

Where $\mu$ is the overall mean, $t_j$ is the effect of treatment $j$, and $E_{ij}$ is the random error.

The hypotheses for the analysis of variance, stated for the student-score example used later in this article, are:

H0: There is no difference between the mean scores of the student groups, or $H_0 : \mu_1 = \mu_2 = \mu_3$

H1: At least one of the group means differs from the others.

## ANOVA with Repeated Measures

ANOVA with repeated measures resembles a one-way analysis of variance, except that the same subjects are measured repeatedly under different conditions or at different points in time.

For example, a group of patients on medication has their blood glucose level monitored over the previous six months. The glucose level is the dependent variable in this situation.

The analysis of variance with repeated measures is used when the same subjects are observed repeatedly, which is often the case when the cost of the study must be kept low.

Also, because the sample is not divided into groups, the sample size stays constant. Most significantly, when data from the same subjects are gathered repeatedly over time, the variation caused by differences between subjects diminishes.

The assumptions that must hold when analyzing variance with repeated measures are:

• The dependent variable should be continuous, and the independent variable should be categorical.
• The variances of all the groups should be equal.
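A repeated-measures analysis of this kind can be sketched in SAS with PROC GLM and its REPEATED statement. The data set name `glucose`, the variable names, and every reading below are invented purely for illustration; they are not taken from a real study.

```sas
/* Hypothetical wide-format data: one row per patient,
   one column per monthly glucose reading (values are made up) */
data glucose;
  input patient month1 month2 month3;
  datalines;
1 148 141 135
2 160 152 150
3 139 137 130
4 155 149 147
5 171 160 158
;
run;

proc glm data=glucose;
  /* No between-subjects factor, so the right-hand side is empty;
     NOUNI suppresses the separate univariate analysis of each month */
  model month1-month3 = / nouni;
  /* Declare a within-subjects factor named month with 3 levels */
  repeated month 3;
run;
quit;
```

The REPEATED statement treats the three columns as levels of one within-subjects factor, so each patient contributes a single row rather than three independent observations.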

## One-Way ANOVA test in SAS

To study an example of one-way ANOVA in SAS, let’s consider a data set containing the scores of three groups of students from different majors. The hypotheses for this study are:

H0: There is no difference between the mean scores of the three groups of students.

H1: At least one of the group means differs from the others.

In the example below, the score is the dependent (target) variable and the student group is the independent variable.

Level of significance: α = 0.05

Test Statistics:

One-way ANOVA can be conducted with the ANOVA procedure (PROC ANOVA). The first step in an analysis of variance is to load or enter the data.

Step 1: Create the sample data

data student_scores;
input f1 f2 f3;
datalines;
17 12 13
23 23 31
25 34 25
25 29 21
26 25 20
25 24 43
18 31 23
15 32 45
21 18 12
34 15 13
42 19 27
32 12 32
;
run;


Step 2: Transform the data if required.

After creating the data set, we need to reshape it from wide format (one observation holding the scores of all three groups) to long format (one observation per score, with a variable that identifies the group).

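/* Reshape from wide (f1-f3 on one row) to long (one score per row) */
data student_scores2;
set student_scores;
array f{3} f1-f3;
do ind=1 to 3;
scores=f{ind};          /* score of group number ind */
f_ind=cats("f", ind);   /* group label: f1, f2 or f3 */
output;                 /* write one row per score */
end;
keep scores f_ind;
run;

/* Sort by group so that the rows of each group are stored together */
proc sort data=student_scores2;
by f_ind;
run;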
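A quick way to confirm that the reshaping worked is to count the rows per group. This is a minimal sketch that assumes the `student_scores2` data set created above.

```sas
/* Count the observations in each group of the long data set */
proc freq data=student_scores2;
  tables f_ind / nocum;   /* NOCUM hides the cumulative columns */
run;
```

Because the original data set has 12 rows and each row produces three output rows, each of f1, f2 and f3 should show a frequency of 12, for 36 observations in total.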

Step 3: Check the assumptions

Before carrying out the analysis, it is essential to understand the nature of the data and to check the assumptions of the analysis. For ANOVA, we need to check the normality assumption.

You can use PROC MEANS to check the basic statistics of the data set. Normality can be checked informally by plotting a box plot.

proc means data=student_scores2 mean median min max q1 q3 maxdec=2 nonobs;
class f_ind;
run;

proc sgplot data=student_scores2;
vbox scores /category=f_ind;
run;

According to the summary statistics and the box plot, the scores in each group appear approximately normal, since the median lies near the middle of each box and the whiskers are of similar length. Therefore, since the most crucial assumption appears to be met, we can carry out the analysis of variance in SAS.

The ANOVA for one independent variable with three groups is shown in step 4. The residual (error) sum of squares is 2618.833 with 33 degrees of freedom.
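A box plot is only a visual check. As a complement, a formal normality test can be requested for each group. The sketch below assumes the `student_scores2` data set from Step 2 and uses PROC UNIVARIATE, which is not part of the original walkthrough.

```sas
/* Normality tests (including Shapiro-Wilk) and a Q-Q plot per group */
proc univariate data=student_scores2 normal;
  class f_ind;
  var scores;
  /* Compare each group with a normal distribution whose mean and
     standard deviation are estimated from the data */
  qqplot scores / normal(mu=est sigma=est);
run;
```

In the "Tests for Normality" table, a p-value above the chosen significance level means the test found no evidence against normality for that group. With only 12 observations per group, such tests have limited power, so they are best read together with the plots.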

Step 4: Conduct the ANOVA test

SAS provides a procedure called PROC ANOVA, which allows us to perform an analysis of variance.

This example uses two PROC ANOVA statements: the CLASS statement names the categorical (grouping) variable, and the MODEL statement specifies the structure of the model, with the dependent variable on the left of the equals sign and the effect on the right.

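proc anova data=student_scores2;
title "Example of one-way ANOVA";
class f_ind;            /* categorical grouping variable */
model scores=f_ind;     /* dependent variable = classification effect */
run;
quit;                   /* PROC ANOVA is interactive; QUIT ends it */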

The first table in the output is “Class Level Information”. It lists the variables that appear in the CLASS statement, their levels, and the number of observations in the data set.

The next set of outputs displays the ANOVA table, followed by some simple statistics and tests of effects.

The PROC ANOVA output shows the F test for the analysis of variance and its p-value. According to the decision rule, if the p-value is less than the pre-assigned alpha value, we reject H0 and conclude that there are significant differences between the means of the groups.
Here the p-value (0.731) is greater than the pre-assigned alpha level (α = 0.05). Hence, we fail to reject H0 and conclude that there is no evidence of a significant difference between the groups.
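The reported values can be cross-checked by hand from the data in Step 1. The figures below are recomputed for illustration (group means, grand mean and between-group sum of squares are not quoted from the SAS output), using 12 scores per group, 3 groups and 36 scores in total.

$$
\begin{aligned}
\bar{X}_1 &= 25.25, \quad \bar{X}_2 \approx 22.8333, \quad \bar{X}_3 \approx 25.4167, \quad \bar{X} = 24.5 \\
SS_{between} &= 12\left[(25.25 - 24.5)^2 + (22.8333 - 24.5)^2 + (25.4167 - 24.5)^2\right] \approx 50.167 \\
F &= \frac{50.167 / 2}{2618.833 / 33} \approx \frac{25.083}{79.359} \approx 0.316
\end{aligned}
$$

An F value this far below 1, on 2 and 33 degrees of freedom, is consistent with the large p-value of 0.731: the variation between the group means is smaller than the variation within the groups.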

When ODS Graphics is enabled and you fit a one-way analysis of variance model, the ANOVA procedure output includes a box plot of the dependent variable values within each classification level of the independent variable.
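The run below is a sketch of how ODS Graphics, a homogeneity-of-variance check and a post-hoc comparison can be combined in one PROC ANOVA call. It assumes the `student_scores2` data set from Step 2; the MEANS statement and its options are additions for illustration and are not part of the original example.

```sas
ods graphics on;   /* enable the box plot described above */

proc anova data=student_scores2;
  class f_ind;
  model scores = f_ind;
  /* HOVTEST=LEVENE tests whether the group variances are equal;
     TUKEY requests Tukey's pairwise comparisons of the group means */
  means f_ind / hovtest=levene tukey;
run;
quit;

ods graphics off;
```

Pairwise comparisons such as Tukey's are normally interpreted only when the overall F test is significant. In this example it is not, so the Tukey output mainly serves to show where the comparisons would appear.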

## Conclusion

Analysis of variance is a powerful statistical technique for testing multiple means simultaneously instead of testing each pair individually. Its assumptions need to be fulfilled before the analysis is carried out. It is faster than running pairwise tests and is highly useful when the data can be categorized into groups. ANOVA keeps the Type I error rate under control and, because it is a parametric test, provides valid results only if its assumptions, including normality, are fulfilled.

