How to choose a statistical test?

If you come from a non-statistical background, the most confusing part of statistics is usually the basic statistical tests and how to choose between them.

In this article, we distinguish between the common statistical tests and the null hypothesis behind each one by outlining the circumstances under which a particular test should be used.

Null Hypothesis and Testing

Before comparing different tests and choosing a statistical test, you must understand what a null hypothesis is. A null hypothesis states that no significant difference exists in a set of given observations.

Null: The two sample means are equal.
Alternate: The two sample means are not equal.

To decide whether to reject a null hypothesis, a test statistic is calculated. This test statistic is then compared with a critical value, and the null hypothesis is rejected if the statistic is more extreme than the critical value.

In theory, hypothesis tests are based on the notion of critical regions: the null hypothesis is rejected if the test statistic falls in the critical region. The critical values are the boundaries of the critical region.

If the test is one-sided, there will be just one critical value, but in other cases (like a two-sided t-test), there will be two.

Critical Value

A critical value is a point on the scale of the test statistic past which we reject the null hypothesis and is derived from the level of significance alpha of the test.

The critical value sets how extreme the test statistic must be before we conclude that two sample means are unlikely to belong to the same distribution. The higher the critical value, the stronger the evidence needed to reject the null hypothesis.

The common critical value for a two-tailed test at the 5% significance level is 1.96, which relies on 95% of the normal distribution area being within 1.96 standard deviations of the mean.
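As a rough illustration, the critical values of a z-based test can be recovered from the significance level with Python's standard library. The variable names are our own, and the 5% level is simply the value used in this article.

```python
from statistics import NormalDist

alpha = 0.05               # significance level (assumed for illustration)
std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Two-tailed test: alpha is split equally between both tails.
two_tailed_critical = std_normal.inv_cdf(1 - alpha / 2)

# One-tailed (right-tailed) test: all of alpha sits in one tail.
one_tailed_critical = std_normal.inv_cdf(1 - alpha)

print(round(two_tailed_critical, 2))  # 1.96
print(round(one_tailed_critical, 3))  # 1.645
```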

Critical values can be used for hypothesis testing as follows:

  1. Calculate the test statistic.
  2. Calculate the critical values based on the significance level alpha.
  3. Compare the test statistic with the critical values.

If the test statistic is lower than the critical value, the null hypothesis is not rejected (loosely speaking, it is “accepted”); otherwise, it is rejected.
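The comparison step can be sketched in a few lines of Python. This is only a minimal sketch for a z-based statistic; the function name `reject_null` and its parameters are hypothetical and not part of any library.

```python
from statistics import NormalDist

def reject_null(test_statistic, alpha=0.05, two_sided=True):
    """Return True if a z-based test statistic falls in the critical region."""
    if two_sided:
        # Two critical values, -c and +c, so compare the absolute value.
        critical = NormalDist().inv_cdf(1 - alpha / 2)
        return abs(test_statistic) > critical
    # One-sided (right-tailed) test: a single critical value.
    critical = NormalDist().inv_cdf(1 - alpha)
    return test_statistic > critical

print(reject_null(2.3))  # True: 2.3 is beyond 1.96
print(reject_null(1.2))  # False: 1.2 is inside the acceptance region
```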

Before we move ahead with different statistical tests, it’s crucial to understand the difference between a sample and a population.

In statistics, “population” refers to the full set of observations that could be made. E.g., if I would like to calculate the average height of people on the earth, the “population” would be “all the people present on the earth”.

A sample, on the other hand, is a set of data collected through a pre-defined process. In the example above, it would be a small group of individuals chosen randomly from some parts of the earth.

To draw inferences about a population by validating a hypothesis on a sample, the sample must be random.

For instance, if we choose individuals randomly from all areas (Asia, America, Europe, Africa and so forth) on earth, our estimate will be close to the actual value. This estimate is the sample mean.

In contrast, if we select individuals solely from, say, the United States, our average height estimate will not be accurate; it will only represent the data of the selected area (United States).
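A small simulation makes the difference visible. Everything in it is made up for illustration: the four regions, their average heights and the sample sizes are assumptions, not real measurements.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Made-up regional height distributions in cm; not real measurements.
region_means = {"A": 160.0, "B": 165.0, "C": 170.0, "D": 175.0}
population = [
    (region, random.gauss(mu, 7.0))
    for region, mu in region_means.items()
    for _ in range(2500)
]
heights = [h for _, h in population]
population_mean = mean(heights)

# Random sample drawn from the whole population (all regions).
random_sample_mean = mean(random.sample(heights, 500))

# Sample drawn from a single region only.
region_d = [h for region, h in population if region == "D"]
biased_sample_mean = mean(random.sample(region_d, 500))

random_error = abs(random_sample_mean - population_mean)
biased_error = abs(biased_sample_mean - population_mean)
print(round(random_error, 2), round(biased_error, 2))
```

In this sketch, the random sample's estimate lands close to the population mean, while the estimate from the sample drawn from a single region does not.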

Such a sample is known as a biased sample and isn’t representative of the “population”.

Another important aspect to understand in statistics is “distribution”.

When the “population” is very large, it’s impractical to validate any hypothesis by calculating the mean value or test parameters on the entire population.

In such instances, the population is assumed to follow some kind of distribution. Common distributions include the normal, binomial and Poisson distributions, although there are many other types.

Determining the distribution type is important for deciding the critical value and the test to be chosen to validate any hypothesis.

Once you are clear on the concept of population, sample, and distribution, you will be able to understand different kinds of tests and the distribution types for which they’re used.

Relationship between p-value, critical value and test statistic

The critical value is a point beyond which we reject the null hypothesis. The p-value, on the other hand, is the probability to the right of the respective statistic (Z, T or chi-square), i.e. the probability of observing a value at least that extreme if the null hypothesis is true.

The advantage of using a p-value is that it gives a probability estimate that you can test at any desired level of significance by comparing it directly with the significance level.

E.g., assume the Z-value for a particular experiment comes out to be 1.67, which is greater than the one-sided critical value at 5%, which is 1.64.

To check at a different significance level, such as 1%, a new critical value would have to be calculated. However, if you calculate the p-value for 1.67, it comes to 0.047.

You can use this p-value to reject the hypothesis at a 5% significance level since 0.047 < 0.05. But at the more stringent significance level of 1%, the hypothesis cannot be rejected since 0.047 > 0.01.

The important point to observe here is that no double calculation is required.
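The numbers in this example can be checked with a few lines of Python, assuming a one-sided (right-tailed) z-test as in the example. The variable names are our own.

```python
from statistics import NormalDist

z = 1.67  # test statistic from the example
p_value = 1 - NormalDist().cdf(z)  # probability to the right of z

print(round(p_value, 3))  # 0.047

# The same p-value is reused for every significance level.
for alpha in (0.05, 0.01):
    decision = "reject" if p_value < alpha else "fail to reject"
    print(alpha, decision)
```

The p-value is computed once and then compared with each significance level, which is exactly why no second critical value is needed.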

Z-test

In a z-test, the sample is assumed to be normally distributed. A z-score is calculated with population parameters such as the “population mean” and “population standard deviation” and is used to validate the hypothesis that the sample drawn belongs to the same population.

Null: The sample mean is equal to the population mean
Alternate: The sample mean is not equal to the population mean

The statistic used for this hypothesis test is known as the z-statistic, the score for which is calculated as:

z = \frac{x - \mu}{\sigma / \sqrt{n}}

where,

x = sample mean
μ = population mean
σ = population standard deviation
n = sample size (so σ / √n is the standard error of the mean)

If the test statistic is lower than the critical value, the null hypothesis is not rejected; otherwise, it is rejected.
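Here is a minimal sketch of a two-sided one-sample z-test in Python. The function `z_test` and the numbers passed to it are hypothetical and only serve to illustrate the formula above.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, population_mean, population_sd, n, alpha=0.05):
    """Two-sided one-sample z-test; returns (z, p_value, reject_null)."""
    standard_error = population_sd / sqrt(n)
    z = (sample_mean - population_mean) / standard_error
    # Two-sided p-value: probability in both tails beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha

# Hypothetical numbers: a sample of 36 with mean 172, tested against
# a population mean of 170 and a population standard deviation of 6.
z, p_value, reject = z_test(172.0, 170.0, 6.0, 36)
print(round(z, 2), round(p_value, 4), reject)  # 2.0 0.0455 True
```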

T-test

A t-test is used to compare the means of two given samples. Like a z-test, a t-test also assumes the sample is normally distributed. A t-test is used when the population parameters (mean and standard deviation) are unknown. There are three variations of the t-test:

  1. Independent samples t-test, which compares the means of two groups
  2. Paired sample t-test, which compares means from the same group at different times
  3. One sample t-test, which tests the mean of a single group against a known mean.

The statistic for this hypothesis testing is known as t-statistic, the score for which is calculated as:

t = \frac{x_1 - x_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}

where

x_1 = mean of sample 1
x_2 = mean of sample 2
s_1 = standard deviation of sample 1
s_2 = standard deviation of sample 2
n_1 = size of sample 1
n_2 = size of sample 2
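As a sketch, the t-statistic for two independent samples can be computed with the standard library. Note the assumption: this uses the unequal-variance (Welch) form, in which each sample's own variance is divided by its size and the two terms are added under a square root. The function name `welch_t` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample1, sample2):
    """Return (t, degrees_of_freedom) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    # Sample variance divided by sample size, for each sample.
    v1, v2 = variance(sample1) / n1, variance(sample2) / n2
    t = (mean(sample1) - mean(sample2)) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical measurements, for illustration only.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.1, 4.4, 4.0, 4.7, 4.3]
t, df = welch_t(group_a, group_b)
print(round(t, 2), round(df, 1))
```

The standard library has no t-distribution, so turning `t` and `df` into a p-value or critical value would need a t-table or a statistics package.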

There are several variations of t-tests which you can find here.

ANOVA

ANOVA, also referred to as analysis of variance, is used to compare multiple (three or more) samples with a single test.

There are two main flavours of ANOVA.

  • One-way ANOVA compares the difference between three or more samples/groups of a single independent variable.
  • MANOVA allows us to test the effect of one or more independent variables on two or more dependent variables. In addition, MANOVA can detect differences in correlation between dependent variables given the groups of independent variables.

The hypothesis being tested in ANOVA is

Null: All pairs of samples are identical, i.e. all sample means are equal
Alternate: At least one pair of samples is significantly different.

In this case, the statistic used to measure significance is known as the F-statistic.

The F value is calculated using the below formula:

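F = \frac{(SSE_1 - SSE_2)/m}{SSE_2/(n-k)}

where,

SSE_1 = residual sum of squares of the restricted model
SSE_2 = residual sum of squares of the unrestricted model
m = number of restrictions
n = number of observations
k = number of independent variables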

Many tools exist, such as SPSS, R packages, Excel, etc., to perform ANOVA on a given sample.
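For readers without such a tool at hand, here is a minimal sketch of the one-way ANOVA F-statistic in plain Python. It rests on our own reading of the formula above: SSE1 is taken as the residual sum of squares around a single common mean, SSE2 as the residual sum of squares around each group's own mean, m as the number of groups minus one, and k as the number of groups. The function name and data are hypothetical.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the F-statistic for a one-way ANOVA over a list of samples."""
    values = [x for group in groups for x in group]
    n, k = len(values), len(groups)
    grand_mean = mean(values)
    # SSE1: residuals around one common mean ("all means equal" model).
    sse_restricted = sum((x - grand_mean) ** 2 for x in values)
    # SSE2: residuals around each group's own mean.
    sse_unrestricted = 0.0
    for group in groups:
        group_mean = mean(group)
        sse_unrestricted += sum((x - group_mean) ** 2 for x in group)
    m = k - 1  # restrictions needed to force k group means to be equal
    return ((sse_restricted - sse_unrestricted) / m) / (sse_unrestricted / (n - k))

# Hypothetical data: three small groups.
f_value = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_value, 2))  # 13.0
```

A large F value, as here, suggests that at least one group mean differs from the others. The decision itself still requires comparing F with a critical value from the F-distribution.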

Chi-Square Test

The chi-square test is used to compare categorical variables. There are two kinds of chi-square tests.

  1. The goodness of fit test determines whether a sample matches the population.
  2. The chi-square test of independence compares two variables in a contingency table to check whether they are related.
  • A small chi-square value implies that the observed data fits what is expected.
  • A high chi-square value implies that the observed data doesn’t fit what is expected.

Null: Variable A and Variable B are independent.
Alternate: Variable A and Variable B aren’t independent.

The statistic used to measure significance, in this case, is known as the chi-square statistic.

The formula used for calculating the statistic is:

\chi^2 = \sum_{r,c} \frac{(O_{r,c} - E_{r,c})^2}{E_{r,c}}

where

O_{r,c} = observed frequency count at level r of Variable A and level c of Variable B
E_{r,c} = expected frequency count at level r of Variable A and level c of Variable B.
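The calculation can be sketched in plain Python for a contingency table. The expected count for each cell is assumed to be (row total × column total) / grand total, which is the usual choice under the null hypothesis of independence. The function name and the table of counts are hypothetical.

```python
def chi_square_statistic(observed):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for r, row in enumerate(observed):
        for c, o in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[r] * col_totals[c] / total
            chi2 += (o - expected) ** 2 / expected
    return chi2

# Hypothetical 2x2 table of counts.
table = [[10, 20], [20, 10]]
chi2 = chi_square_statistic(table)
print(round(chi2, 2))  # 6.67
# 3.841 is the 5% critical value for 1 degree of freedom.
print(chi2 > 3.841)    # True, so independence is rejected at the 5% level
```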

Key Takeaway

As the above examples show, in all of the tests a statistic is compared with a critical value to decide whether to reject a hypothesis.

However, the statistic and the way it is calculated differ depending on the variable type, the number of samples being analyzed and whether the population parameters are known.

Thus, the appropriate test and its null hypothesis are chosen based on these factors.
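To summarise the choice as code, here is a deliberately simplified sketch. The helper `choose_test` is hypothetical and only encodes the rules of thumb from this article; real analyses involve more checks (sample size, normality, paired data and so on).

```python
def choose_test(variable_type, n_groups=1, population_sd_known=False):
    """Suggest a test from the factors discussed in this article."""
    if variable_type == "categorical":
        return "chi-square test"
    if n_groups >= 3:
        return "ANOVA"
    # One or two groups of a continuous variable.
    return "z-test" if population_sd_known else "t-test"

print(choose_test("categorical"))                           # chi-square test
print(choose_test("continuous", n_groups=3))                # ANOVA
print(choose_test("continuous", population_sd_known=True))  # z-test
print(choose_test("continuous", n_groups=2))                # t-test
```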

If you liked this article, you might also want to read Confidence Interval for Population Mean and Central Limit Theorem.


Subhro Kar is an Analyst with over five years of experience. As a programmer specializing in SAS (Statistical Analysis System), Subhro also offers tutorials and guides on how to approach the coding language. His website, 9to5sas, offers students and new programmers useful easy-to-grasp resources to help them understand the fundamentals of SAS. Through this website, he shares his passion for programming while giving back to up-and-coming programmers in the field. Subhro’s mission is to offer quality tips, tricks, and lessons that give SAS beginners the skills they need to succeed.