The term hypothesis testing is probably not new to you; the term significance testing is sometimes used instead. A statistical hypothesis is an assumption about a measurable characteristic of a population, known in statistical language as a parameter. Hypothesis testing is a formal procedure for rejecting or not rejecting statistical hypotheses. The steps in the process, in chronological order, are:

  1. Specify the null hypothesis. This is the default position that is assumed to be true unless there is sufficient evidence against it. In the simplest cases the null hypothesis states that a parameter is equal to zero (a two-sided test), or that it is greater than or equal to zero or less than or equal to zero (a one-sided test).
  2. Specify the significance level. This is a threshold, chosen in advance and somewhat arbitrary, against which the probability value (p-value) is compared in order to reject or not reject the null hypothesis. It is the probability of wrongly rejecting the null hypothesis when it is in fact true. Typical values are 0.05 and 0.01.
  3. Compute the probability value.
  4. Compare the probability value with the significance level. If the probability value is lower, you reject the null hypothesis. If it is higher, most scientists will consider your findings inconclusive. Note that failure to reject the null hypothesis does not constitute support for the null hypothesis; it means only that the data do not provide sufficient evidence to reject it. Some researchers prefer to interpret results on a continuous scale rather than as a binary decision (reject / do not reject): the lower the probability value, the stronger the evidence against the null hypothesis.
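
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a recommended analysis: the test used is an exact two-sided sign test (chosen because it needs nothing beyond the standard library), the null hypothesis is that the median paired difference is zero, and the `differences` values, the function names and the choice of significance levels are all hypothetical.

```python
from math import comb

def sign_test_p_value(differences):
    """Exact two-sided sign test p-value for H0: the median difference is zero."""
    nonzero = [d for d in differences if d != 0]  # zeros carry no sign information
    n = len(nonzero)
    k = sum(1 for d in nonzero if d > 0)          # observed number of positive signs
    # Under H0 each sign is positive with probability 0.5, so k ~ Binomial(n, 0.5).
    # Add up the probabilities of every count that is no more likely than the
    # observed one (integer arithmetic keeps the comparison exact).
    extreme = sum(comb(n, i) for i in range(n + 1) if comb(n, i) <= comb(n, k))
    return extreme / 2 ** n

def decide(p_value, alpha):
    """Step 4: compare the p-value with the significance level."""
    return "reject H0" if p_value < alpha else "do not reject H0"

# Step 1: H0 is that the median difference is zero (hypothetical paired differences).
differences = [1.2, 0.8, -0.3, 2.1, 0.5, 1.7, 0.9, -0.4, 1.1, 0.6, 1.4, 0.2]

# Step 2: choose the significance level before looking at the data.
alpha = 0.05

# Step 3: compute the probability value.
p_value = sign_test_p_value(differences)

# Step 4: compare and decide.
print(f"p-value = {p_value:.4f}; at alpha = {alpha}: {decide(p_value, alpha)}")
print(f"at alpha = 0.01: {decide(p_value, 0.01)}")
```

With these made-up numbers the decision changes between the two typical significance levels, which shows how much the binary outcome depends on an arbitrary threshold.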

There are some criticisms of statistical hypothesis testing that you should keep in mind when undertaking the procedure. Significance tests tell you the probability of getting a result at least as extreme as that observed, given that the null hypothesis is true. But what you want to know is how likely it is that the null hypothesis is true given the observed results. Dichotomizing the decision has also been criticised in research contexts where a reject or do-not-reject decision is not appropriate.
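
The gap between these two probabilities can be made concrete with Bayes' theorem. The sketch below is a rough illustration only: it considers the event "the test rejected the null hypothesis" rather than an exact p-value, and the prior probability that the null hypothesis is true (0.9), the significance level (0.05) and the power of the test (0.8) are assumed numbers, not values taken from any study.

```python
def prob_null_given_rejection(prior_null, alpha, power):
    """P(H0 is true | the test rejected H0), by Bayes' theorem."""
    null_true_and_rejected = prior_null * alpha         # false rejections
    null_false_and_rejected = (1 - prior_null) * power  # correct rejections
    return null_true_and_rejected / (null_true_and_rejected + null_false_and_rejected)

# Hypothetical inputs: H0 is true in 90% of comparable studies,
# the significance level is 0.05 and the test has power 0.8.
posterior = prob_null_given_rejection(prior_null=0.9, alpha=0.05, power=0.8)
print(f"P(H0 true | H0 rejected) = {posterior:.2f}")
```

Under these assumptions the null hypothesis is still true in roughly a third of the cases in which it was rejected at the 0.05 level, so the significance level should not be read as the probability that the null hypothesis is true.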

Confusion also arises between statistical and substantive significance. The substantive significance of a result depends on the size of the effect observed or estimated and on whether it can be replicated. Statistical significance does not necessarily mean that the observed effect has contextual significance, e.g. biological significance, but the two concepts are often confused. In addition, the use of statistical tests of significance almost always requires making distributional assumptions about the test statistic in question, which may be violated, and relies on strictly random sampling.
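
A small numerical sketch shows how statistical significance can come apart from the size of an effect. It assumes a two-sided z-test of a mean against zero with a known standard deviation, which is itself a distributional assumption of the kind discussed above; the effect size, standard deviation and sample sizes are hypothetical.

```python
from math import erfc, sqrt

def two_sided_z_p_value(mean_diff, sd, n):
    """Two-sided p-value for H0: mean = 0, assuming a known sd and normality."""
    z = mean_diff / (sd / sqrt(n))
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)); erfc stays accurate in the far tail.
    return erfc(abs(z) / sqrt(2))

effect = 0.02  # hypothetical effect: one fiftieth of a standard deviation
sd = 1.0

p_small = two_sided_z_p_value(effect, sd, n=100)
p_large = two_sided_z_p_value(effect, sd, n=100_000)

print(f"n = 100:     p = {p_small:.3g}")
print(f"n = 100000:  p = {p_large:.3g}")
```

The same tiny effect is far from statistically significant in the small sample and overwhelmingly significant in the large one, although its practical importance has not changed.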

As a guide, to address some of the above issues, it is always advisable to: report parameter estimates with an indication of their variability, such as standard errors or confidence intervals; avoid using the term significance loosely when referring to statistical significance; and report the methods and results in such a way that they can be usefully interpreted even without the significance tests.
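
As a minimal sketch of the first recommendation, the following computes an estimate of a mean together with its standard error and an approximate 95% confidence interval. The data are hypothetical, and the interval uses a normal approximation because the standard library has no t distribution; for a sample this small an interval based on a t quantile would be somewhat wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def estimate_with_interval(sample, confidence=0.95):
    """Return (estimate, standard error, (lower, upper)) for the sample mean."""
    n = len(sample)
    estimate = mean(sample)
    standard_error = stdev(sample) / sqrt(n)      # sample sd divided by sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return estimate, standard_error, (estimate - z * standard_error,
                                      estimate + z * standard_error)

# Hypothetical measurements.
sample = [1.2, 0.8, -0.3, 2.1, 0.5, 1.7, 0.9, -0.4, 1.1, 0.6, 1.4, 0.2]

estimate, standard_error, (lower, upper) = estimate_with_interval(sample)
print(f"estimate = {estimate:.3f}, SE = {standard_error:.3f}, "
      f"95% CI = ({lower:.3f}, {upper:.3f})")
```

Reported this way, a reader can judge both the size of the effect and the precision of the estimate without relying on a significance test.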