What is a Hypothesis Statement?

Milena Afeworki
3 min read · Jun 15, 2021

A statistical hypothesis is an assumption about a population parameter. This assumption may or may not be correct, so we use hypothesis testing to assess it. The data may come from a large population or from a data-generating process. Although the ideal way to decide whether a statistical hypothesis is correct would be to examine the whole population, that is frequently impractical, so we normally take a random sample from the population and examine that instead.

Types of hypothesis:

There are two types of hypotheses, and they must be mutually exclusive: only one of them can be true.

  • Null hypothesis (H0): usually the hypothesis that there is no effect or no difference, that is, that the event won't happen.
  • Alternative hypothesis (Ha): the hypothesis that there is an effect or a difference, that is, that the event will happen.

Why Do We Need Hypothesis Testing?

Suppose a real estate agency, XYZ, wants to invest in single-family houses that have more than 4 bedrooms for a given square footage of living space. In this situation, it would use hypothesis testing to decide whether more bedrooms per unit of living area means higher profits.

Here, the claim that the number of bedrooms has no effect on profit is taken as the null hypothesis, and the claim that it does have an effect is taken as the alternative hypothesis. By following the hypothesis testing process, the agency can judge whether the data support the investment.

How Do We Carry Out Hypothesis Testing?

  • Clearly state the two hypotheses so that only one of them can be correct.
  • Decide on an analysis plan that sets out how the data will be assessed.
  • Carry out the plan and analyze the sample data.
  • Examine the outcome and either reject or fail to reject the null hypothesis.
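
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the profit figures are made up, the two-sided permutation test is one possible choice of analysis plan (the steps do not prescribe a particular test), and the function name `permutation_test` is hypothetical.

```python
import random


def average(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference (mean of a minus mean of b)
    and an estimated p-value.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = average(group_a) - average(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels carry no information, so shuffle them.
        rng.shuffle(pooled)
        diff = average(pooled[:n_a]) - average(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Step 1: state the hypotheses.
#   H0: mean profit is the same for both groups of houses.
#   Ha: mean profit differs between the groups.
# Step 2: fix the plan before looking at the data.
ALPHA = 0.05

# Hypothetical profits in thousands of dollars (invented for illustration).
many_bedrooms = [52, 61, 58, 70, 66, 63, 59, 68]
few_bedrooms = [48, 50, 55, 47, 53, 51, 49, 54]

# Step 3: carry out the plan on the sample.
observed, p_value = permutation_test(many_bedrooms, few_bedrooms)

# Step 4: reject or fail to reject the null hypothesis.
decision = "reject H0" if p_value < ALPHA else "fail to reject H0"
print(f"difference in means = {observed}, p = {p_value:.4f}: {decision}")
```

Fixing the significance level before running the test keeps the threshold from being chosen after the result is already known.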

Rejecting the null hypothesis

Significance Level and Rejection Region for Hypothesis

P-values

After analyzing our data, we want to know whether our results reflect a real effect or could have arisen just by chance. Statistical significance is usually assessed with a p-value, a number between 0 and 1 that gives the probability of seeing results at least as extreme as ours if the null hypothesis were true. Generally, we use 0.05 as the threshold for significance.
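
Written a little more formally, and only as a sketch: the symbols below are notation introduced here for illustration, with T standing for a test statistic, t_obs for its value in our sample, and α for the significance threshold. For an upper-tailed test, the p-value and the decision rule would read:

```latex
p = P\left(T \ge t_{\text{obs}} \mid H_0\right),
\qquad \text{reject } H_0 \text{ if } p < \alpha, \quad \alpha = 0.05
```

The probability is computed assuming the null hypothesis is true, so a small p-value says the data would be unusual under the null hypothesis; it does not give the probability that either hypothesis is true.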

  • If the p-value is less than 0.05 (p < 0.05), we reject the null hypothesis and conclude that a significant difference does exist in our outcome. But this does not mean that there is a 95% probability that the alternative hypothesis is true. The p-value is conditional upon the null hypothesis being true but is unrelated to the truth or falsity of the alternative hypothesis.
  • If the p-value is larger than 0.05, we fail to reject the null hypothesis and cannot conclude that a significant difference exists.
image from (www.medical-institution.com/what-is-p-value-video-lecture)

Taking the real estate agency example, let's say we have a data set of property sales in a county. In this data set, we have different features of each house, such as price, number of bedrooms, number of bathrooms, sq. ft area of living space, etc.

For the null hypothesis, we will say that the number of bedrooms has no significant effect on the price of a property.

And for the Alternative hypothesis, we will say the number of bedrooms has a significant effect on the price of a property.

Now suppose that, after following the steps for hypothesis testing on our data, we find the p-value for bedrooms to be greater than 0.05 (say p = 0.07). We would then fail to reject the null hypothesis: if the number of bedrooms truly had no effect on price, there would still be a 7% chance of seeing an effect at least as large as the one in our data.

However, if our p-value turns out to be less than 0.05 (say p=0.02), then we would reject the null hypothesis and conclude that there is a significant effect of the number of bedrooms on the price of a property.
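
The two outcomes in this example can be expressed as a small decision rule. This is a minimal sketch: the function name `decide` is hypothetical, and 0.07 and 0.02 are the illustrative p-values from the text, not results computed from real data.

```python
def decide(p_value, alpha=0.05):
    """Return the test decision for a p-value at significance level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    # Reject only when the p-value falls strictly below the threshold.
    return "reject H0" if p_value < alpha else "fail to reject H0"


# The two illustrative p-values for the bedrooms feature.
for p in (0.07, 0.02):
    print(f"p = {p}: {decide(p)}")
# prints:
# p = 0.07: fail to reject H0
# p = 0.02: reject H0
```

Note that "fail to reject" is deliberately not phrased as "accept": a p-value of 0.07 does not show that bedrooms have no effect, only that this sample does not give strong enough evidence of one.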


Written by Milena Afeworki

Data Scientist and Former Structural Engineer for a Consulting company. Always desire to learn something new!