
Roman's Data Science: How to monetize your data

Hypotheses typically estimate a given distribution parameter such as the mean or median. This is then used to build a histogram.
Roman Zykov • Roman's Data Science: How to monetize your data
One problem with all of these tests is that they are distribution-specific. For example, the Student’s t-test and the z-test require normally distributed data.
Roman Zykov • Roman's Data Science: How to monetize your data
Pearson’s chi-squared test – for categorical variables and all kinds of binomial tests. This is useful for calculating conversions (for example visitors to buyers) when you need a binomial test, such as whether a visitor to an online store made a purchase or not.
Roman Zykov • Roman's Data Science: How to monetize your data
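As an illustration of the conversion case, here is a minimal sketch of Pearson's chi-squared test on a 2x2 table (bought / did not buy, for two groups of visitors). The function name `chi2_2x2` and the visitor counts are hypothetical and not taken from the book. No continuity correction is applied, and the p-value uses the closed form for one degree of freedom.

```python
import math


def chi2_2x2(buyers_a, visitors_a, buyers_b, visitors_b):
    """Pearson's chi-squared test (no continuity correction) on a 2x2 table."""
    # Rows are groups, columns are "bought" and "did not buy".
    observed = [
        [buyers_a, visitors_a - buyers_a],
        [buyers_b, visitors_b - buyers_b],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if conversion did not depend on the group.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical numbers: 100 of 1000 visitors bought in group A, 130 of 1000 in group B.
chi2, p_value = chi2_2x2(buyers_a=100, visitors_a=1000, buyers_b=130, visitors_b=1000)
print(f"chi2 = {chi2:.3f}, p-value = {p_value:.4f}")
```

The test only asks whether conversion differs between the groups; it says nothing about the direction of the difference.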
I believe it is better to use one-sided hypotheses. After testing an idea, we try to improve the metric. Here, we are interested in whether it has improved or not (Hypothesis H1).
Roman Zykov • Roman's Data Science: How to monetize your data
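To make the one-sided idea concrete, the sketch below compares a control and a test group with a two-proportion z-test and reports both the one-sided and the two-sided p-value. This is an assumption about how such a check might be set up, not the author's own procedure. The function name and the counts are hypothetical, and the normal approximation it relies on is only reasonable for large samples.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_proportion_test(buyers_control, visitors_control, buyers_test, visitors_test):
    """z-test for H1: the test group's conversion is higher than the control's."""
    p_control = buyers_control / visitors_control
    p_test = buyers_test / visitors_test
    # Pooled conversion under H0 (no difference between the groups).
    pooled = (buyers_control + buyers_test) / (visitors_control + visitors_test)
    se = sqrt(pooled * (1 - pooled) * (1 / visitors_control + 1 / visitors_test))
    z = (p_test - p_control) / se

    normal = NormalDist()
    p_one_sided = 1 - normal.cdf(z)             # only "improved" counts as evidence
    p_two_sided = 2 * (1 - normal.cdf(abs(z)))  # a change in either direction counts
    return z, p_one_sided, p_two_sided


# Hypothetical numbers: control converts 100 of 1000, test converts 130 of 1000.
z, p_one, p_two = one_sided_proportion_test(100, 1000, 130, 1000)
print(f"z = {z:.3f}, one-sided p = {p_one:.4f}, two-sided p = {p_two:.4f}")
```

When the observed effect points in the hypothesized direction, the one-sided p-value is half the two-sided one, so the same data clear a given significance threshold more easily.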
Nonparametric methods are better suited to small samples. There’s no point using nonparametric statistics if you’ve got a lot of data (for example n > 100).
Roman Zykov • Roman's Data Science: How to monetize your data
The parameter in the general population is true, and the sample parameter is an estimate of the true parameter.
Roman Zykov • Roman's Data Science: How to monetize your data
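The distinction between the true parameter and its estimate can be shown with a small simulation. Everything here is assumed for illustration: a synthetic "general population" of 100,000 values, its mean as the true parameter, and two random samples of different sizes.

```python
import random
from statistics import mean

rng = random.Random(7)

# Hypothetical general population: 100,000 values drawn around 100.
population = [rng.gauss(100, 15) for _ in range(100_000)]
true_mean = mean(population)  # the true parameter, normally unknown

# In practice we only see a sample and compute an estimate from it.
small_sample = rng.sample(population, 10)
large_sample = rng.sample(population, 1000)
estimate_small = mean(small_sample)
estimate_large = mean(large_sample)

print(f"true mean          = {true_mean:.2f}")
print(f"estimate (n=10)    = {estimate_small:.2f}")
print(f"estimate (n=1000)  = {estimate_large:.2f}")
```

Larger samples tend to give estimates closer to the true parameter, although any single sample can still miss.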
When I hear the word “distribution,” I imagine a histogram showing the frequency of occurrences of a given event.
Roman Zykov • Roman's Data Science: How to monetize your data
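A histogram in this sense is just a count of how often each value occurs. The sketch below builds one in plain text for made-up data (the sum of two dice, rolled 1000 times); the data and the bar scaling are assumptions chosen for illustration.

```python
import random
from collections import Counter

rng = random.Random(42)

# Hypothetical events: the sum of two six-sided dice, observed 1000 times.
rolls = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(1000)]

# Frequency of occurrence of each value.
counts = Counter(rolls)

for value in sorted(counts):
    bar = "#" * (counts[value] // 5)  # one '#' per 5 occurrences
    print(f"{value:>2} | {bar} {counts[value]}")
```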
The go-to alternative for non-normal data is nonparametric tests.
Roman Zykov • Roman's Data Science: How to monetize your data
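The quote does not name a specific nonparametric test. As one possible example, here is a sketch of a one-sided permutation test on the difference in medians, which makes no assumption about the shape of the distribution. The function name, the sample values, and the number of permutations are all hypothetical.

```python
import random
from statistics import median


def permutation_test(sample_a, sample_b, n_permutations=10_000, seed=0):
    """One-sided permutation test for H1: median(sample_b) > median(sample_a)."""
    rng = random.Random(seed)
    observed = median(sample_b) - median(sample_a)
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are interchangeable, so reshuffle them.
        rng.shuffle(pooled)
        diff = median(pooled[n_a:]) - median(pooled[:n_a])
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical, skewed order values for two small groups (clearly non-normal).
control = [12, 15, 9, 40, 11, 13, 10, 14, 95, 12]
variant = [18, 22, 17, 60, 19, 21, 16, 25, 20, 130]

observed_diff, p_value = permutation_test(control, variant)
print(f"difference in medians = {observed_diff}, p-value = {p_value:.4f}")
```

Rank-based tests such as Mann-Whitney U are another common choice for the same situation.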
In Fisher statistics, the p-value is a universal number that is understandable to statisticians and allows them to reject the null hypothesis. The p-value was not a thing before Fisher.