
Roman's Data Science: How to monetize your data

Hypotheses typically estimate a given distribution parameter such as the mean or median. This is then used to build a histogram.
Roman Zykov • Roman's Data Science: How to monetize your data
In A/B tests, we work with two groups – a test group and a control group. Both need their own bootstrap.
Roman Zykov • Roman's Data Science: How to monetize your data
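
As a rough illustration of bootstrapping both groups, the sketch below resamples the control and test groups separately and derives a 95% percentile interval for the difference in means. The function names, the sample values, and the choice of 2,000 resamples are assumptions made for this example, not taken from the book.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples, rng):
    """Resample with replacement and return the mean of each resample."""
    size = len(sample)
    return [
        statistics.fmean(rng.choices(sample, k=size))
        for _ in range(n_resamples)
    ]


def bootstrap_ab(control, test, n_resamples=2000, seed=0):
    """95% percentile interval for mean(test) - mean(control).

    Each group gets its own bootstrap; the resampled means are then
    paired up to form a distribution of differences.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    control_means = bootstrap_means(control, n_resamples, rng)
    test_means = bootstrap_means(test, n_resamples, rng)
    diffs = sorted(t - c for t, c in zip(test_means, control_means))
    low = diffs[int(0.025 * n_resamples)]
    high = diffs[int(0.975 * n_resamples) - 1]
    return low, high


# Hypothetical metric values for the two groups.
control = [10.1, 9.8, 10.4, 10.0, 9.9, 10.2, 10.3, 9.7]
test = [10.9, 11.2, 10.7, 11.0, 11.4, 10.8, 11.1, 10.6]

low, high = bootstrap_ab(control, test)
print(f"95% interval for the difference in means: [{low:.2f}, {high:.2f}]")
```

If the interval excludes zero, the data are consistent with a real difference between the groups; the list of differences can also be plotted as a histogram.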
Z-test – for checking the mean of a normally distributed quantity. Student’s t-test – the same as a z-test, but for small samples (n < 100).
Roman Zykov • Roman's Data Science: How to monetize your data
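
To make the two tests concrete, here is a minimal sketch of the one-sample versions using only the Python standard library. It assumes a known population standard deviation for the z-test and uses the sample standard deviation for the t statistic; the function names are hypothetical. The standard library has no Student's t distribution, so the sketch stops at the t statistic rather than a p-value.

```python
import math
import statistics
from statistics import NormalDist


def z_test_mean(sample, mu0, sigma):
    """Two-sided one-sample z-test, assuming the population sigma is known."""
    n = len(sample)
    z = (statistics.fmean(sample) - mu0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def t_statistic(sample, mu0):
    """One-sample t statistic; sigma is estimated from the sample itself."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1)
    return (statistics.fmean(sample) - mu0) / (s / math.sqrt(n))


# Hypothetical measurements tested against a hypothesized mean of 10.
data = [10.2, 9.9, 10.4, 10.1, 10.3, 9.8, 10.5, 10.0]
z, p = z_test_mean(data, mu0=10.0, sigma=0.25)
print(f"z = {z:.3f}, p = {p:.3f}")
print(f"t = {t_statistic(data, mu0=10.0):.3f} with {len(data) - 1} degrees of freedom")
```

The t statistic would then be compared against a Student's t table, or a library routine, with n - 1 degrees of freedom.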
Insight means the capacity to gain an understanding of the reasons for something occurring. This is precisely what analysts want to achieve.
Roman Zykov • Roman's Data Science: How to monetize your data
The first thing to be investigated is integration, and this brings us to our first hypothesis: Is the data from the online store transmitted to us correctly? Sixty to seventy percent of problems are typically dealt with at this stage.
Roman Zykov • Roman's Data Science: How to monetize your data
When I hear the word “distribution,” I imagine a histogram showing the frequency of occurrences of a given event.
Roman Zykov • Roman's Data Science: How to monetize your data
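
The histogram view of a distribution can be sketched in a few lines: group values into fixed-width bins and count how many fall into each. The bin width and the sample values below are assumptions chosen for illustration.

```python
import math
from collections import Counter


def histogram(values, bin_width):
    """Map each bin's lower edge to the number of values falling in that bin."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    return {b * bin_width: counts[b] for b in sorted(counts)}


# Hypothetical order values.
orders = [12, 15, 18, 22, 25, 25, 31, 47]

for edge, count in histogram(orders, bin_width=10).items():
    # One '#' per observation gives a quick text histogram.
    print(f"{edge:>3}-{edge + 9:<3} {'#' * count}")
```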
One problem with all of these tests is that they are distribution-specific. For example, the Student’s t-test and the z-test require normally distributed data.
Roman Zykov • Roman's Data Science: How to monetize your data
In data analysis, survival bias is taking the known into account while neglecting the unknown (which nevertheless exists).
Roman Zykov • Roman's Data Science: How to monetize your data
Nonparametric methods are better suited to small samples. There’s no point using nonparametric statistics if you’ve got a lot of data (for example, n > 100).
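
One widely used nonparametric test is the Mann-Whitney U test, which compares two samples by rank order instead of assuming a normal distribution. The book excerpt does not name a specific test, so treat this as one possible example. The sketch computes only the U statistic by direct pair counting, which is practical for the small samples discussed here; the data are hypothetical.

```python
def mann_whitney_u(x, y):
    """U statistic for x: pairs (x_i, y_j) with x_i > y_j; ties count as 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Hypothetical small samples.
group_a = [3.1, 4.7, 2.8, 5.0, 3.9]
group_b = [5.6, 6.1, 4.9, 7.2, 5.8]

u_a = mann_whitney_u(group_a, group_b)
u_b = mann_whitney_u(group_b, group_a)
print(f"U_a = {u_a}, U_b = {u_b}, total pairs = {len(group_a) * len(group_b)}")
```

The smaller of the two U values is compared against a critical value from a Mann-Whitney table for the given sample sizes. A value near zero means one group almost always ranks above the other.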