
Roman's Data Science: How to monetize your data

Insight means the capacity to gain an understanding of the reasons for something occurring. This is precisely what analysts want to achieve.
Roman Zykov • Roman's Data Science: How to monetize your data
Nine out of ten hypotheses don’t pan out. But you have no idea that a hypothesis will not produce the desired result until you are well into the testing process. I believe that it is best to kill a hypothesis as early as possible – as soon as the first sign that the idea won’t take off presents itself.
Roman Zykov • Roman's Data Science: How to monetize your data
All measurements contain errors. This is a fact, get over it. Errors themselves should be noted and not considered errors as such (I’ll explain how we can monitor this in a later chapter).
Roman Zykov • Roman's Data Science: How to monetize your data
Kozyrkov implores us to “always evaluate decision quality based only on what was known at the time the decision was made.”
Roman Zykov • Roman's Data Science: How to monetize your data
I believe it is better to use one-sided hypotheses. After testing an idea, we try to improve the metric. Here, we are interested in whether it has improved or not (Hypothesis H1).
Roman Zykov • Roman's Data Science: How to monetize your data
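To make the one-sided idea concrete, here is a minimal sketch of a one-sided test for an A/B experiment on a conversion rate. The excerpt does not name a specific test, so the pooled two-proportion z-test used here is an assumption, as are the function name `one_sided_proportion_test` and the visitor and conversion counts, which are made up purely for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_proportion_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test for H1: rate_b > rate_a.

    Returns (z, p_value). The p-value is one-sided: it is the probability
    of seeing a z-statistic at least this large if H0 (no improvement) holds.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Under H0 both groups share one rate, so the variance uses the pooled rate.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = 1 - NormalDist().cdf(z)  # upper tail only
    return z, p_value


# Hypothetical counts: 200 of 5000 visitors convert in control (A),
# 250 of 5000 convert in the variant (B).
z, p_one_sided = one_sided_proportion_test(200, 5000, 250, 5000)

# For comparison, the two-sided p-value for the same z-statistic.
p_two_sided = 2 * min(p_one_sided, 1 - p_one_sided)

print(f"z = {z:.3f}")
print(f"one-sided p = {p_one_sided:.4f}, two-sided p = {p_two_sided:.4f}")
```

When the variant really does look better (z > 0), the one-sided p-value is half the two-sided one, which is why the choice between them should be made before the experiment starts rather than after looking at the data.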
In Fisher statistics, the p-value is a universal number that is understandable to statisticians and allows them to reject the null hypothesis. The p-value was not a thing before Fisher.
Roman Zykov • Roman's Data Science: How to monetize your data
In his 1925 monograph Statistical Methods for Research Workers, Ronald Fisher (the founder of hypothesis testing) outlined concepts such as the statistical significance criterion, the rules for testing statistical hypotheses, analysis of variance, and experiment planning. This work defined our current approach to experiment planning.
Roman Zykov • Roman's Data Science: How to monetize your data
Nonparametric methods are better suited to small samples. There’s no point using nonparametric statistics if you’ve got a lot of data (for example, n > 100).
Roman Zykov • Roman's Data Science: How to monetize your data
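As an illustration of a nonparametric method on a small sample, the sketch below runs an exact one-sided permutation test on the difference in means. The excerpt does not say which nonparametric method the author has in mind, so the permutation test is a stand-in chosen because it needs only the standard library. The function name `permutation_test_greater` and the measurements are hypothetical.

```python
from itertools import combinations


def permutation_test_greater(control, treatment):
    """Exact one-sided permutation test for H1: mean(treatment) > mean(control).

    Enumerates every way of relabelling the pooled observations and returns
    the share of relabellings whose mean difference is at least as large as
    the observed one. No distributional assumption is made.
    """
    pooled = list(control) + list(treatment)
    n, k = len(pooled), len(treatment)
    total = sum(pooled)
    observed = sum(treatment) / k - sum(control) / len(control)

    at_least_as_large = 0
    n_splits = 0
    for idx in combinations(range(n), k):
        t_sum = sum(pooled[i] for i in idx)
        diff = t_sum / k - (total - t_sum) / (n - k)
        # Small tolerance so floating-point noise does not drop exact ties.
        if diff >= observed - 1e-12:
            at_least_as_large += 1
        n_splits += 1
    return at_least_as_large / n_splits


# Hypothetical small samples (six measurements per group).
control = [12.1, 11.8, 12.5, 11.9, 12.3, 12.0]
treatment = [12.6, 12.9, 12.4, 13.1, 12.7, 12.8]

p_value = permutation_test_greater(control, treatment)
print(f"one-sided permutation p-value = {p_value:.4f}")
```

Full enumeration is only practical for small groups, because the number of relabellings grows combinatorially. With a lot of data, a parametric test such as the z-test is far cheaper to compute.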
Classical machine learning can be divided into three types: supervised learning, unsupervised learning, and reinforcement learning.