Machine Learning for Absolute Beginners: A Plain English Introduction (First Edition)
Oliver Theobaldamazon.com
Machine Learning for Absolute Beginners: A Plain English Introduction (First Edition)
As a supervised learning technique, clustering can be utilized to classify new data points into existing clusters through k-nearest neighbors (k-NN).
While it will depend on your exact dataset, 100-150 decision trees is often a recommended starting point.
Outside of market research, clustering can also be applied to various other scenarios, including pattern recognition and image processing.
One helpful approach to analyzing data is to identify clusters of data that share similar attributes.
Given its strength in binary classification, logistic regression is commonly used in fraud detection, disease diagnosis, emergency detection, loan default
Logistic regression adopts the sigmoid function to analyze data and predict discrete classes that exist in a dataset.
You will know if your model is accurate when the error rate between the training and test dataset is low. This means that the model has learned and understood the underlying patterns and trends found in the data.
k-means is an unsupervised learning algorithm and does not rely on known classes to classify new data points.
your company may wish to examine a subset of customers that purchase at the same time of year and discern what factors influence their purchasing behavior.