
Feature Engineering for Machine Learning

A feature is a numeric representation of raw data. There are many ways to turn raw data into numeric measurements, which is why features can end up looking like a lot of things. Naturally, …
Alice Zheng, Amanda Casari • Feature Engineering for Machine Learning
.implementation
Each piece of data provides a small window into a limited aspect of reality. The collection of all of these observations gives us a picture of the whole. But the picture is messy because it is composed of a thousand little pieces, and there’s always measurement noise and missing pieces.
Alice Zheng, Amanda Casari • Feature Engineering for Machine Learning
.implementation .modelthinking
Practitioners agree that the vast majority of time in building a machine learning pipeline is spent on feature engineering and data cleaning. Yet, despite its importance, the topic is rarely discussed on its own.
Alice Zheng, Amanda Casari • Feature Engineering for Machine Learning
.implementation
Next, consider the scale of the features. What are the largest and the smallest values? Do they span several orders of magnitude? Models that are smooth functions of input features are sensitive to the scale of the input.
Alice Zheng, Amanda Casari • Feature Engineering for Machine Learning
.implementation
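The scale questions in the highlight above can be checked directly. What follows is a minimal sketch, not the authors' code: the helper names `scale_report` and `min_max_scale`, the sample `views` values, and the choice of min-max scaling as the remedy are all assumptions made for illustration.

```python
import math


def scale_report(values):
    """Summarize the range of a numeric feature (hypothetical helper)."""
    lo, hi = min(values), max(values)
    # Orders of magnitude spanned by the nonzero absolute values.
    nonzero = [abs(v) for v in values if v != 0]
    span = math.log10(max(nonzero) / min(nonzero)) if nonzero else 0.0
    return {"min": lo, "max": hi, "orders_of_magnitude": span}


def min_max_scale(values):
    """Rescale values linearly into [0, 1] (one common choice, assumed here)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no scale information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up feature values, e.g. page-view counts per article.
views = [3, 40, 500, 12000]

report = scale_report(views)   # smallest value, largest value, orders spanned
scaled = min_max_scale(views)  # same ordering, now bounded in [0, 1]
```

If the report shows a span of several orders of magnitude, a model that is a smooth function of its inputs will likely be dominated by the largest values, and rescaling (or a log transform) is worth trying before fitting.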
The first sanity check for numeric data is whether the magnitude matters. Do …
Alice Zheng, Amanda Casari • Feature Engineering for Machine Learning
.implementation
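The highlight above is cut off, so the following is only one possible reading of "whether the magnitude matters": when just the presence of an event is informative, a count can be collapsed to a 0/1 indicator. The `binarize` helper, its `threshold` parameter, and the `listen_counts` data are hypothetical.

```python
def binarize(values, threshold=0):
    """Map each value to 1 if it exceeds the threshold, else 0 (hypothetical helper)."""
    return [1 if v > threshold else 0 for v in values]


# Made-up counts of how often a user played each song.
listen_counts = [0, 1, 250, 3, 0]

# Keeps only whether the user listened at all and discards how often.
listened = binarize(listen_counts)
```

Whether discarding the magnitude helps depends on the task; if the raw count itself is predictive, it should be kept, possibly after rescaling.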
Feature engineering is the process of formulating the most appropriate features given the data, the model, and the task.
Alice Zheng, Amanda Casari • Feature Engineering for Machine Learning
.implementation
Mastery is about knowing precisely how something is done, having an intuition for the underlying principles, and integrating it into one’s existing web of knowledge. One does not become a master of something by simply reading a book, though a good book can open new doors.
Alice Zheng, Amanda Casari • Feature Engineering for Machine Learning
.implementation .modelthinking mastery is contextual
Feature engineering is the act of extracting features from raw data and transforming them into formats that are suitable for the machine learning model.
Alice Zheng, Amanda Casari • Feature Engineering for Machine Learning
.modelthinking