AI Product Manager Glossary
Quantization
A method for shrinking a model and making it run faster. It converts the model’s internal numbers from high precision (such as 32-bit floats) to lower precision (such as 8-bit integers). This reduces memory usage and speeds up inference, typically with only a small drop in accuracy.
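The conversion can be illustrated with a minimal sketch of symmetric 8-bit quantization. It assumes one shared scale for the whole list of weights; the function names and sample weights are hypothetical, and real toolkits use more elaborate schemes.

```python
def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    # The scale is the float value represented by one integer step.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [round(v / scale) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights, standing in for a slice of a model's parameters.
weights = [0.12, -0.5, 0.33, 1.27, -1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# The rounding error per weight is at most half of one integer step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Each weight now fits in one byte instead of four, and the rounding error is bounded by half the scale, which is why the accuracy loss tends to be small.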
Time to First Token (TTFT)
The time between sending a request and the first token of the response appearing. Even if the full answer takes longer, a fast TTFT makes the system feel more responsive to the user.
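A rough way to measure TTFT is to time a streaming response until its first token arrives. The sketch below assumes the response is available as a Python iterator of tokens; `fake_stream` is a hypothetical stand-in for a real streaming client, with sleeps simulating model latency.

```python
import time


def measure_ttft(token_stream):
    """Return (time to first token, total time, tokens) for a token iterator."""
    start = time.perf_counter()
    first_token_at = None
    tokens = []
    for token in token_stream:
        if first_token_at is None:
            first_token_at = time.perf_counter()
        tokens.append(token)
    end = time.perf_counter()
    ttft = None if first_token_at is None else first_token_at - start
    return ttft, end - start, tokens


def fake_stream():
    """Hypothetical stream: a delay before the first token, then faster tokens."""
    time.sleep(0.05)
    yield "Hello"
    for token in [",", " world"]:
        time.sleep(0.02)
        yield token


ttft, total, tokens = measure_ttft(fake_stream())
print(f"TTFT: {ttft:.3f}s, total: {total:.3f}s")
```

The gap between the two numbers is the point of the metric: the user sees output after the TTFT, well before the full response finishes.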
Inference
The process of using a trained model to make predictions on new data. In the case of large language models, this means generating a response based on the input prompt. It’s what happens when you “ask” the model something.
Failure Modes
Coherent, non-overlapping categories of errors that emerge from analyzing LLM traces (e.g., hallucination, incorrect format, or missed instruction). Each failure mode is binary (a trace either exhibits it or does not), which makes it easy to recognize and a sound basis for targeted metrics.
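Because each failure mode is a yes/no label on a trace, turning the labels into metrics is simple counting. The sketch below assumes traces have already been hand-labeled; the trace records and field names are hypothetical.

```python
from collections import Counter

# Hypothetical labeled traces: one boolean per failure mode per trace.
traces = [
    {"id": 1, "failures": {"hallucination": True, "incorrect_format": False, "missed_instruction": False}},
    {"id": 2, "failures": {"hallucination": False, "incorrect_format": True, "missed_instruction": False}},
    {"id": 3, "failures": {"hallucination": True, "incorrect_format": False, "missed_instruction": False}},
    {"id": 4, "failures": {"hallucination": False, "incorrect_format": False, "missed_instruction": False}},
]


def failure_rates(traces):
    """Return the share of traces that exhibit each failure mode."""
    counts = Counter()
    for trace in traces:
        for mode, failed in trace["failures"].items():
            counts[mode] += int(failed)
    return {mode: counts[mode] / len(traces) for mode in counts}


rates = failure_rates(traces)
print(rates)
```

Each rate can then be tracked as its own metric, so that a fix aimed at one failure mode can be checked against that mode alone.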
Bottom-Up vs. Top-Down Analysis
Two approaches to defining AI metrics. Bottom-up analysis identifies application-specific failure modes directly from the data. Top-down analysis applies generic metrics (like hallucination or toxicity) that may miss domain-specific nuances.
AI Product Manager Glossary
A framework of automated and human-in-the-loop evaluations (e.g., code-based scripts, LLM-as-judge checks) that measures whether an AI product “actually works,” supporting rapid learning cycles and alignment with user and business goals.
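As an illustration of the code-based end of such a framework, the sketch below checks whether responses are valid JSON objects with required keys and reports a pass rate. The sample responses and required keys are hypothetical; an LLM-as-judge check would fill the same role for criteria that a script cannot test.

```python
import json


def check_json_format(response, required_keys):
    """Pass only if the response is a JSON object containing every required key."""
    try:
        data = json.loads(response)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and all(key in data for key in required_keys)


# Hypothetical model outputs collected from an application.
responses = [
    '{"answer": "42", "source": "doc1"}',
    "The answer is 42.",
    '{"answer": "7"}',
]

results = [check_json_format(r, ["answer", "source"]) for r in responses]
pass_rate = sum(results) / len(results)
print(results, pass_rate)
```

A check like this is cheap enough to run on every change, which is what makes rapid learning cycles practical.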
Neural Networks
Computational models inspired by the human brain, consisting of layers of interconnected nodes ("neurons") that process data to learn patterns.
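A minimal sketch of the "layers of interconnected nodes" idea: each neuron computes a weighted sum of its inputs, adds a bias, and applies a nonlinearity. The weights below are hypothetical hand-picked values; in a real network they are learned from data during training.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases):
    """Compute one layer: each row of weights defines one neuron."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
hidden_weights = [[0.5, -0.2], [0.1, 0.8], [-0.7, 0.3]]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [[1.0, -1.0, 0.5]]
output_biases = [0.0]

inputs = [1.0, 2.0]
hidden = layer(inputs, hidden_weights, hidden_biases)
output = layer(hidden, output_weights, output_biases)
print(hidden, output)
```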
Fine-Tuning
The process of taking a pre-trained model and continuing its training on a smaller, task-specific dataset. This adapts the model's behavior to specialized applications, such as following a specific style or output format.