Nicolay Gerold
@nicolaygerold
@nicolaygerold
LLM Training and Tools
Intuition : Prompt tokens add very little latency to completion calls. Time to generate completion tokens is much longer, as tokens are generated one at a time. Longer generation lengths will accumulate latency due to generation required for each token.