This is the paper for Voxtral
Last week, Mistral released an open-sourced voice model that could rival paid voice AI.
Voxtral comes in 24B and 3B versions, both run locally.
Both use a Whisper encoder to turn raw waveforms into embeddings, a 4x downsampling adapt... See more