We are excited to release the first version of Yasa-1, a multimodal language assistant with visual and auditory sensors that can take actions via code execution.
We built Yasa-1 from scratch: pretraining the base models, aligning them, and heavily optimizing both our training and serving infrastructure.