GitHub - facebookresearch/multimodal at a33a8b888a542a4578b16972aecd072eff02c1a6

HOLY SHIT, Microsoft dropped an open-source multimodal Phi 4 (supports audio, vision, and text) - MIT licensed! 🔥
> Beats Gemini 2.0 Flash, GPT-4o, Whisper, SeamlessM4T v2
> Models on the Hugging Face Hub, integrated with Transformers!
Phi-4-Multimodal:...
