how to do this, easy mode:
1. Flux Kontext: upload your image and prompt the scene (https://t.co/as3Se7Zsgw)
2. Dump it into Veo 3 with prompts for animations
3. ElevenLabs audio-to-audio with whoever's voice you need
Kontext will maintain consistency reasonably...
Before we begin, here's a quick demo of what we're building
Tech stack:
- @Cartesia_AI for SOTA text-to-speech
- @AssemblyAI for speech-to-text
- @LlamaIndex to power RAG
- @livekit for orchestration
Let's go! 🚀
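
Rough sketch of how the pieces in that stack could fit together for one conversational turn: speech-to-text, then retrieval, then an answer, then text-to-speech. Everything named here (`VoiceRagAgent`, `handle_turn`, the `fake_*` functions, `DOCS`) is a hypothetical stand-in, not the real Cartesia, AssemblyAI, LlamaIndex, or LiveKit API. In a real build you would wrap each SDK to match these signatures, and the orchestration layer would presumably call `handle_turn` whenever a user finishes speaking.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stage signatures. Real SDK clients would be wrapped to fit them.
Transcriber = Callable[[bytes], str]          # speech-to-text stage
Retriever = Callable[[str], List[str]]        # RAG retrieval stage
Generator = Callable[[str, List[str]], str]   # answer-writing stage (an LLM in practice)
Synthesizer = Callable[[str], bytes]          # text-to-speech stage


@dataclass
class VoiceRagAgent:
    """Chains the four stages for a single user turn."""
    transcribe: Transcriber
    retrieve: Retriever
    generate: Generator
    synthesize: Synthesizer

    def handle_turn(self, audio_in: bytes) -> bytes:
        question = self.transcribe(audio_in)         # audio -> text
        passages = self.retrieve(question)           # text -> supporting passages
        answer = self.generate(question, passages)   # passages -> answer text
        return self.synthesize(answer)               # answer text -> audio


# --- Toy stand-ins so the sketch runs without any API keys ---

DOCS = [
    "LiveKit routes real-time audio between the user and the agent.",
    "Cartesia turns text into speech.",
]


def fake_transcribe(audio: bytes) -> str:
    # Pretend the "audio" bytes are already UTF-8 text.
    return audio.decode("utf-8")


def keyword_retrieve(query: str) -> List[str]:
    # Naive word-overlap ranking, standing in for a vector index.
    words = set(query.lower().split())
    scored = [(len(words & set(doc.lower().split())), doc) for doc in DOCS]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked if score > 0]


def fake_generate(question: str, passages: List[str]) -> str:
    # A real agent would prompt an LLM with the passages; this echoes the top hit.
    return passages[0] if passages else "Sorry, I could not find that."


def fake_synthesize(text: str) -> bytes:
    # Pretend the encoded text is synthesized audio.
    return text.encode("utf-8")


agent = VoiceRagAgent(fake_transcribe, keyword_retrieve, fake_generate, fake_synthesize)
reply = agent.handle_turn(b"what does livekit do")
print(reply.decode("utf-8"))
```

Keeping each stage behind a plain callable means any one provider can be swapped without touching the turn logic.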