GitHub - lyuchenyang/Macaw-LLM: Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
Inspired by @karpathy's NotebookLM project, I gave the Llama-3 architecture codebase to NLM and used RAG to find the perfect images to sync with the generated audio.
The result exceeded my expectations. Google's NotebookLM is truly amazing :)
Here is a YouTube link as well:...
Ankit Pal (x.com)