GitHub - facebookresearch/multimodal at a33a8b888a542a4578b16972aecd072eff02c1a6
GitHub - danielmiessler/fabric: fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that...
Daniel Miessler
github.com
GitHub - mit-han-lab/streaming-llm: Efficient Streaming Language Models with Attention Sinks
mit-han-lab
github.com
GitHub - transformerlab/transformerlab-app: Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.
github.com
voyage-multimodal-3: all-in-one embedding model for interleaved text, images, and screenshots
Voyage AI
blog.voyageai.com
LLaVA v1.5 is a new open-source multimodal model positioned as a contender against GPT-4's multimodal capabilities. It uses a simple projection matrix to connect the pre-trained CLIP ViT-L/14 vision encoder with the Vicuna LLM, yielding a model that can handle both images and text. The model is trained in two stages: first, updated ...
This AI newsletter is all you need #68
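The LLaVA description above (a projection matrix connecting a vision encoder to an LLM) can be illustrated with a minimal sketch. It is not LLaVA's code: the function names, the toy dimensions, and the random stand-in features are all assumptions made for illustration, and no real CLIP or Vicuna weights are involved. The LLaVA-1.5 paper itself describes a two-layer MLP connector; the sketch follows the simpler single-matrix description quoted here.

```python
import random

def make_projection(vision_dim, llm_dim, seed=0):
    """Hypothetical helper: a random vision_dim x llm_dim projection matrix."""
    rng = random.Random(seed)
    scale = 1.0 / vision_dim ** 0.5
    return [[rng.uniform(-scale, scale) for _ in range(llm_dim)]
            for _ in range(vision_dim)]

def project_patches(patch_features, weight):
    """Map each vision patch feature into the LLM embedding space (x @ W)."""
    llm_dim = len(weight[0])
    return [
        [sum(x * weight[i][j] for i, x in enumerate(patch)) for j in range(llm_dim)]
        for patch in patch_features
    ]

def build_llm_input(image_tokens, text_embeddings):
    """Prepend the projected image tokens to the text token embeddings."""
    return image_tokens + text_embeddings

# Toy sizes (assumptions); real encoders and LLMs use far larger dimensions.
VISION_DIM, LLM_DIM = 4, 6
NUM_PATCHES, NUM_TEXT_TOKENS = 3, 2

rng = random.Random(1)
# Stand-ins for vision-encoder patch outputs and LLM text token embeddings.
patch_features = [[rng.gauss(0, 1) for _ in range(VISION_DIM)]
                  for _ in range(NUM_PATCHES)]
text_embeddings = [[rng.gauss(0, 1) for _ in range(LLM_DIM)]
                   for _ in range(NUM_TEXT_TOKENS)]

weight = make_projection(VISION_DIM, LLM_DIM)
image_tokens = project_patches(patch_features, weight)
sequence = build_llm_input(image_tokens, text_embeddings)

print(len(sequence), len(sequence[0]))  # 5 tokens, each of width 6
```

The point of the design is that only the projection has to learn the mapping between the two embedding spaces; after projection, image patches look to the LLM like ordinary token embeddings of the same width.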
Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration
Chenyang Lyu (1, 2), Minghao Wu (3), Longyue Wang (1, *), Xinting Huang (1),
Bingshuai Liu (1), Zefeng Du (1), Shuming Shi (1), Zhaopeng Tu (1)
(1) Tencent AI Lab, (2) Dublin City University, (3) Monash University
(*) Longyue Wang is the corresponding author: vinnlywang@tencent.com
Macaw...