GitHub - cliangyu/Cola: [NeurIPS2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators"
Repository for the paper "The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning", including 1.84M CoT rationales extracted across 1,060 tasks
Paper Link : https://arxiv.org/abs/2305.14045
kaistAI • GitHub - kaistAI/CoT-Collection: [Under Review] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
LLaVA v1.5 is a new open-source multimodal model positioned as a contender against GPT-4's multimodal capabilities. It uses a simple projection matrix to connect the pre-trained CLIP ViT-L/14 vision encoder with the Vicuna LLM, resulting in a robust model that can handle images and text. The model is trained in two stages: first, updated...
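The projection step described above can be illustrated with a toy sketch. It is not LLaVA's code: the dimensions, the random weights, and the names `project_patches`, `VISION_DIM`, and `LLM_DIM` are all assumptions chosen for readability, and the connector in the released model may differ from the single matrix the blurb describes. The sketch only shows the idea of mapping vision-encoder patch features into the language model's embedding space and placing them in front of the text embeddings.

```python
import random

def project_patches(patch_features, weight, bias):
    """Map each vision patch vector into the LLM embedding space: y = W x + b."""
    projected = []
    for x in patch_features:
        y = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weight, bias)]
        projected.append(y)
    return projected

# Toy sizes (assumptions); real encoders and LLMs use far larger dimensions.
VISION_DIM = 4   # width of a vision-encoder patch feature
LLM_DIM = 6      # width of an LLM token embedding
NUM_PATCHES = 3

rng = random.Random(0)
# Stand-in for the learned projection matrix and bias.
weight = [[rng.uniform(-1, 1) for _ in range(VISION_DIM)] for _ in range(LLM_DIM)]
bias = [0.0] * LLM_DIM

# Stand-ins for vision-encoder output and for embedded text tokens.
patch_features = [[rng.uniform(-1, 1) for _ in range(VISION_DIM)]
                  for _ in range(NUM_PATCHES)]
text_embeddings = [[0.0] * LLM_DIM for _ in range(2)]

# Projected patches act as "visual tokens" placed before the text tokens.
visual_tokens = project_patches(patch_features, weight, bias)
llm_input = visual_tokens + text_embeddings
print(len(llm_input), len(llm_input[0]))  # prints "5 6"
```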
This AI newsletter is all you need #68
Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration
1 2 Chenyang Lyu, 3 Minghao Wu, 1 * Longyue Wang, 1 Xinting Huang,
1 Bingshuai Liu, 1 Zefeng Du, 1 Shuming Shi, 1 Zhaopeng Tu
1 Tencent AI Lab, 2 Dublin City University, 3 Monash University
* Longyue Wang is the corresponding author: vinnlywang@tencent.com
Macaw...
lyuchenyang • GitHub - lyuchenyang/Macaw-LLM: Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
Sometimes the same task was given to multiple models, and their outputs were compared and merged to maximize quality. It's like double bookkeeping: when you know something is prone to errors (or, in AI's case, hallucinations), it's best to give the same task to two or three different models. This significantly reduces the error rate.
The approach mirrors ...
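One simple way to compare and merge outputs from several models is a majority vote, sketched below. This is an illustration under stated assumptions, not the method the quoted author used: the answers are hypothetical strings standing in for real model outputs, and `normalize`, `merge_answers`, and `min_agreement` are names invented for this example.

```python
from collections import Counter

def normalize(answer):
    """Lowercase and collapse whitespace so trivially different answers match."""
    return " ".join(answer.lower().split())

def merge_answers(answers, min_agreement=2):
    """Return (answer, votes) if enough models agree, else (None, votes)."""
    if not answers:
        return None, 0
    counts = Counter(normalize(a) for a in answers)
    best, votes = counts.most_common(1)[0]
    if votes >= min_agreement:
        return best, votes
    # No sufficient agreement: flag the task for manual review instead of guessing.
    return None, votes

# Hypothetical outputs from three different models for the same task.
outputs = ["Paris", "paris ", "Lyon"]
print(merge_answers(outputs))          # prints ('paris', 2)
print(merge_answers(["a", "b", "c"]))  # prints (None, 1)
```

Exact-match voting only suits short, closed-form answers; for longer free-text outputs the comparison step would need something looser, such as a human reviewer or a separate judging pass.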