Community Paper Reading

GPT-4V Safety and Deployment Preparation

Analysis of safety preparations and evaluations for GPT-4V, a multimodal language model with image analysis capabilities, including early access testing, red teaming, and mitigations for potential risks and limitations.

cdn.openai.com

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

robotics-transformer-x.github.io

AgentBench: Evaluating LLMs as Agents

Evaluating Large Language Models (LLMs) as agents in interactive environments, highlighting the performance gap between API-based and open-source models, and introducing the AgentBench benchmark.

arxiv.org

GitHub - mit-han-lab/streaming-llm: Efficient Streaming Language Models with Attention Sinks

github.com

Gorilla

gorilla.cs.berkeley.edu

The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)

An analysis of GPT-4V, a large multimodal model with visual understanding, discussing its capabilities, input modes, working modes, prompting techniques, and potential applications in various domains.

browse.arxiv.org