Sublime
An inspiration engine for ideas

Mixture of In-Context Learners
Uses subsets of demonstrations to train experts via in-context learning. Given a training set, a trainable weighting function is used to combine the experts' next-token predictions.
This approach applies to black-box LLMs, since access to the internal parameters...
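The combination step described above can be sketched as a softmax-weighted mixture of the experts' next-token distributions. This is a minimal illustrative sketch, assuming each "expert" is the same black-box LLM prompted with a different demonstration subset; all names and shapes are hypothetical, not the paper's code.

```python
import numpy as np

def mixture_next_token(expert_logprobs: np.ndarray, weight_logits: np.ndarray) -> np.ndarray:
    """Combine experts' next-token distributions with trainable scalar weights.

    expert_logprobs: (n_experts, vocab) log-probabilities, one row per expert.
    weight_logits:   (n_experts,) trainable parameters; softmax gives mixture weights.
    Returns a (vocab,) mixed probability distribution.
    """
    w = np.exp(weight_logits - weight_logits.max())
    w /= w.sum()                     # softmax over experts
    probs = np.exp(expert_logprobs)  # each row is a valid distribution
    return w @ probs                 # convex combination of the rows

# Toy usage: 3 experts over a 4-token vocabulary, uniform weights.
logp = np.log(np.array([[0.70, 0.10, 0.10, 0.10],
                        [0.25, 0.25, 0.25, 0.25],
                        [0.10, 0.70, 0.10, 0.10]]))
mix = mixture_next_token(logp, np.zeros(3))
```

Because the weights only act on the experts' output distributions, gradients for the weighting function never need the LLM's internals, which is what makes the black-box setting workable.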
Top-rated papers from ICLR 2025
Scaling In-the-Wild Training for Diffusion-based Illumination Harmonization and Editing by Imposing Consistent Light Transport
- Rating: 9.0
- https://t.co/Ybsw2296I9
OLMoE: Open Mixture-of-Experts Language Models
- Rating: 8.67
- https://t.co/KmaXgEym4T
Compositional Entailment Learning for Hyperbolic Vision-Language Models
- Rating: 8.0
- https://t.co/IuemddOOPO
The Complexity of Two-Team Polymatrix Games with Independent Adversaries
- Rating: 8.0
- https://t.co/XSuzLUV1b0
Spread Preference Annotation: Direct Preference Judgment for Efficient LLM Alignment
- Rating: 8.0
- https://t.co/7MfBTFltad
SAM 2: Segment Anything in Images and Videos
- Rating: 8.0
- https://t.co/U5DVeWndR5
Streaming Algorithms For $\ell_p$ Flows and $\ell_p$ Regression
- Rating: 8.0
- https://t.co/iWZQXiQtT6
Differential Transformer
- Rating: 8.0
- https://t.co/BjWsQfOeG9
LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization
- Rating: 8.0
- https://t.co/PMlfsBhJYR
Spider 2.0: Can Language Models Resolve Real-World Enterprise Text-to-SQL Workflows?
- Rating: 8.0
- https://t.co/qYBOVZr6FP
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
- Rating: 8.0
- https://t.co/9SchPdadSs
Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models
- Rating: 8.0
- https://t.co/MiHjZOqkGC
Scaling...
— jack morris, x.com

CogVLM https://t.co/W5n3fIZA5n
Main novelty: treating the LLM as a mixture of experts, where image and text tokens are processed by two separate experts.
Image embeddings come from a 2-layer MLP over ViT features, with the same 0th-position embedding added.
Trained in stages; the ViT parameters aren't finetuned. https://t.co/KiRCFOOwfr
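The two ideas in the note above can be sketched in a few lines: a 2-layer MLP adapter over frozen ViT features, and a per-modality "expert" layer that routes image and text tokens through separate weights. A toy sketch under those assumptions; all shapes and names are illustrative, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_vit, d_llm, hidden = 16, 32, 64

# 2-layer MLP adapter mapping (frozen) ViT patch features into the LLM's space.
W1 = rng.standard_normal((d_vit, hidden))
W2 = rng.standard_normal((hidden, d_llm))
def adapt(vit_feats: np.ndarray) -> np.ndarray:   # (n_img, d_vit) -> (n_img, d_llm)
    return np.maximum(vit_feats @ W1, 0.0) @ W2   # ReLU MLP

# Per-modality experts: image tokens and text tokens use separate weights.
Wi = rng.standard_normal((d_llm, d_llm))          # image expert
Wt = rng.standard_normal((d_llm, d_llm))          # text expert
def modality_expert(tokens: np.ndarray, is_image: np.ndarray) -> np.ndarray:
    out = np.empty_like(tokens)                   # route each token by modality
    out[is_image] = tokens[is_image] @ Wi
    out[~is_image] = tokens[~is_image] @ Wt
    return out

img = adapt(rng.standard_normal((4, d_vit)))      # 4 image tokens
txt = rng.standard_normal((6, d_llm))             # 6 text tokens
seq = np.vstack([img, txt])
mask = np.array([True] * 4 + [False] * 6)
h = modality_expert(seq, mask)                    # (10, d_llm)
```

In the real model this routing happens inside every transformer layer (attention and FFN), but the routing-by-modality idea is the same.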

The paper aims to improve the localization capability of the Contrastive Language-Image Pre-training (CLIP) model, which has become a popular foundation for multimodal large language models (MLLMs). CLIP aligns images and text at the image level, but its performance may be insufficient for downstream tasks that require fine-grained vision represent...
Video of my talk on self-supervised learning, energy-based models, and training methods for joint-embedding architectures (e.g. Siamese nets) in contrastive and non-contrastive modes.
Given at the French-German Symposium on ML.
(with panel discussion).
https://t.co/zCf4PBm9O7
— Yann LeCun, x.com

AI Toolbox
Dave King • 12 cards