Scaling AI Models Like You Mean It

Systems can be dynamic. Machine learning models are inherently limited because they are trained on static datasets, so their “knowledge” is fixed. Therefore, developers need to combine models with other components, such as search and retrieval, to incorporate timely data. In addition, training lets a model “see” the whole training set, so more com

Matei Zaharia, Omar Khattab, Lingjiao Chen, et al. • The Shift From Models to Compound AI Systems
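
The idea in the highlight above, pairing a fixed model with a retrieval step so answers can draw on current data, might be sketched roughly as follows. This is a minimal illustration, not the authors' system: the corpus, the word-overlap ranking, and the `fake_model` stand-in are all hypothetical, and a real deployment would call an actual search index and model API.

```python
import re
from datetime import date

# Hypothetical in-memory corpus; a real system would query a search index.
DOCUMENTS = [
    {"text": "Release notes: version 2.1 shipped with faster indexing.", "updated": date(2024, 5, 1)},
    {"text": "Release notes: version 2.0 introduced metadata filtering.", "updated": date(2023, 11, 12)},
    {"text": "Office lunch menu for the week.", "updated": date(2024, 5, 2)},
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, documents, k=2):
    """Rank documents by word overlap with the query; newer documents win ties."""
    query_words = tokenize(query)
    scored = []
    for doc in documents:
        overlap = len(query_words & tokenize(doc["text"]))
        if overlap:
            scored.append((overlap, doc["updated"], doc["text"]))
    scored.sort(reverse=True)
    return [text for _, _, text in scored[:k]]

def build_prompt(query, documents):
    """Put the retrieved passages in front of the question."""
    context = "\n".join(f"- {passage}" for passage in retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"

def fake_model(prompt):
    """Stand-in for a model call (assumption: any text-in, text-out model fits here)."""
    return f"[answer generated from a {len(prompt)}-character prompt]"

def answer(query, documents=DOCUMENTS):
    return fake_model(build_prompt(query, documents))
```

Because the corpus is consulted at query time, updating `DOCUMENTS` changes what the model sees without retraining it.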

Delivering Large-Scale Platform Reliability - Roblox Blog

Alberto Covarrubias blog.roblox.com

Scalability is crucial: systems need to be designed on the assumption that query volume, document corpus size, and indexing complexity could each grow by 10x. What works at one scale may break completely at a higher one.

Sharding the index, either by document or by word, is important to distribute the indexing and querying load across machines.

Claude
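
The two sharding schemes mentioned above (by document and by word) could be sketched as follows for a toy inverted index. This is an illustrative sketch only: `NUM_SHARDS`, the CRC32-based `shard_of` routing, and the whitespace tokenization are assumptions, not details from the highlight.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 3  # hypothetical number of machines

def shard_of(key):
    """Stable hash, so the same key always maps to the same shard."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS

def build_document_sharded(docs):
    """Each shard holds a complete inverted index over its own subset of documents."""
    shards = [defaultdict(set) for _ in range(NUM_SHARDS)]
    for doc_id, text in docs.items():
        index = shards[shard_of(doc_id)]
        for word in text.lower().split():
            index[word].add(doc_id)
    return shards

def build_term_sharded(docs):
    """Each shard holds the full posting list for its own subset of words."""
    shards = [defaultdict(set) for _ in range(NUM_SHARDS)]
    for doc_id, text in docs.items():
        for word in text.lower().split():
            shards[shard_of(word)][word].add(doc_id)
    return shards

def search_document_sharded(shards, word):
    """Fan the query out to every shard and merge the partial results."""
    hits = set()
    for index in shards:
        hits |= index.get(word, set())
    return hits

def search_term_sharded(shards, word):
    """Route the query to the single shard that owns this word."""
    return set(shards[shard_of(word)].get(word, set()))

docs = {
    "d1": "scaling vector search",
    "d2": "scaling databases",
    "d3": "vector index sharding",
}
by_document = build_document_sharded(docs)
by_term = build_term_sharded(docs)
```

Both layouts return the same matches. The trade-off is in where the work lands: document sharding touches every shard on each query but spreads indexing evenly, while word sharding touches one shard per query term but concentrates popular words on a single machine.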

You’ve got a vector database that has all the right database fundamentals you require, has the right incremental indexing strategy for your use case, has a good story around your metadata filtering needs, and will keep its index up-to-date with latencies you can tolerate. Awesome.

Your ML team (or maybe OpenAI) comes out with a new version of their...

6 Hard Problems Scaling Vector Search

7 must-know strategies to scale your database

Indexing:

Check the query patterns of your application and create the right indexes.

Materialized Views:

Pre-compute complex query results and store them for faster access.

Denormalization:

Reduce complex joins to improve query performance.

Vertical Scaling:

Boost your database server by adding more CPU, RAM, or...

Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
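
As a small illustration of the indexing strategy above, the sketch below uses Python's built-in `sqlite3` module to compare the query plan for a lookup before and after an index is added on the filtered column. The `orders` table, its columns, and the index name `idx_orders_customer` are hypothetical; other database engines expose query plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

# The query pattern the application runs most often (an assumption for this example).
QUERY = "SELECT total FROM orders WHERE customer_id = 42"

def query_plan(connection, sql):
    """Return SQLite's plan description for a statement as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[3] for row in rows)  # fourth column holds the plan detail

plan_before = query_plan(conn, QUERY)

# Create an index that matches the WHERE clause of the query pattern.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

plan_after = query_plan(conn, QUERY)

print("before:", plan_before)
print("after: ", plan_after)
```

Without the index, SQLite typically reports a scan of the whole table; with it, the plan names `idx_orders_customer`. Checking the plan this way confirms that an index actually serves the query it was created for.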

Top considerations when choosing foundation models

Accuracy

Cost

Latency

Privacy

Top challenges when deploying production AI

Serving cost

Evaluation

Infra reliability

Model quality

Notion – The all-in-one workspace for your notes, tasks, wikis, and databases.

Jonathan Lai • The Next Generation Pixar: How AI will Merge Film & Games | Andreessen Horowitz

R1’s leap in capability and efficiency wouldn’t be possible without its foundation model, DeepSeek-V3, which was released in December 2024. V3 itself is big, at 671 billion parameters (by comparison, GPT-4o is rumored to be 1.8 trillion, or nearly three times as big), yet it’s surprisingly cost-effective to run. That’s because V3 uses a mixture of experts (MoE

Evan Armstrong • What Actually Matters (And What Doesn’t) for DeepSeek