Scaling AI Models Like You Mean It
Matei Zaharia, Omar Khattab, Lingjiao Chen, et al. • The Shift From Models to Compound AI Systems
Delivering Large-Scale Platform Reliability - Roblox Blog
- Scalability is crucial: design systems on the assumption that query volume, document corpus size, indexing complexity, and similar factors could grow by 10x. What works at one scale may break completely at a higher one.
- Sharding the index, either by document or by word, is important to distribute the indexing and querying load across machines.
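The two sharding options in the note above can be sketched with a toy inverted index. This is a minimal illustration, not how any particular search engine implements it: the shard count, the tokenizer, and all function names are assumptions made up for the example. With document sharding, each shard indexes its own subset of documents, so a query fans out to every shard. With word (term) sharding, each term's full posting list lives on exactly one shard, so a single-term query touches one shard only.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 3  # hypothetical cluster size, chosen only for the example


def tokenize(text):
    """Naive whitespace tokenizer (an assumption; real systems do much more)."""
    return text.lower().split()


def stable_hash(key):
    """Deterministic hash so a term always maps to the same shard."""
    return zlib.crc32(key.encode("utf-8"))


def build_document_sharded(docs):
    """Each shard holds a complete inverted index over its own documents."""
    shards = [defaultdict(set) for _ in range(NUM_SHARDS)]
    for doc_id, text in docs.items():
        shard = shards[doc_id % NUM_SHARDS]
        for term in tokenize(text):
            shard[term].add(doc_id)
    return shards


def build_term_sharded(docs):
    """Each shard holds the full posting lists for its own subset of terms."""
    shards = [defaultdict(set) for _ in range(NUM_SHARDS)]
    for doc_id, text in docs.items():
        for term in tokenize(text):
            shards[stable_hash(term) % NUM_SHARDS][term].add(doc_id)
    return shards


def query_document_sharded(shards, term):
    """Fan out to every shard and merge the partial results."""
    result = set()
    for shard in shards:
        result |= shard.get(term, set())
    return result


def query_term_sharded(shards, term):
    """Route the query to the single shard that owns the term."""
    owner = shards[stable_hash(term) % NUM_SHARDS]
    return set(owner.get(term, set()))


docs = {
    0: "scaling vector search",
    1: "scaling databases",
    2: "vector index sharding",
    3: "search index",
}
by_doc = build_document_sharded(docs)
by_term = build_term_sharded(docs)
print(query_document_sharded(by_doc, "scaling"))
print(query_term_sharded(by_term, "scaling"))
```

Both layouts return the same documents. The difference is where the load lands: document sharding spreads every query across all machines, while term sharding concentrates each term's traffic on one machine.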
Claude
You’ve got a vector database that has all the right database fundamentals you require, has the right incremental indexing strategy for your use case, has a good story around your metadata filtering needs, and will keep its index up-to-date with latencies you can tolerate. Awesome.
Your ML team (or maybe OpenAI) comes out with a new version of their...
6 Hard Problems Scaling Vector Search
7 must-know strategies to scale your database
Indexing:
Check the query patterns of your application and create the right indexes.
Materialized Views:
Pre-compute complex query results and store them for faster access.
Denormalization:
Reduce complex joins to improve query performance.
Vertical Scaling:
Boost your database server by adding more CPU, RAM, or...
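The indexing strategy above can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch under stated assumptions: the `orders` table, its columns, and the index name are hypothetical, and SQLite stands in for whatever database you actually run. It asks the query planner how it would execute the same query before and after an index is created on the filtered column.

```python
import sqlite3

# In-memory database with a hypothetical orders table (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

# The query pattern we assume the application runs most often.
QUERY = "SELECT id, total FROM orders WHERE customer_id = ?"


def plan(connection, sql, params):
    """Return the planner's description of how it would run the query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)


before = plan(conn, QUERY, (42,))  # expected to be a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(conn, QUERY, (42,))  # expected to be a search using the new index

# Exact wording of the plan text varies between SQLite versions.
print("before:", before)
print("after: ", after)
```

Checking the plan rather than guessing is the point: an index only helps if it matches the columns the application actually filters or sorts on.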
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
Top considerations when choosing foundation models
Accuracy
Cost
Latency
Privacy
Top challenges when deploying production AI
Serving cost
Evaluation
Infra reliability
Model quality
Notion – The all-in-one workspace for your notes, tasks, wikis, and databases.
R1’s leap in capability and efficiency wouldn’t be possible without its foundation model, DeepSeek-V3, which was released in December 2024. V3 itself is big, at 671 billion parameters (by comparison, GPT-4o is rumored to be 1.8 trillion, or nearly three times as big), yet it’s surprisingly cost-effective to run. That’s because V3 uses a mixture of experts (MoE...