How Levels.fyi Built Scalable Search with PostgreSQL
7 must-know strategies to scale your database
Indexing:
Check the query patterns of your application and create the right indexes.
Materialized Views:
Pre-compute complex query results and store them for faster access.
Denormalization:
Reduce complex joins to improve query performance.
Vertical Scaling
Boost your database server by adding more CPU, RAM, or... See more
Indexing:
Check the query patterns of your application and create the right indexes.
Materialized Views:
Pre-compute complex query results and store them for faster access.
Denormalization:
Reduce complex joins to improve query performance.
Vertical Scaling
Boost your database server by adding more CPU, RAM, or... See more
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
Nicolay Gerold added
- Scalability is crucial - systems need to be designed with the assumption that query volume, document corpus size, indexing complexity etc. could increase by 10x. What works at one scale may completely break at a higher scale.
- Sharding the index, either by document or by word, is important to distribute the indexing and querying load across machines.
Claude
Nicolay Gerold added
Eventually their database became a ticking time bomb ready to explode at any moment. The first instinct might be to beef up their Postgres instance, like me trying to get big at the gym. But there are physical limits to how much you can scale a single machine before costs increase exponentially. Moreover, query performance and maintenance processes
... See moreKiki's Bytes • How Notion Scaled to 100 Million Users Without Their Database Exploding
Scaling Causal's Spreadsheet Engine From Thousands to Billions of Cells: From Maps to Arrays
Simon Eskildsen (simon@sirupsen.com)sirupsen.comDenormalization
Another way Reddit minimizes joins is by using denormalization.
They took all the metadata fields required for displaying an image post and put them together into a single JSONB field. Instead of fetching different fields and combining them, they can just fetch that single JSONB field.
This made it much more efficient to fetch all the ... See more
Another way Reddit minimizes joins is by using denormalization.
They took all the metadata fields required for displaying an image post and put them together into a single JSONB field. Instead of fetching different fields and combining them, they can just fetch that single JSONB field.
This made it much more efficient to fetch all the ... See more
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
Nicolay Gerold added
- Multiple indices. Splitting the document corpus up into multiple indices and then routing queries based on some criteria. This means that the search is over a much smaller set of documents rather than the entire dataset. Again, it is not always useful, but it can be helpful for certain datasets. The same approach works with the LLMs themselves.
- Cu
Matt Rickard • Improving RAG: Strategies
Nicolay Gerold added