6 Hard Problems Scaling Vector Search
- Scalability is crucial - systems need to be designed with the assumption that query volume, document corpus size, indexing complexity etc. could increase by 10x. What works at one scale may completely break at a higher scale.
- Sharding the index, either by document or by word, is important to distribute the indexing and querying load across machines.
Claude
Nicolay Gerold added
Peter Hagen and added
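To make the sharding note above concrete, here is a minimal sketch of sharding by document: each document ID is hashed to one shard, and a query is fanned out to every shard and the results merged. The names `shard_for`, `Shard`, `ShardedIndex` and the shard count are hypothetical, and the in-memory inverted index only stands in for a real search node.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count, chosen only for illustration


def shard_for(doc_id, num_shards=NUM_SHARDS):
    """Map a document ID to a shard with a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class Shard:
    """One shard: a tiny in-memory inverted index (term -> document IDs)."""

    def __init__(self):
        self.postings = defaultdict(set)

    def index(self, doc_id, text):
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, term):
        # .get() avoids creating empty entries for unknown terms
        return set(self.postings.get(term.lower(), set()))


class ShardedIndex:
    """Document-sharded index: writes go to one shard, reads go to all shards."""

    def __init__(self, num_shards=NUM_SHARDS):
        self.shards = [Shard() for _ in range(num_shards)]

    def index(self, doc_id, text):
        self.shards[shard_for(doc_id, len(self.shards))].index(doc_id, text)

    def search(self, term):
        results = set()
        for shard in self.shards:  # scatter the query, then gather the hits
            results |= shard.search(term)
        return results
```

Sharding by word would instead hash the term, so each query term touches a single shard while indexing one document touches many. Which trade-off is better depends on the workload.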
Utilize both full-text search and vector search (embeddings) for retrieving relevant documents. Ideally, you should use a single database system to avoid synchronization issues.
- Implement both full-text search and vector search
- Test the performance of each method on your specific use case
- Consider using a single database system to store both types of
Systematically Improving Your RAG - jxnl.co
Nicolay Gerold added
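One way to combine a full-text ranking and a vector ranking, as the note above recommends, is reciprocal rank fusion (RRF). The note itself does not name a fusion method, so RRF is an assumption here, as are the function name and the constant `k=60` (a commonly used default). The sketch takes ranked lists of document IDs and returns a single merged ranking.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the order is deterministic
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers for the same query
full_text_hits = ["d1", "d2", "d3"]
vector_hits = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([full_text_hits, vector_hits])
```

Because RRF uses only ranks, it avoids having to normalize BM25 scores against cosine similarities, which is convenient when testing each method on a specific use case.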
- Multiple indices. Splitting the document corpus up into multiple indices and then routing queries based on some criteria. This means that the search is over a much smaller set of documents rather than the entire dataset. Again, it is not always useful, but it can be helpful for certain datasets. The same approach works with the LLMs themselves.
- Cu
Matt Rickard • Improving RAG: Strategies
Nicolay Gerold added
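The multiple-indices idea above can be sketched as a router that picks one small index per query, so the search never touches the rest of the corpus. Everything here is hypothetical: the keyword-overlap routing rule, the index names, and the toy word-match search. A real router might use a classifier or metadata filters instead.

```python
def route(query, routes, default):
    """Pick the index whose keyword set overlaps the query the most."""
    words = set(query.lower().split())
    best, best_overlap = default, 0
    for name, keywords in routes.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best, best_overlap = name, overlap
    return best


def search(query, indices, routes, default):
    """Route the query, then search only the chosen index."""
    name = route(query, routes, default)
    words = set(query.lower().split())
    hits = [doc for doc in indices[name] if words & set(doc.lower().split())]
    return name, hits


# Hypothetical routing table and per-topic indices
ROUTES = {
    "billing": {"invoice", "refund", "payment"},
    "api": {"endpoint", "token", "rate"},
}
INDICES = {
    "billing": ["How to request a refund", "Invoice schedule"],
    "api": ["Rotating an API token", "Rate limit headers"],
    "general": ["Company overview"],
}
```

Queries that match no routing keyword fall back to the default index, which is one simple way to handle the cases where routing is not useful.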
We think that the difficulty of storing and indexing data in a scalable way is something that has been underestimated by most of the crypto space. For a long time, the entire space has been limited to finite state applications without much consideration for the wide range of infinite state applications, like social apps and marketplaces, that in fa...
Deso • Web3 Will Not Be Built on Smart Contracts • DeSo (Decentralized Social) Blockchain
sari added
1. Synthetic Data for Baseline Metrics
Synthetic data can be used to establish baseline precision and recall metrics for your reverse search. The simplest kind of synthetic data is to take existing text chunks, generate synthetic questions, and verify that when we query our synthetic questions, the sourced text chunk is retrieved correctly.
Benefi...
Low-Hanging Fruit for RAG Search - jxnl.co
Nicolay Gerold added
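The synthetic-data check described above can be sketched as a small evaluation loop: pair each synthetic question with the chunk it was generated from, run retrieval, and count how often the source chunk comes back. In practice the questions would come from an LLM; here they are hand-written stand-ins, and the token-overlap `retrieve` function is a hypothetical placeholder for whatever search system is being measured.

```python
def tokenize(text):
    """Lowercase, strip basic punctuation, and split into a set of tokens."""
    return set(text.lower().replace("?", "").replace(".", "").split())


def retrieve(question, chunks, k=1):
    """Toy retriever: rank chunk IDs by token overlap with the question."""
    q = tokenize(question)
    ranked = sorted(chunks, key=lambda cid: (-len(q & tokenize(chunks[cid])), cid))
    return ranked[:k]


def recall_at_k(eval_pairs, chunks, k=1):
    """Fraction of questions whose source chunk appears in the top k results."""
    hits = sum(
        1 for question, source_id in eval_pairs
        if source_id in retrieve(question, chunks, k)
    )
    return hits / len(eval_pairs)


# Existing text chunks (hypothetical)
chunks = {
    "c1": "Sharding splits an index across machines.",
    "c2": "Embeddings map text to vectors.",
    "c3": "Full-text search matches exact keywords.",
}
# (synthetic question, ID of the chunk it was generated from)
eval_pairs = [
    ("How does sharding split an index?", "c1"),
    ("What do embeddings map text to?", "c2"),
    ("Which search matches exact keywords?", "c3"),
]
baseline = recall_at_k(eval_pairs, chunks, k=1)
```

With exactly one relevant chunk per question, recall@k is simply the hit rate, which makes it an easy baseline to track as the retriever changes.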
There are numerous integrations for vector storage. These include Alibaba Cloud OpenSearch, AnalyticDB for PostgreSQL, Spotify’s Annoy library for Approximate Nearest Neighbor (ANN) search, Cassandra, Chroma, Elasticsearch, Facebook AI Similarity Search (Faiss), MongoDB Atlas Vector Search, PGVector as a vector similarity search for Postgres, Pinec...