GitHub - stanford-futuredata/ColBERT: Stanford ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22)

github.com

Related Highlights

ColBERT: A Token-Level Embedding and Ranking Model - Zilliz Learn

David Wang zilliz.com

Sounds fancy. Why do we care? GAR involves taking the source documents and having an LLM enrich them prior to indexing. For example, the LLM might:

* Generate titles for documents that are missing them
* Standardize author names/formats
* Extract dates, URLs, citations and other elements that might be valuable to search as separate fields
* Create...

Feed | LinkedIn

ColPali: Better Document Retrieval with VLMs and ColBERT Embeddings - Zilliz blog

Stephen Batifol zilliz.com

| Pipeline | RobustQA avg. score | Avg. response time (secs) |
| --- | --- | --- |
| Azure Cognitive Search Retriever + GPT4 + Ada | 72.36 | >1.0s |
| Canopy (Pinecone) | 59.61 | >1.0s |
| Langchain + Pinecone + OpenAI | 61.42 | <0.8s |
| Langchain + Pinecone + Cohere | 69.02 | <0.6s |
| LlamaIndex + Weaviate Vector Store - Hybrid Search | 75.89 | <1.0s |
| RAG Google Cloud VertexAI... | | |

arXiv:2405.02048v1 [cs.IR] 3 May 2024

Text embeddings are a critical piece of many pipelines, from search, to RAG, to vector databases and more. Most embedding models are BERT/Transformer-based and typically have short context lengths (e.g., 512). That’s only about two pages of text, but documents can be very long: books, legal cases, TV screenplays, code repositories, etc. can be tens...

Long-Context Retrieval Models with Monarch Mixer

🥤 Cola [NeurIPS 2023]

Large Language Models are Visual Reasoning Coordinators

Liangyu Chen*,†,♥ Bo Li*,♥ Sheng Shen♣ Jingkang Yang♥

Chunyuan Li♠ Kurt Keutzer♣ Trevor Darrell♣ Ziwei Liu✉,♥

♥S-Lab, Nanyang Technological University

♣University of California, Berkeley ♠Microsoft Research, Redmond

*Equal Contribution †Project Lead ✉Corresponding Author...

cliangyu • GitHub - cliangyu/Cola: [NeurIPS2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators"