GitHub - stanford-futuredata/ColBERT: Stanford ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22)
Metaphor
metaphor.systems

With this modified code, you can now upload documents in multiple formats (.pdf, .txt, .docx, .pptx, .jpg, .png, .eml, .html) and semantically search in 100+ languages supported by @CohereAI. Embeddings saved @qdrant_engine. Flask-backend uses @LangChainAI. Code open-sourced. https://t.co/5ysmkkJ9sO
Jina-ColBERT-v2 is here. https://t.co/FvBbeXsftS Superior retrieval performance vs the original ColBERT-v2 from @stanfordnlp (+6.5%) & our previous jina-colbert-v1-en (+5.4%). Multilingual support for 89 languages and programming languages. User-controlled output embedding sizes (128/96/64-dim) through Matryoshka representation learning, and finally...
Jina AI
x.com
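
The "user-controlled output embedding sizes" mentioned above can be illustrated with a toy sketch. It assumes that a Matryoshka-trained embedding can be shortened by keeping its leading components and rescaling to unit length, and that documents are scored with ColBERT-style late interaction (sum over query tokens of the best-matching document token). The function names, the 4-dim vectors, and the re-normalization step are illustrative assumptions, not the Jina or Stanford ColBERT API; the real model sizes in the post are 128/96/64.

```python
import math

def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale to unit length (assumed step)."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

def maxsim_score(query_vecs, doc_vecs):
    """Late interaction: for each query token, take the best dot product
    over all document tokens, then sum those maxima."""
    return sum(
        max(sum(q * d for q, d in zip(qv, dv)) for dv in doc_vecs)
        for qv in query_vecs
    )

# Made-up 4-dim token embeddings standing in for real model output.
query = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, -0.5]]
doc = [[1.0, 0.0, -0.5, 0.5], [0.0, 2.0, 0.0, 0.0], [0.5, 0.5, 0.5, 0.5]]

def score_at(dim):
    """Score the toy query against the toy document at a chosen embedding size."""
    q = [truncate_and_normalize(v, dim) for v in query]
    d = [truncate_and_normalize(v, dim) for v in doc]
    return maxsim_score(q, d)

if __name__ == "__main__":
    for dim in (4, 2):
        print(dim, round(score_at(dim), 4))
```

Smaller sizes reduce index storage roughly in proportion to the dimension, at some cost in ranking quality; how large that cost is depends on the model and is not something this toy example measures.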