Long-Context Retrieval Models with Monarch Mixer
You can increase context length in two ways. First, you can train the model with longer context lengths. That’s difficult because it’s much more computationally expensive, and it’s hard to find datasets with long context lengths (most documents in CommonCrawl have fewer than 2,000 tokens).
The second, more com...
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
Nicolay Gerold added
One of the focus areas at Together Research is new architectures for long context, improved training, and inference performance over the Transformer architecture. Spinning out of a research program from our team and academic collaborators, with roots in signal processing-inspired sequence models, we are excited to introduce the StripedHyena models.
Paving the way to efficient architectures: StripedHyena-7B, open source models offering a glimpse into a world beyond Transformers
Nicolay Gerold added
- Cohere introduced Embed v3, an advanced model for generating document embeddings, boasting top performance on several benchmarks. It excels at matching documents to query topics and at judging content quality, improving search applications and retrieval-augmented generation (RAG) systems. The new version offers models with 1024 or 384 dimensions and supports over 100 languages.
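For context, a minimal sketch of calling Embed v3 through Cohere's Python SDK; the model names match the release, while the API key, documents, and query are placeholders:

```python
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

# Embed documents for indexing; Embed v3 distinguishes documents
# from queries via `input_type`.
doc_emb = co.embed(
    texts=["Monarch Mixer targets long-context retrieval.",
           "StripedHyena explores architectures beyond Transformers."],
    model="embed-english-v3.0",  # 1024-dim; embed-english-light-v3.0 is 384-dim
    input_type="search_document",
)

# Embed the query with the matching query-side input_type.
query_emb = co.embed(
    texts=["efficient long-context models"],
    model="embed-english-v3.0",
    input_type="search_query",
)

print(len(doc_emb.embeddings), len(query_emb.embeddings[0]))  # 2 docs, 1024 dims
```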
FOD#27: "Now And Then"
Nicolay Gerold added
LLMs struggle with tasks that require extensive knowledge. This limitation highlights the need to supplement LLMs with non-parametric knowledge. The paper Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts analyzes the effects of different types of non-parametric knowledge, such as textual passages and knowledge graphs.
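The excerpt doesn't include the paper's prompt format, so the sketch below only illustrates the general technique of serializing knowledge-graph triples into the prompt so the model can ground answers about long-tail entities. The triples, template, and helper names are all hypothetical:

```python
# Hypothetical sketch: injecting KG triples into an LLM prompt.
def triples_to_text(triples):
    """Serialize (subject, relation, object) triples into prompt lines."""
    return "\n".join(f"({s}, {r}, {o})" for s, r, o in triples)

def build_prompt(question, triples):
    return (
        "Answer the question using only the facts below.\n\n"
        f"Facts:\n{triples_to_text(triples)}\n\n"
        f"Question: {question}\nAnswer:"
    )

triples = [
    ("Aki Kaurismäki", "directed", "Le Havre"),
    ("Le Havre", "release_year", "2011"),
]
prompt = build_prompt("When was Le Havre by Aki Kaurismäki released?", triples)
# `prompt` can now be sent to any LLM completion endpoint.
```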
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
Nicolay Gerold added
memary: Open-Source Longterm Memory for Autonomous Agents
memary demo
Why use memary?
Agents use LLMs that are currently constrained to finite context windows. memary overcomes this limitation by allowing your agents to store a large corpus of information in knowledge graphs, infer user knowledge through our memory modules, and only retrieve relevant information.
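The excerpt doesn't show memary's actual API, so the sketch below only illustrates the pattern it describes: store facts as graph triples and pull back just the facts relevant to the current query instead of the whole corpus. All class and method names here are hypothetical, not memary's real interface:

```python
# Hypothetical sketch of the retrieval pattern memary describes;
# not memary's real API.
from collections import defaultdict

class GraphMemory:
    """Tiny in-memory knowledge-graph store keyed by entity."""
    def __init__(self):
        self.edges = defaultdict(list)  # entity -> [(relation, object)]

    def add(self, subject, relation, obj):
        self.edges[subject].append((relation, obj))

    def retrieve(self, entities, max_facts=10):
        """Return only facts touching the queried entities, keeping
        the slice of memory passed to the LLM small."""
        facts = [(s, r, o) for s in entities for (r, o) in self.edges[s]]
        return facts[:max_facts]

memory = GraphMemory()
memory.add("memary", "stores", "knowledge graphs")
memory.add("memary", "tracks", "user knowledge")
print(memory.retrieve(["memary"]))
```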
GitHub - kingjulio8238/memary: Longterm Memory for Autonomous Agents.
Nicolay Gerold added
Data
Zephyr is a series of language models that are trained to act as helpful assistants. Zephyr-7B-α is the first model in the series, and is a fine-tuned version of mistralai/Mistral-7B-v0.1 that was trained on a mix of publicly available, synthetic datasets using Direct Preference Optimization (DPO). We found that removing the in-built alignment of these datasets boosted performance on MT Bench and made the model more helpful.
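DPO fine-tunes directly on preference pairs with no separate reward model: the policy is pushed to prefer the chosen completion over the rejected one relative to a frozen reference model. A minimal sketch of the DPO loss in PyTorch, assuming you already have summed token log-probabilities for each completion (beta=0.1 is an illustrative default; libraries such as TRL provide a full trainer for this):

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """DPO loss from summed sequence log-probs under the policy
    and the frozen reference model."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    # -log sigmoid(beta * margin): widen the gap between how much the
    # policy (vs. reference) prefers the chosen over the rejected answer.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy batch of two preference pairs with made-up log-probs.
loss = dpo_loss(torch.tensor([-10.0, -12.0]), torch.tensor([-11.5, -13.0]),
                torch.tensor([-10.5, -12.2]), torch.tensor([-11.0, -12.8]))
print(loss)
```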
HuggingFaceH4/zephyr-7b-alpha · Hugging Face
Nicolay Gerold added
Simply accessing LLMs via APIs has limitations. Instead, combining them with other data sources and tools can enable more powerful applications. In this chapter, we will introduce LangChain as a way to overcome LLM limitations and build innovative language-based applications.
Ben Auffarth • Generative AI with LangChain: Build large language model (LLM) apps with Python, ChatGPT, and other LLMs
LangChain enables building dynamic, data-aware applications that go beyond what is possible by simply accessing LLMs via API calls.
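As a taste of the pattern the book builds on, a minimal sketch of a prompt-plus-LLM chain; the import paths reflect the 0.0.x-era LangChain API the book targets and may differ in newer releases, and an OPENAI_API_KEY environment variable is assumed:

```python
# Minimal LangChain sketch; module paths may differ in current releases.
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate(
    input_variables=["topic"],
    template="Summarize what {topic} is in two sentences.",
)
llm = OpenAI(temperature=0)  # assumes OPENAI_API_KEY is set
chain = LLMChain(llm=llm, prompt=prompt)

print(chain.run(topic="retrieval-augmented generation"))
```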