LLMs
MLServer aims to provide an easy way to start serving your machine learning models through a REST and gRPC interface, fully compliant with KFServing's V2 Dataplane spec. Watch a quick video introducing the project here.
- Multi-model serving, letting users run multiple models within the same process.
- Ability to run inference in parallel for vertical
GitHub - SeldonIO/MLServer: An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more
GPT-4 Turbo can accept images as inputs in the Chat Completions API, enabling use cases such as generating captions, analyzing real world images in detail, and reading documents with figures. For example, BeMyEyes uses this technology to help people who are blind or have low vision with daily tasks like identifying a product or navigating a store.... See more
New models and developer products announced at DevDay
First time here? Go to our setup guide
Features
Features
- 🤖 Multiple model integrations: OpenAI, transformers, llama.cpp, exllama2, mamba
- 🖍️ Simple and powerful prompting primitives based on the Jinja templating engine
- 🚄 Multiple choices, type constraints and dynamic stopping
- ⚡ Fast regex-structured generation
- 🔥 Fast JSON generation following a JSON schema
outlines-dev • GitHub - outlines-dev/outlines: Neuro Symbolic Text Generation
Today, we’re releasing the Assistants API, our first step towards helping developers build agent-like experiences within their own applications. An assistant is a purpose-built AI that has specific instructions, leverages extra knowledge, and can call models and tools to perform tasks. The new Assistants API provides new capabilities such as Code... See more
New models and developer products announced at DevDay
Unlike consumers, enterprises want control over how their data is used and shared with companies, including the providers of AI software. Enterprises have spent a lot effort in consolidating data from different sources and bringing them in-house (this article Partner integrations + System of Intelligence: Today’s deepest Moat by fellow Medium... See more
AI Startup Trends: Insights from Y Combinator’s Latest Batch
- Right now, GPTs are the easiest way of sharing structured prompts, which are programs, written in plain English (or another language), that can get the AI to do useful things. I discussed creating structured prompts last week, and all the same techniques apply, but the GPT system makes structured prompts more powerful and much easier to create,
Ethan Mollick • Almost an Agent: What GPTs can do
pair-preference-model-LLaMA3-8B by RLHFlow: Really strong reward model, trained to take in two inputs at once, which is the top open reward model on RewardBench (beating one of Cohere’s).
DeepSeek-V2 by deepseek-ai (21B active, 236B total param.): Another strong MoE base model from the DeepSeek team. Some people are questioning the very high MMLU... See more
DeepSeek-V2 by deepseek-ai (21B active, 236B total param.): Another strong MoE base model from the DeepSeek team. Some people are questioning the very high MMLU... See more
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
Study finds RLHF reduces LLM creativity and output variety : A new research paper posted in /r/LocalLLaMA shows that while alignment techniques like RLHF reduce toxic and biased content, they also limit the creativity of large language models, even in contexts unrelated to safety.
.png?table=block&id=b4e186f9-aa38-4fce-b32e-8fdd8fc746ce&spaceId=996f2b3b-deaa-4214-aedb-cbc914a1833e&width=1260&userId=&cache=v2)