LLMs
A new v0.4.0 release of lm-evaluation-harness is available !
New updates and features include:
New updates and features include:
- Internal refactoring
- Config-based task creation and configuration
- Easier import and sharing of externally-defined task config YAMLs
- Support for Jinja2 prompt design, easy modification of prompts + prompt imports from Promptsource
- More advanced configuration
GitHub - sqrkl/lm-evaluation-harness: A framework for few-shot evaluation of language models.
GPT-4 Turbo performs better than our previous models on tasks that require the careful following of instructions, such as generating specific formats (e.g., “always respond in XML”). It also supports our new JSON mode, which ensures the model will respond with valid JSON. The new API parameter response_format enables the model to constrain its... See more
New models and developer products announced at DevDay
Langfuse is an open source observability & analytics solution for LLM-based applications. It is mostly geared towards production usage but some users also use it for local development of their LLM applications.
Langfuse is focused on applications built on top of LLMs. Many new abstractions and common best practices evolved recently, e.g. agents,... See more
Langfuse is focused on applications built on top of LLMs. Many new abstractions and common best practices evolved recently, e.g. agents,... See more
langfuse • GitHub - langfuse/langfuse: Open source observability and analytics for LLM applications
Google Deepmind used similar idea to make LLMs faster in Accelerating Large Language Model Decoding with Speculative Sampling. Their algorithm uses a smaller draft model to make initial guesses and a larger primary model to validate them. If the draft often guesses right, operations become faster, reducing latency.
There are some people speculating... See more
There are some people speculating... See more
muhtasham • Machine Learners Guide to Real World - 2️⃣ Concepts from Operating Systems That Found Their Way in LLMs
You can think your way into solving a deterministic system, but you cannot think your way into solving a probabilistic system.
The first thing that I want to call out is that deterministic software has edge cases, while probabilistic software has long tails.
I find that a lot of junior folks try to really think hard about edge cases around... See more
Jason Liu • Tips for probabilistic software - jxnl.co
Since we launched ChatGPT Enterprise a few months ago, early customers have expressed the desire for even more customization that aligns with their business. GPTs answer this call by allowing you to create versions of ChatGPT for specific use cases, departments, or proprietary datasets. Early customers like Amgen, Bain, and Square are already... See more
Introducing GPTs
This could be a business opportunity: building GPTs for companies.
The way that most RLHF is done to date has the entire response from a language model get an associated score. To anyone with an RL background, this is disappointing, because it limits the ability for RL methods to make connections about the value of each sub-component of text. Futures have been pointed to where this multi-step optimization comes at... See more
