LLMs
The need for better AI or LLM-specific infrastructure, along with the host of problems that come with the non-determinism of LLMs, means that there’s more software work ahead of us, not less. Abstraction layers like LLMs create more possibilities and thus more work.
Is this a good thing or a bad thing? I’m not sure.
A great example of this is frontend...
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
GPT-4 Turbo performs better than our previous models on tasks that require the careful following of instructions, such as generating specific formats (e.g., “always respond in XML”). It also supports our new JSON mode, which ensures the model will respond with valid JSON. The new API parameter response_format enables the model to constrain its...
New models and developer products announced at DevDay
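To make that JSON mode concrete, here is a minimal sketch using the OpenAI Python SDK (v1-style client); the model name, prompts, and key handling are placeholder choices, and note that JSON mode expects the word "JSON" to appear somewhere in the messages:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# response_format constrains the model to emit syntactically valid JSON.
resp = client.chat.completions.create(
    model="gpt-4-1106-preview",  # the GPT-4 Turbo preview from DevDay
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "You are a helpful assistant. Reply in JSON."},
        {"role": "user", "content": "List three primary colors under the key 'colors'."},
    ],
)
print(resp.choices[0].message.content)  # valid JSON, safe to json.loads
```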
In most RLHF done to date, the entire response from a language model gets a single associated score. To anyone with an RL background, this is disappointing, because it limits the ability of RL methods to make connections about the value of each sub-component of the text. People have pointed to futures where this multi-step optimization comes at...
Nathan Lambert • The Q* hypothesis: Tree-of-thoughts reasoning, process reward models, and supercharging synthetic data
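A minimal sketch of the contrast being drawn here, with a hypothetical grade() standing in for any learned reward model (nothing below comes from a real library):

```python
def grade(text: str) -> float:
    """Hypothetical stand-in for a learned reward/preference model."""
    return 1.0 if "therefore" in text else 0.5

def outcome_rewards(response_tokens: list[str]) -> list[float]:
    # Standard RLHF: one scalar for the whole response, credited only at
    # the final token; intermediate tokens get no direct signal.
    score = grade(" ".join(response_tokens))
    return [0.0] * (len(response_tokens) - 1) + [score]

def process_rewards(reasoning_steps: list[str]) -> list[float]:
    # Process-reward style: each intermediate step is scored on its own,
    # so the optimizer can credit or blame sub-components of the text.
    return [grade(step) for step in reasoning_steps]

print(outcome_rewards("the sky is blue therefore it is day".split()))
print(process_rewards(["the sky is blue", "therefore it is day"]))
```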
One interesting thing about LLMs is that they can actually recover (and without error loops). You can have a step that doesn't work right, and a later step can use its common-sense knowledge to ignore some of the missing results, conflicting information, etc. One of the problems with developing with LLMs is that the machine will often cover up...
Ask HN: What are some actual use cases of AI Agents right now? | Hacker News
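A toy illustration of that recovery pattern; every function here is hypothetical, with call_llm standing in for any chat-completion call:

```python
def call_llm(prompt: str) -> str:
    """Placeholder for a real chat-completion call."""
    return f"[model response to: {prompt[:50]}...]"

def flaky_extract(doc: str) -> str:
    # An earlier LLM step that returns partial, malformed output
    # (truncated JSON) instead of failing cleanly.
    return '{"company": "Acme", "revenue": null'

def summarize(doc: str) -> str:
    extraction = flaky_extract(doc)
    # The later step is told to tolerate gaps rather than error out; the
    # model's common-sense knowledge works around the broken input.
    prompt = (
        "Summarize the company facts below. Fields may be missing, "
        "malformed, or contradictory; use what is usable and note gaps.\n\n"
        + extraction
    )
    return call_llm(prompt)

print(summarize("...original document..."))
```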
In the simplest form, we can use the model’s detection confidence to determine a score. But even here there are quite a few options to choose from (see the sketch below):
- Lowest confidence - the score is the lowest confidence of all detected objects
- Average confidence - average of all confidences of detected objects
- Minimizing confidence delta - the difference between the two highest confidences
Active Learning with Domain Experts, a Case Study in Machine Learning
What data to label?
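A minimal sketch of the three scoring options from the list above (the function names, and the margin reading of the third bullet, are my assumptions; lower scores mean "more worth labeling"):

```python
def lowest_confidence(confs: list[float]) -> float:
    # Score an image by its least certain detection.
    return min(confs)

def average_confidence(confs: list[float]) -> float:
    # Score an image by the mean confidence across its detections.
    return sum(confs) / len(confs)

def confidence_delta(confs: list[float]) -> float:
    # Margin between the two highest confidences; a small delta means the
    # model is torn between candidates, so the image is worth labeling.
    top, second = sorted(confs, reverse=True)[:2]
    return top - second

confs = [0.91, 0.42, 0.87]  # per-object confidences for one image
print(lowest_confidence(confs), average_confidence(confs), confidence_delta(confs))
```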
One thing that is still confusing to me is that we've been building products with machine learning pretty heavily for a decade now, and somehow we abandoned all that we have learned about the process now that we're building "AI".
The biggest thing any ML practitioner realizes when they step out of a research setting is that for most tasks accuracy has...
Ask HN: What are some actual use cases of AI Agents right now? | Hacker News
You are assuming that the probability of failure is independent, which couldn't be further from the truth. If a digit recogniser can recognise one of your "hard" handwritten digits, such as a 4 or a 9, it will likely be able to recognise all of them.
The same happens with AI agents. They are not good at some tasks, but really, really good at others.
Overview
Loki is our open-source solution designed to automate the process of verifying factuality. It provides a comprehensive pipeline for dissecting long texts into individual claims, assessing their worthiness for verification, generating queries for evidence search, crawling for evidence, and ultimately verifying the claims. This tool is...
Libr-AI • GitHub - Libr-AI/OpenFactVerification: Open-source solution designed to automate the process of verifying factuality
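The stages that overview describes are easy to make concrete. The skeleton below is a hypothetical sketch of that flow, not the actual OpenFactVerification API; every name is made up, with trivial stubs standing in for the LLM and retrieval components:

```python
def decompose_into_claims(text: str) -> list[str]:
    # Real pipeline: an LLM splits long text into atomic claims.
    return [s.strip() for s in text.split(".") if s.strip()]

def is_checkworthy(claim: str) -> bool:
    # Real pipeline: a classifier filters opinions and chit-chat from facts.
    return len(claim.split()) > 3

def generate_queries(claim: str) -> list[str]:
    # Real pipeline: LLM-written search queries; here, just the claim itself.
    return [claim]

def crawl_evidence(queries: list[str]) -> list[str]:
    # Real pipeline: web search plus scraping; stubbed out here.
    return []

def verify(claim: str, evidence: list[str]) -> str:
    # Real pipeline: an LLM judges the claim against the evidence.
    return "supported" if evidence else "not enough info"

def fact_check(text: str) -> list[dict]:
    results = []
    for claim in decompose_into_claims(text):
        if not is_checkworthy(claim):
            continue
        evidence = crawl_evidence(generate_queries(claim))
        results.append({"claim": claim, "verdict": verify(claim, evidence)})
    return results

print(fact_check("The Eiffel Tower is in Berlin. Nice day."))
```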
How can we make interacting with conversational models feel more natural?
Every conversational interface to a language model adopts the same pattern:
A chat history sidebar, with each conversation lasting just a few turns
New sessions always begin in a brand-new thread
Every user query must always elicit exactly one response
None of these assumptions...
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
⚡ LitGPT
Pretrain, finetune, evaluate, and deploy 20+ LLMs on your own data
Uses the latest state-of-the-art techniques:
✅ flash attention ✅ fp4/8/16/32 ✅ LoRA, QLoRA, Adapter (v1, v2) ✅ FSDP ✅ 1-1000+ GPUs/TPUs
Lightning AI • Models • Quick start • Inference • Finetune • Pretrain • Deploy • Features • Training recipes (YAML)
Finetune, pretrain and...
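As a taste of the quick start, something along these lines should work with LitGPT's Python API (treat this as a sketch; entry points and parameters vary across versions, so check the repo's README):

```python
from litgpt import LLM  # pip install 'litgpt[all]'

# Load one of the 20+ supported checkpoints and generate text.
llm = LLM.load("microsoft/phi-2")
text = llm.generate("What do Llamas eat?", max_new_tokens=50)
print(text)
```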