LLMs
Here's my read on the situation:
* The TAM is massive; so many businesses are still trying to figure out AI
* If you do deployments, you’ll need to spend a lot of time hand-holding clients through scoping projects (not unlike other dev work), since the material is so new
* Lots of opportunity in education
* The hard part isn’t the expertise, it’s distribution...
Greg Kamradt • Tweet
Google DeepMind used a similar idea to make LLMs faster in "Accelerating Large Language Model Decoding with Speculative Sampling". Their algorithm uses a smaller draft model to make initial guesses and a larger primary model to validate them. If the draft often guesses right, decoding becomes faster, reducing latency.
There are some people speculating...
muhtasham • Machine Learners Guide to Real World - 2️⃣ Concepts from Operating Systems That Found Their Way in LLMs
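A minimal sketch of the draft-then-validate loop described in the clip above, assuming toy models over a three-token vocabulary. The names `speculative_step`, `toy_target`, and `toy_draft` are hypothetical, and the accept/resample rule shown is the standard one for speculative sampling rather than code taken from the paper.

```python
import random

def sample(probs, rng):
    """Draw an index from a discrete distribution given as a list of weights."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def speculative_step(target, draft, prefix, k, rng):
    """Propose k tokens with `draft`, then accept or reject them using `target`.

    `target` and `draft` are hypothetical callables that map a token list to a
    next-token probability list. Returns the tokens to append to `prefix`.
    """
    # 1. The small draft model guesses k tokens one after another.
    proposed, draft_probs = [], []
    ctx = list(prefix)
    for _ in range(k):
        q = draft(ctx)
        tok = sample(q, rng)
        proposed.append(tok)
        draft_probs.append(q)
        ctx.append(tok)

    # 2. The large model scores each guess (a single parallel pass in a real system).
    accepted = []
    ctx = list(prefix)
    for tok, q in zip(proposed, draft_probs):
        p = target(ctx)
        # Accept the guess with probability min(1, p[tok] / q[tok]).
        if rng.random() < min(1.0, p[tok] / q[tok]):
            accepted.append(tok)
            ctx.append(tok)
            continue
        # On rejection, resample from the residual max(0, p - q) and stop.
        residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
        accepted.append(sample(residual, rng))
        return accepted

    # 3. Every guess was accepted: take one extra token from the large model.
    accepted.append(sample(target(ctx), rng))
    return accepted

# Toy, context-independent distributions (assumptions for illustration only).
def toy_target(ctx):
    return [0.6, 0.3, 0.1]

def toy_draft(ctx):
    return [0.5, 0.4, 0.1]

rng = random.Random(0)
tokens = speculative_step(toy_target, toy_draft, prefix=[], k=4, rng=rng)
print(tokens)  # between 1 and k + 1 tokens per step
```

The more often the draft agrees with the target, the more tokens each step yields per large-model pass, which is where the latency saving comes from.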
Principles for growable tools
There are three critical pieces to building a tool that can grow around its users over time.
- Design around play. Sometimes I call this design around experimentation. Using the tool for day-to-day work should involve playing and experimenting with what’s possible with the tool. Whether that’s writing small programs to
Beyond customization: build tools that grow with us | thesephist.com
I’ve been giving talks and speaking with engineers and non-technical audiences about interpretability since 2022, and I still struggle to explain exactly what a “feature” is. I often use words like “concept” or “style”, or establish metaphors to debugging programs or making fMRI scans of brains. Both metaphors help people outside of the subfield...
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
One thing that is still confusing to me is that we've been building products with machine learning pretty heavily for a decade now, and yet we somehow abandoned all that we have learned about the process now that we're building "AI".
The biggest thing any ML practitioner realizes when they step out of a research setting is that for most tasks accuracy has...
Ask HN: What are some actual use cases of AI Agents right now? | Hacker News
You are assuming that the probability of failure is independent, which couldn't be further from the truth. If a digit recogniser can recognise one of your "hard" handwritten digits, such as a 4 or a 9, it will likely be able to recognise all of them.
The same happens with AI agents. They are not good at some tasks, but really, really good at others.
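To put rough numbers on that point, here is a small sketch comparing the two views of failure. All figures are hypothetical: it assumes 90% of tasks are a kind the agent handles reliably and 10% are a kind it cannot do, so the average per-step success rate is 0.9 in both views.

```python
def task_success_independent(step_success, n_steps):
    """Success rate of an n-step task if every step fails independently."""
    return step_success ** n_steps

def task_success_correlated(easy_share, easy_step, hard_step, n_steps):
    """Success rate when tasks split into an easy kind and a hard kind.

    Within one task all steps share the same per-step success rate, so
    failures cluster on the hard tasks instead of being spread evenly.
    """
    easy = easy_share * easy_step ** n_steps
    hard = (1 - easy_share) * hard_step ** n_steps
    return easy + hard

# Hypothetical numbers, chosen only to make the contrast visible.
easy_share, easy_step, hard_step, n_steps = 0.9, 1.0, 0.0, 10

# Same average per-step success rate under both assumptions.
average_step = easy_share * easy_step + (1 - easy_share) * hard_step

print(task_success_independent(average_step, n_steps))
print(task_success_correlated(easy_share, easy_step, hard_step, n_steps))
```

Under the independence assumption a ten-step task succeeds roughly 35% of the time; when failures cluster by task type, it succeeds on about 90% of tasks and fails on the rest.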
In the simplest form, we can use the model’s detection confidence to determine a score. But even here there are quite a few options to choose from:
- Lowest confidence - the score is the lowest confidence of all detected objects
- Average confidence - average of all confidences of detected objects
- Minimizing confidence delta - difference between
Active Learning with Domain Experts, a Case Study in Machine Learning
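The first two scoring options listed above can be sketched as follows. This is an illustration under assumptions, not the article's implementation: the function names and the sample confidences are made up, every image is assumed to have at least one detection, and the third option is left out because its description is cut off in the clip.

```python
def lowest_confidence(confidences):
    """Score an image by its least confident detection."""
    return min(confidences)

def average_confidence(confidences):
    """Score an image by the mean confidence over its detections."""
    return sum(confidences) / len(confidences)

def rank_for_labeling(detections, score_fn):
    """Return image names ordered from least to most confident.

    `detections` maps an image name to the list of detection confidences
    the model produced for it (assumed non-empty).
    """
    return sorted(detections, key=lambda name: score_fn(detections[name]))

# Hypothetical model outputs for three unlabeled images.
detections = {
    "a.jpg": [0.95, 0.90],
    "b.jpg": [0.99, 0.40],
    "c.jpg": [0.60, 0.65],
}

print(rank_for_labeling(detections, lowest_confidence))
print(rank_for_labeling(detections, average_confidence))
```

The two scores can disagree: a single weak detection pushes `b.jpg` to the front under lowest confidence, while `c.jpg` comes first under average confidence because all of its detections are mediocre.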
What data to label?
Study finds RLHF reduces LLM creativity and output variety: A new research paper posted in /r/LocalLLaMA shows that while alignment techniques like RLHF reduce toxic and biased content, they also limit the creativity of large language models, even in contexts unrelated to safety.
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
- Self-play is the idea that an agent can improve its gameplay by playing against slightly different versions of itself because it’ll progressively encounter more challenging situations. In the space of LLMs, it is almost certain that the largest portion of self-play will look like AI Feedback rather than competitive processes.
Nathan Lambert • The Q* hypothesis: Tree-of-thoughts reasoning, process reward models, and supercharging synthetic data
pair-preference-model-LLaMA3-8B by RLHFlow: Really strong reward model, trained to take in two inputs at once, which is the top open reward model on RewardBench (beating one of Cohere’s).
DeepSeek-V2 by deepseek-ai (21B active, 236B total param.): Another strong MoE base model from the DeepSeek team. Some people are questioning the very high MMLU...