LLMs
- Traditional AI - The most secure, understandable, and performant. However, good implementations of traditional AI require that we define the rules behind the system, which makes it infeasible for many of the use cases that the other two techniques thrive on.
- Supervised Machine Learning - The middle of the road between Traditional AI and Deep Learning. Good when we have
Devansh • How to Pick between Traditional AI, Supervised Machine Learning, and Deep Learning [Thoughts]
Where would I add generative AI? Generative AI has the ease of accessibility of traditional AI, in that people assume it is understandable, but it does not actually have that property. It also has the opaque and costly nature of DL. Many companies are currently rushing to build things with generative AI without any prior foundation in AI and without any processes set up to manage it: data ops, DevOps, …
Traditional AI forces you to think about how something works, understand the system, and then define the rules for it. ML lets you use features and feature importance to shortcut some of that work. Deep Learning allows you to brute-force it. Generative AI allows you to brute-force it without any background in DL.
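To make that contrast concrete, here is a minimal, hypothetical sketch of the same task (flagging spammy messages) solved with hand-written rules versus a supervised model. The keyword list, toy labels, and scikit-learn pipeline are illustrative assumptions, not anything from the quoted article.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Traditional AI: we must understand the domain and encode the rules ourselves.
SPAM_KEYWORDS = {"free", "winner", "claim", "prize"}  # hypothetical rule set

def rule_based_is_spam(message: str) -> bool:
    words = set(message.lower().split())
    return len(words & SPAM_KEYWORDS) >= 2  # rule: two or more spam keywords

# Supervised ML: we supply labeled examples and engineered features (TF-IDF here),
# and the model learns the decision boundary for us.
messages = ["claim your free prize now", "meeting moved to 3pm", "winner! free gift"]
labels = [1, 0, 1]  # 1 = spam, 0 = not spam (toy labels)

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(messages, labels)
print(rule_based_is_spam("claim your free prize now"))  # hand-written decision
print(model.predict(["free prize winner"]))             # learned decision
```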
Self-play is the idea that an agent can improve its gameplay by playing against slightly different versions of itself because it’ll progressively encounter more challenging situations. In the space of LLMs, it is almost certain that the largest portion... See more
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
These two components might be some of the most important ideas to improve all of AI.
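As a rough illustration of the self-play idea, here is a hedged, game-agnostic sketch: an agent trains against a frozen snapshot of itself, and the snapshot is refreshed whenever the current agent starts winning consistently, so it keeps facing harder opponents. Every name here (Agent, play_match, update) is hypothetical scaffolding, not an API from the quoted post.

```python
import copy
import random

class Agent:
    """Toy agent whose 'skill' is a single number; stands in for a real policy."""
    def __init__(self, skill: float = 0.0):
        self.skill = skill

    def update(self, won: bool):
        # Placeholder learning rule: improve slightly after losses, more after wins.
        self.skill += 0.10 if won else 0.02

def play_match(a: Agent, b: Agent) -> bool:
    """Return True if agent `a` wins; higher skill wins more often (Elo-style)."""
    return random.random() < 1 / (1 + pow(10, b.skill - a.skill))

learner = Agent()
opponent = copy.deepcopy(learner)  # frozen snapshot of the learner

wins_vs_snapshot = 0
for step in range(1, 1001):
    won = play_match(learner, opponent)
    learner.update(won)
    wins_vs_snapshot += won
    # When the learner reliably beats its old self, refresh the opponent so it
    # keeps encountering progressively more challenging versions of itself.
    if step % 100 == 0:
        if wins_vs_snapshot > 60:
            opponent = copy.deepcopy(learner)
        wins_vs_snapshot = 0
```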
- Query the RAG anyway and let the LLM itself choose whether to use the RAG context or its built-in knowledge
- Query the RAG but only provide the result to the LLM if it meets some level of relevancy (i.e., embedding distance) to the question
- Run the LLM both on its own and with the RAG response, and use a heuristic (or another LLM) to pick the best answer
r/LocalLLaMA - Reddit
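A minimal sketch of the second option above (only pass retrieved context to the LLM when it clears a relevance threshold). The hashed bag-of-words "embedding", toy corpus, placeholder generate() call, and the 0.75 cutoff are all illustrative assumptions standing in for a real embedding model, vector store, and LLM client.

```python
import numpy as np

# Toy stand-ins for a real embedding model, vector store, and LLM client.
def embed(text: str) -> np.ndarray:
    vec = np.zeros(64)
    for tok in text.lower().split():
        vec[hash(tok) % 64] += 1.0          # hashed bag-of-words "embedding"
    return vec

CORPUS = ["RAG retrieves documents before generation.",
          "Self-play trains agents against copies of themselves."]
DOC_VECS = [(doc, embed(doc)) for doc in CORPUS]

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def retrieve(q_vec: np.ndarray, k: int = 3):
    return sorted(DOC_VECS, key=lambda d: -cosine(q_vec, d[1]))[:k]

def generate(prompt: str) -> str:
    return f"<LLM answer to: {prompt[:60]}...>"  # placeholder for a real LLM call

RELEVANCE_THRESHOLD = 0.75  # arbitrary cutoff for this sketch

def answer(question: str) -> str:
    q_vec = embed(question)
    # Keep only chunks whose embedding is close enough to the question.
    relevant = [doc for doc, vec in retrieve(q_vec)
                if cosine(q_vec, vec) >= RELEVANCE_THRESHOLD]
    if relevant:
        prompt = "Context:\n" + "\n".join(relevant) + f"\n\nQuestion: {question}"
    else:
        prompt = question  # nothing relevant: fall back to built-in knowledge
    return generate(prompt)

print(answer("How does RAG retrieve documents?"))
```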
November 13, 2023・By Together
The Together Inference Engine is multiple times faster than any other inference service, with 117 tokens per second on Llama-2-70B-Chat and 171 tokens per second on Llama-2-13B-Chat.
Today we are announcing Together Inference Engine, the world’s... See more
Announcing Together Inference Engine – the fastest inference available
Table of Contents
- Introduction
- Key LLM Serving Techniques
- Dynamic SplitFuse: A Novel Prompt and Generation Composition Strategy
- Performance Evaluation
- DeepSpeed-FastGen: Implementation and Usage
- Try out DeepSpeed-FastGen
- Acknowledgements
1. Introduction
Large... See more
microsoft • DeepSpeed-FastGen
- LLM in the loop with preference optimization
- synthetic data generation
- cross modality "distillation" / dictionary remapping
- constrained decoding
r/MachineLearning - Reddit
Additional LLM paradigms beyond RAG
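Of the paradigms listed above, constrained decoding is the easiest to show in a few lines: at every generation step the model's logits are masked so that only tokens permitted by some constraint can be sampled. The tiny vocabulary, fake logits, and "yes/no then end" rule below are illustrative assumptions, not details from the thread.

```python
import numpy as np

VOCAB = ["yes", "no", "maybe", "<eos>"]

def fake_lm_logits(prefix: list[str]) -> np.ndarray:
    """Stand-in for a real language model's next-token logits."""
    rng = np.random.default_rng(len(prefix))
    return rng.normal(size=len(VOCAB))

def allowed_tokens(prefix: list[str]) -> set[str]:
    """Constraint: the answer must be a single 'yes' or 'no', then end."""
    return {"yes", "no"} if not prefix else {"<eos>"}

def constrained_decode(max_steps: int = 5) -> list[str]:
    prefix: list[str] = []
    for _ in range(max_steps):
        logits = fake_lm_logits(prefix)
        mask = np.array([tok in allowed_tokens(prefix) for tok in VOCAB])
        logits[~mask] = -np.inf                   # forbid tokens outside the constraint
        next_tok = VOCAB[int(np.argmax(logits))]  # greedy pick among allowed tokens
        if next_tok == "<eos>":
            break
        prefix.append(next_tok)
    return prefix

print(constrained_decode())  # always 'yes' or 'no', never 'maybe'
```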
- Why CrewAI
- Getting Started
- Key Features
- Examples
- Local Open Source Models
- CrewAI x AutoGen x ChatDev
- Contribution
- 💬 CrewAI Discord Community
- Hire Consulting
- License
joaomdmoura • GitHub - joaomdmoura/crewAI: Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
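For context on what "orchestrating role-playing agents" looks like in practice, here is a hedged sketch in the style of CrewAI's documented Agent/Task/Crew pattern; treat the exact constructor arguments and task contents as assumptions based on the README rather than a guaranteed current API.

```python
# Sketch in the style of CrewAI's Agent/Task/Crew pattern; argument names are
# assumptions based on the project's README and may differ across versions.
from crewai import Agent, Task, Crew

researcher = Agent(
    role="Researcher",
    goal="Collect recent findings on LLM serving techniques",
    backstory="You summarize technical material accurately and concisely.",
)
writer = Agent(
    role="Writer",
    goal="Turn research notes into a short readable brief",
    backstory="You write clear explanations for engineers.",
)

research_task = Task(
    description="List three techniques for speeding up LLM inference.",
    expected_output="A bulleted list with one sentence per technique.",
    agent=researcher,
)
writing_task = Task(
    description="Write a 150-word brief based on the research notes.",
    expected_output="A single short paragraph.",
    agent=writer,
)

crew = Crew(agents=[researcher, writer], tasks=[research_task, writing_task])
result = crew.kickoff()  # runs the tasks in order, passing context between agents
print(result)
```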
Is this a good thing or a bad thing? I’m not sure.
A great example of this is frontend... See more
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
Highlights
👉 Top AI use cases are code intelligence, data extraction and workflow... See more