LLMs
Google DeepMind used a similar idea to make LLMs faster in "Accelerating Large Language Model Decoding with Speculative Sampling". Their algorithm uses a smaller draft model to make initial guesses and a larger primary model to validate them. If the draft often guesses right, decoding needs fewer passes through the large model, which reduces latency.
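The draft-then-validate loop described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `toy_target` and `toy_draft` are hypothetical stand-ins over a three-token vocabulary, and the target model is called once per position here, whereas a real system would score all drafted positions in one batched pass.

```python
import random


def sample(probs, rng):
    """Draw a token index from a discrete distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def speculative_step(target, draft, context, k, rng):
    """Draft k tokens, then validate them with the target model.

    Returns between 1 and k + 1 new tokens.
    """
    # 1. The small draft model guesses k tokens autoregressively.
    proposed, draft_dists = [], []
    ctx = list(context)
    for _ in range(k):
        q = draft(ctx)
        tok = sample(q, rng)
        proposed.append(tok)
        draft_dists.append(q)
        ctx.append(tok)

    # 2. The large model checks each guess in order.
    accepted = []
    ctx = list(context)
    for tok, q in zip(proposed, draft_dists):
        p = target(ctx)
        # Accept the guess with probability min(1, p(tok) / q(tok)).
        if rng.random() < min(1.0, p[tok] / q[tok]):
            accepted.append(tok)
            ctx.append(tok)
        else:
            # Rejected: resample from the leftover mass max(0, p - q) and stop.
            residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
            accepted.append(sample(residual, rng))
            return accepted

    # 3. Every guess was accepted: take one extra token from the target.
    accepted.append(sample(target(ctx), rng))
    return accepted


def toy_target(ctx):
    """Hypothetical large model: strongly prefers repeating the last token."""
    probs = [0.1, 0.1, 0.1]
    probs[ctx[-1] if ctx else 0] = 0.8
    return probs


def toy_draft(ctx):
    """Hypothetical small model: same preference, but less confident."""
    probs = [0.2, 0.2, 0.2]
    probs[ctx[-1] if ctx else 0] = 0.6
    return probs


new_tokens = speculative_step(toy_target, toy_draft, context=[0], k=4,
                              rng=random.Random(0))
```

The closer the draft's distribution is to the target's, the more guesses survive step 2, and the more tokens each pass through the large model yields.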
There are some people speculating...
muhtasham • Machine Learners Guide to Real World - 2️⃣ Concepts from Operating Systems That Found Their Way in LLMs
Principles for growable tools
There are three critical pieces to building a tool that can grow around its users over time.
- Design around play. Sometimes I call this design around experimentation. Using the tool for day-to-day work should involve playing and experimenting with what’s possible with the tool. Whether that’s writing small programs to
Beyond customization: build tools that grow with us | thesephist.com
Disruptive innovation comes in two flavors: (1) New-market disruption, where the company creates and claims a new segment in an existing market by catering to an underserved customer base, or (2) Low-end disruption, in which a company uses a low-cost business model to enter at the bottom of an existing market and claim a segment.
Copilots don’t...
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
Fine-Tuning for LLM Research by AI Hero
This repo contains the code that will be run inside the container. Alternatively, this code can also be run natively. The container is built and pushed to the repo using GitHub Actions (see below). You can launch the fine-tuning job using the examples in the https://github.com/ai-hero/llm-research-examples...
GitHub - ai-hero/llm-research-fine-tuning
- Mistral AI offers a promising alternative to the GPT-3.5 model when combined with prompt engineering.
- Mistral AI suits workloads that need high volume and fast processing at very low cost.
- Mistral AI can serve as a pre-filter in front of GPT-4 to reduce cost, for example by filtering down search results.
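The pre-filtering idea in the last bullet can be sketched as a two-stage cascade. This is an illustration under stated assumptions, not either vendor's API: `cheap_score` and `expensive_rank` are hypothetical callables standing in for a call to a small model and a large model, and `overlap_score` and `rank_by_length` are toy placeholders so the sketch runs without network access.

```python
def cascade(query, candidates, cheap_score, expensive_rank, keep=3):
    """Shortlist with a cheap scorer, then hand only the shortlist to the costly one."""
    # Stage 1: the cheap model scores every candidate; keep the top `keep`.
    shortlist = sorted(candidates,
                       key=lambda doc: cheap_score(query, doc),
                       reverse=True)[:keep]
    # Stage 2: the expensive model only ever sees the shortlist.
    return expensive_rank(query, shortlist)


def overlap_score(query, doc):
    """Toy stand-in for the cheap model: count of shared words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def rank_by_length(query, docs):
    """Toy stand-in for the expensive model: an arbitrary final ordering."""
    return sorted(docs, key=len)


search_results = [
    "mistral 7b pricing",
    "gpt-4 pricing table",
    "cat pictures",
    "weather today",
    "llm pricing comparison",
]
final = cascade("llm pricing", search_results, overlap_score, rank_by_length)
```

The saving comes from stage 2 processing `keep` items instead of the full candidate list; how small `keep` can be without hurting quality depends on how well the cheap scorer agrees with the expensive one.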
Mistral 7B is 187x cheaper compared to GPT-4
Setting up the necessary machine learning infrastructure to run these big models is another challenge. We need a dedicated model server for running model inference (using frameworks like Triton or vLLM), powerful GPUs to run everything robustly, and configurability in our servers to make sure they're high throughput and low latency. Tuning the...
Developing Rapidly with Generative AI
What is Substrate?
Substrate is an AI inference platform. In particular, it excels at enabling complex multi-model workloads. At its core, Substrate is 1) a collection of cutting-edge AI models, tuned for optimum performance, and 2) a set of composable APIs for relating these models to each other. We believe having both of these components in one...
Nextra: the next docs builder
Humans are bad at coming up with search queries. Humans are good at incrementally narrowing down options with a series of filters, and pointing where they want to go next. This seems obvious, but we keep building interfaces for finding information that look more like Google Search and less like a map.
All information tools have to give users some...