LLMs
Matei Zaharia, Omar Khattab, Lingjiao Chen, et al. • The Shift From Models to Compound AI Systems
Unlike consumers, enterprises want control over how their data is used and shared with companies, including the providers of AI software. Enterprises have spent a lot of effort in consolidating data from different sources and bringing them in-house (this article Partner integrations + System of Intelligence: Today’s deepest Moat by fellow Medium...
AI Startup Trends: Insights from Y Combinator’s Latest Batch
MLServer aims to provide an easy way to start serving your machine learning models through a REST and gRPC interface, fully compliant with KFServing's V2 Dataplane spec. Watch a quick video introducing the project here.
- Multi-model serving, letting users run multiple models within the same process.
- Ability to run inference in parallel for vertical
GitHub - SeldonIO/MLServer: An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more
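As a rough illustration of the REST interface described in the MLServer excerpt above, the sketch below builds a request body in the shape defined by the V2 inference protocol (an `inputs` list whose entries carry `name`, `shape`, `datatype`, and `data`). The model name `my-model`, the input name `input-0`, the base URL, and both helper function names are hypothetical and chosen only for illustration. The code only constructs the payload and URL and does not contact a server.

```python
import json


def build_v2_infer_request(input_name, rows, datatype="FP32"):
    """Build a V2-style inference request body from equal-length rows."""
    if not rows or any(len(row) != len(rows[0]) for row in rows):
        raise ValueError("rows must be a non-empty list of equal-length lists")
    # The protocol accepts tensor data flattened in row-major order.
    flat = [value for row in rows for value in row]
    return {
        "inputs": [
            {
                "name": input_name,  # hypothetical input tensor name
                "shape": [len(rows), len(rows[0])],
                "datatype": datatype,
                "data": flat,
            }
        ]
    }


def infer_url(base_url, model_name):
    """Return the V2 inference endpoint for a given model."""
    return f"{base_url.rstrip('/')}/v2/models/{model_name}/infer"


# Assumed local server address and model name, for illustration only.
url = infer_url("http://localhost:8080", "my-model")
payload = build_v2_infer_request("input-0", [[1.0, 2.0], [3.0, 4.0]])
print(url)
print(json.dumps(payload))
```

Sending `payload` as JSON in a POST to `url` with any HTTP client would complete the request. The input name and datatype a real model expects depend on how that model is deployed.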
To train LLMs, you need data that is:
Large — Sufficiently large LMs require trillions of tokens.
Clean — Noisy data reduces performance.
Diverse — Data should come from different sources and different knowledge bases.
What does clean data look like?
You can de-duplicate data with simple heuristics. The most basic would be removing any exact duplicates...
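A minimal sketch of the exact-duplicate heuristic mentioned above, assuming the corpus fits in memory as a list of strings. The light normalization step (lowercasing and collapsing whitespace) is an added assumption that goes slightly beyond strict exact matching, and the function names are hypothetical.

```python
import hashlib


def normalize(text):
    """Lowercase and collapse whitespace (an assumed normalization step)."""
    return " ".join(text.lower().split())


def drop_exact_duplicates(docs):
    """Keep the first occurrence of each document, comparing by content hash."""
    seen = set()
    unique = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


docs = ["Hello world", "hello   world", "Something else", "Hello world"]
deduped = drop_exact_duplicates(docs)
print(deduped)
```

Storing hashes rather than full documents keeps the memory cost per document constant. Near-duplicates (for example, pages that differ by one sentence) are not caught by this approach.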
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
Setting up the necessary machine learning infrastructure to run these big models is another challenge. We need a dedicated model server for running model inference (using frameworks like Triton or vLLM), powerful GPUs to run everything robustly, and configurability in our servers to make sure they're high throughput and low latency. Tuning the...
Developing Rapidly with Generative AI
API wrappers, general-purpose AI tools and third-party AI tools for big platforms.
API wrappers have a weak moat.
General AI tools try to be the jack-of-all-trades.
Big platforms will eat up small apps by adding similar AI features themselves.
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
When it comes to identifying where generative AI can make an impact, we dig into challenges that commonly:
- Involve analysis, interpretation, or review of unstructured content (e.g. text) at scale
- Require massive scaling that may be otherwise prohibitive due to limited resources
- Would be challenging for rules-based or traditional ML approaches
Developing Rapidly with Generative AI
A new v0.4.0 release of lm-evaluation-harness is available!
New updates and features include:
- Internal refactoring
- Config-based task creation and configuration
- Easier import and sharing of externally-defined task config YAMLs
- Support for Jinja2 prompt design, easy modification of prompts + prompt imports from Promptsource
- More advanced configuration