LLMs
Take a look at our official page for user documentation and examples: langtest.org
Key Features
- Generate and execute more than 50 distinct types of tests with just 1 line of code
- Test all aspects of model quality: robustness, bias, representation, fairness and accuracy.
- Automatically augment training data based on test results (for select models)
- Sup
GitHub - BrunoScaglione/langtest: Deliver safe & effective language models
Announcing Together Inference Engine – the fastest inference available
November 13, 2023・By Together
The Together Inference Engine is multiple times faster than any other inference service, with 117 tokens per second on Llama-2-70B-Chat and 171 tokens per second on Llama-2-13B-Chat
Today we are announcing Together Inference Engine, the world’s...
Announcing Together Inference Engine – the fastest inference available
What is Substrate?
Substrate is an AI inference platform. In particular, it excels at enabling complex multi-model workloads. At its core, Substrate is 1) a collection of cutting-edge AI models, tuned for optimum performance, and 2) a set of composable APIs for relating these models to each other. We believe having both of these components in one...
Nextra: the next docs builder
The input context size is too small when you want to analyse CSVs with thousands of rows, and embedding doesn't really work because it loses context.
r/LLMDevs - Reddit
When we deliver a model we make sure we don't reach X seconds of latency in our API. Before even going into the performance of LLMs for classification, I can tell you that with the currently available tech they are just infeasible.
LinuxSpinach
^ this. And especially classification as a task, because businesses don’t want to pay llm...
r/MachineLearning - Reddit
We're doing NER on hundreds of millions of documents in a specialised niche. LLMs are terrible for this. Slow, expensive and horrifyingly inaccurate. Even with agents, pydantic parsing and the like. Supervised methods are the way to go. Hell, I'd take an old school rule based approach over LLMs for this.
For the deployment side of things, we found that our training process was quite slow, especially when it gets into these large language models and when you train from scratch. MosaicML offers what's called programmatic optimization, which is not so much on the hardware side of things, but rather on the algorithmic side. Can you...
CB Insights • 2024 Tech Trends
Source: CB Insights Report
- Multiple indices. Splitting the document corpus up into multiple indices and then routing queries based on some criteria. This means that the search is over a much smaller set of documents rather than the entire dataset. Again, it is not always useful, but it can be helpful for certain datasets. The same approach works with the LLMs themselves.
Matt Rickard • Improving RAG: Strategies
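The multiple-indices idea above can be illustrated with a minimal sketch in plain Python. Everything here is an assumption made for illustration and does not come from the source: the toy corpus, the keyword-based routing rule, and the names `INDICES`, `ROUTES`, `route`, and `search`. A real system would likely use a vector store per index and a learned or LLM-based router instead of keyword overlap.

```python
from collections import Counter

# Hypothetical toy corpus, split into two indices by topic.
INDICES = {
    "billing": [
        "How to update a credit card on file",
        "Refund policy for annual plans",
    ],
    "engineering": [
        "Rotating API keys without downtime",
        "Debugging webhook delivery failures",
    ],
}

# Hypothetical routing criteria: keywords that send a query to an index.
ROUTES = {
    "billing": {"refund", "invoice", "card", "plan"},
    "engineering": {"api", "webhook", "key", "debugging"},
}


def tokenize(text):
    """Lowercase whitespace tokens with trailing punctuation stripped."""
    return [w.strip(".,?!").lower() for w in text.split()]


def route(query):
    """Return the index whose keywords overlap the query most, or None."""
    words = set(tokenize(query))
    scores = {name: len(words & kws) for name, kws in ROUTES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def search(query, top_k=1):
    """Search only the routed index; fall back to all indices if no route."""
    target = route(query)
    names = [target] if target else list(INDICES)
    q = Counter(tokenize(query))
    scored = []
    for name in names:
        for doc in INDICES[name]:
            # Score = number of shared tokens (a stand-in for real retrieval).
            overlap = sum((q & Counter(tokenize(doc))).values())
            scored.append((overlap, name, doc))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [(name, doc) for _, name, doc in scored[:top_k]]


if __name__ == "__main__":
    print(search("refund for annual plans"))
    print(search("webhook delivery failures"))
```

The point of the design is that `search` scores only the documents in the routed index, so the candidate set shrinks before any ranking happens. The fallback to all indices covers queries the router cannot place.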
A solution is to self-host an open-source or custom fine-tuned LLM. Opting for a self-hosted model can reduce costs dramatically, but at the price of additional development time, maintenance overhead, and possible performance implications. Considering self-hosted solutions requires weighing these trade-offs carefully.
