LLMs
One solution is to self-host an open-source or custom fine-tuned LLM. A self-hosted model can reduce costs dramatically, but at the price of additional development time, maintenance overhead, and possible performance implications. Weigh these trade-offs carefully before committing to a self-hosted solution.
Developing Rapidly with Generative AI
“I think a lot of people obviously want to talk about the sexy kind of new consumer applications. I would tell you that I think that the earliest and most significant effect that AI is going to have on our company is actually going to be as it relates to our developer productivity. Some of the tools that we’re seeing are going to allow our devs to...
Adam Huda • The Transformative Power of Generative AI in Software Development: Lessons from Uber's Tech-Wide Hackathon
Document search and synthesis
Scores of organizations want to harness generative AI so employees can easily find the most relevant documents through improved search results and summaries. For example, your organization can reduce the time it takes employees to find answers to common HR- and process-related questions. Internal manuals and sites are...
Donna Schut • The Prompt: Takeaways from hundreds of conversations about generative AI - part 1 | Google Cloud Blog
To train LLMs, you need data that is:
Large — Sufficiently large LMs require trillions of tokens.
Clean — Noisy data reduces performance.
Diverse — Data should come from different sources and different knowledge bases.
What does clean data look like?
You can de-duplicate data with simple heuristics. The most basic would be removing any exact duplicates...
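The exact-duplicate heuristic mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `dedupe_exact` and `_normalize` are hypothetical, and the optional case and whitespace normalization is an addition that goes slightly beyond strict exact matching.

```python
import hashlib


def _normalize(text: str) -> str:
    # Assumption (not from the source): lowercase and collapse whitespace
    # so trivially different copies hash to the same value.
    return " ".join(text.lower().split())


def dedupe_exact(docs, normalize_text=False):
    """Return docs with duplicates removed, keeping the first occurrence."""
    seen = set()
    unique = []
    for doc in docs:
        key_text = _normalize(doc) if normalize_text else doc
        # Hash the text so the seen-set stays small for long documents.
        digest = hashlib.sha256(key_text.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


corpus = ["The cat sat.", "The cat sat.", "the  cat sat."]
strict = dedupe_exact(corpus)
loose = dedupe_exact(corpus, normalize_text=True)
```

With strict matching, only the byte-identical second entry is dropped. With normalization enabled, the third entry is dropped as well.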
Take a look at our official page for user documentation and examples: langtest.org
Key Features
- Generate and execute more than 50 distinct types of tests with just 1 line of code
- Test all aspects of model quality: robustness, bias, representation, fairness and accuracy.
- Automatically augment training data based on test results (for select models)
- Sup
GitHub - BrunoScaglione/langtest: Deliver safe & effective language models
The next-generation command line.
The source of truth for your team’s secrets, scripts, and SSH credentials.
Fig
The way that most RLHF is done to date has the entire response from a language model get an associated score. To anyone with an RL background, this is disappointing, because it limits the ability for RL methods to make connections about the value of each sub-component of text. Futures have been pointed to where this multi-step optimization comes at...
Nathan Lambert • The Q* hypothesis: Tree-of-thoughts reasoning, process reward models, and supercharging synthetic data
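The contrast between a single response-level score and per-step scores can be sketched with a toy example. Everything here is hypothetical: the step labels, the reward values, and the helper `returns_to_go` are illustrative assumptions, not taken from the article or from any RLHF library.

```python
def returns_to_go(rewards, gamma=1.0):
    """Discounted sum of future rewards at each step (standard RL return)."""
    out = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        out.append(running)
    return out[::-1]


# Hypothetical three-step response; the middle step contains the mistake.
steps = ["restate the question", "faulty arithmetic", "final answer"]

# Response-level scoring: one score for the whole response, given at the end.
outcome_rewards = [0.0, 0.0, -1.0]

# Step-level scoring (assumed values): each step receives its own score.
process_rewards = [0.5, -1.0, 0.0]

outcome_signal = returns_to_go(outcome_rewards)
process_signal = returns_to_go(process_rewards)

for step, o, p in zip(steps, outcome_signal, process_signal):
    print(f"{step:22s} outcome={o:+.1f} process={p:+.1f}")
```

With the response-level score, every step sees the same return, so nothing distinguishes the faulty step from the correct ones. With step-level scores, the return is lowest at the faulty step.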
Study finds RLHF reduces LLM creativity and output variety: A new research paper posted in /r/LocalLLaMA shows that while alignment techniques like RLHF reduce toxic and biased content, they also limit the creativity of large language models, even in contexts unrelated to safety.
