LLMs
To do this, we employ a technique known as AI-assisted evaluation, alongside traditional metrics for measuring performance. This helps us pick the prompts that lead to better quality outputs, making the end product more appealing to users. AI-assisted evaluation uses best-in-class LLMs (like GPT-4) to automatically critique how well the AI's... See more
Developing Rapidly with Generative AI
Ensuring availability during peak traffic by maintaining all GPU instance types could lead to prohibitively high costs. To avoid the financial strain of idle instances, we implemented a “standby instances” mechanism. Rather than preparing for the maximum potential load, we maintained a calculated number of standby instances that match the... See more
Sean Sheng • Scaling AI Models Like You Mean It
Fine-Tuning for LLM Research by AI Hero
This repo contains the code that will be run inside the container. Alternatively, this code can also be run natively. The container is built and pushed to the repo using Github actions (see below). You can launch the fine tuning job using the examples in the https://github.com/ai-hero/llm-research-examples... See more
This repo contains the code that will be run inside the container. Alternatively, this code can also be run natively. The container is built and pushed to the repo using Github actions (see below). You can launch the fine tuning job using the examples in the https://github.com/ai-hero/llm-research-examples... See more
GitHub - ai-hero/llm-research-fine-tuning
The way that most RLHF is done to date has the entire response from a language model get an associated score. To anyone with an RL background, this is disappointing, because it limits the ability for RL methods to make connections about the value of each sub-component of text. Futures have been pointed to where this multi-step optimization comes at... See more
Nathan Lambert • The Q* hypothesis: Tree-of-thoughts reasoning, process reward models, and supercharging synthetic data
However, a key risk with several of these startups is the potential lack of a long-term moat. It is difficult to read too much into it given the stage of these startups and the limited public information available but it’s not difficult to poke holes at their long term defensibility. For example:
- If a startup is built on the premise of taking base
Viggy Balagopalakrishnan • AI Startup Trends: Insights from Y Combinator’s Latest Batch
GPT-4 Turbo performs better than our previous models on tasks that require the careful following of instructions, such as generating specific formats (e.g., “always respond in XML”). It also supports our new JSON mode, which ensures the model will respond with valid JSON. The new API parameter response_format enables the model to constrain its... See more
New models and developer products announced at DevDay
My $0.02 is that a lot of the future research/work there will be figuring out how to identify effective sub-graphs to provide additional context, to avoid having to pass in the entire graph. As well as trying to identify ontology-less structures in real-time, which includes NER and RE, as well as named entity/relationship... See more
r/MachineLearning - Reddit
GPT-4 Turbo can accept images as inputs in the Chat Completions API, enabling use cases such as generating captions, analyzing real world images in detail, and reading documents with figures. For example, BeMyEyes uses this technology to help people who are blind or have low vision with daily tasks like identifying a product or navigating a store.... See more
New models and developer products announced at DevDay
The multiple cantilevered AI overhangs:
Compute overhang. We have much more compute than we are using. Scale can go much further.
Idea overhang. There are many obvious research ideas and combinations of ideas that haven’t been tried in earnest yet.
Capability overhang. Even if we stopped all research now, it would take ten years to digest the new... See more
Compute overhang. We have much more compute than we are using. Scale can go much further.
Idea overhang. There are many obvious research ideas and combinations of ideas that haven’t been tried in earnest yet.
Capability overhang. Even if we stopped all research now, it would take ten years to digest the new... See more