LLMs
However, a key risk with several of these startups is the potential lack of a long-term moat. It is difficult to read too much into it given the stage of these startups and the limited public information available but it’s not difficult to poke holes at their long term defensibility. For example:
- If a startup is built on the premise of taking base
AI Startup Trends: Insights from Y Combinator’s Latest Batch
My $0.02 is that a lot of the future research/work there will be figuring out how to identify effective sub-graphs to provide additional context, to avoid having to pass in the entire graph. As well as trying to identify ontology-less structures in real-time, which includes NER and RE, as well as named entity/relationship... See more
r/MachineLearning - Reddit
Langfuse is an open source observability & analytics solution for LLM-based applications. It is mostly geared towards production usage but some users also use it for local development of their LLM applications.
Langfuse is focused on applications built on top of LLMs. Many new abstractions and common best practices evolved recently, e.g. agents,... See more
Langfuse is focused on applications built on top of LLMs. Many new abstractions and common best practices evolved recently, e.g. agents,... See more
langfuse • GitHub - langfuse/langfuse: Open source observability and analytics for LLM applications
Setting up the necessary machine learning infrastructure to run these big models is another challenge. We need a dedicated model server for running model inference (using frameworks like Triton oder vLLM), powerful GPUs to run everything robustly, and configurability in our servers to make sure they're high throughput and low latency. Tuning the... See more
Developing Rapidly with Generative AI
The Gemini API context caching feature is designed to reduce the cost of requests that contain repeat content with high input token counts.
When to use context caching
Context caching is particularly well suited to scenarios where a substantial initial context is referenced repeatedly by shorter requests. Consider using context caching for use cases... See more
When to use context caching
Context caching is particularly well suited to scenarios where a substantial initial context is referenced repeatedly by shorter requests. Consider using context caching for use cases... See more
Context caching guide | Google AI for Developers | Google for Developers
In addition to using our built-in capabilities, you can also define custom actions by making one or more APIs available to the GPT. Like plugins, actions allow GPTs to integrate external data or interact with the real-world. Connect GPTs to databases, plug them into emails, or make them your shopping assistant. For example, you could integrate a... See more
Introducing GPTs
OpenAI is treating its new marketplace seriously now: The brand new GPT store will come with REVENUE SHARING.... (missing in the Plugins launch)
and launching a Stateful Assistants API:
- Persistent Threads (/api/openai/threads)
- Built in Retrieval (chunking etc done for you)
- Code Interpreter (RIP Adv Data Analysis?)
- Speech to Text and Text to... See more
and launching a Stateful Assistants API:
- Persistent Threads (/api/openai/threads)
- Built in Retrieval (chunking etc done for you)
- Code Interpreter (RIP Adv Data Analysis?)
- Speech to Text and Text to... See more
swyx • Tweet
core components of Deep RL that enabled success like AlphaGo: self-play and look-ahead planning.
Self-play is the idea that an agent can improve its gameplay by playing against slightly different versions of itself because it’ll progressively encounter more challenging situations. In the space of LLMs, it is almost certain that the largest portion... See more
Self-play is the idea that an agent can improve its gameplay by playing against slightly different versions of itself because it’ll progressively encounter more challenging situations. In the space of LLMs, it is almost certain that the largest portion... See more
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
These two components might be some of the most important ideas to improve all of AI.
GPT-4 Turbo can accept images as inputs in the Chat Completions API, enabling use cases such as generating captions, analyzing real world images in detail, and reading documents with figures. For example, BeMyEyes uses this technology to help people who are blind or have low vision with daily tasks like identifying a product or navigating a store.... See more