LLMs

GitHub - SeldonIO/MLServer: An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more

Understanding the Cost of Generative AI Models in Production

Microsoft DeepSpeed-FastGen

Shortwave [Gmail alternative]

Ask HN: What are some actual use cases of AI Agents right now? | Hacker News

Developing Rapidly with Generative AI

Sean Sheng: Scaling AI Models Like You Mean It

New models and developer products announced at DevDay