GitHub - michaelfeil/infinity: Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of sentence-transformer models and frameworks.
Together AI – The AI Acceleration Cloud - Fast Inference, Fine-Tuning & Training
together.ai
Workers AI? It’s another building block that we’re adding to our developer platform - one that helps developers run well-known AI models on serverless GPUs, all on Cloudflare’s trusted global network. As one of the latest additions to our developer platform, it works seamlessly with Workers + Pages, but to make it truly accessible, we’ve made it pl... See more
Phil Wittig • Workers AI: serverless GPU-powered inference on Cloudflare’s global network
Koyeb is a developer-friendly serverless platform designed to let businesses easily deploy reliable and scalable applications globally. The platform has been created by Cloud Computing Veterans and is financially backed by industry leaders.
Koyeb allows you to deploy all kind of services including full web applications, APIs, and background workers.
... See more
Koyeb allows you to deploy all kind of services including full web applications, APIs, and background workers.
... See more
Introduction
Mem0: The Memory Layer for Personalized AI
Mem0 provides a smart, self-improving memory layer for Large Language Models, enabling personalized AI experiences across applications.
Mem0 provides a smart, self-improving memory layer for Large Language Models, enabling personalized AI experiences across applications.
Note: The Mem0 repository now also includes the Embedchain project. We continue to maintain and support Embedchain ❤️. You can find the Embedchain codebase in the embedchai... See more
GitHub - mem0ai/mem0: The memory layer for Personalized AI
Overview
MaxText is a high performance , highly scalable , open-source LLM written in pure Python/Jax and targeting Google Cloud TPUs and GPUs for training and inference . MaxText achieves high MFUs and scales from single host to very large clusters while staying simple and "optimization-free" thanks to the power of Jax and the XLA compiler.
MaxText... See more
MaxText is a high performance , highly scalable , open-source LLM written in pure Python/Jax and targeting Google Cloud TPUs and GPUs for training and inference . MaxText achieves high MFUs and scales from single host to very large clusters while staying simple and "optimization-free" thanks to the power of Jax and the XLA compiler.
MaxText... See more