GitHub - OthersideAI/self-operating-computer: A framework to enable multimodal models to operate a computer.
🤖 Cutting-edge framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
- Why CrewAI
- Getting Started
- Key Features
- Examples
- Local Open Source Models
- CrewAI x AutoGen x ChatDev
- Contribution
- 💬 CrewAI Discord Community
- Hire Consulting
- Licen
joaomdmoura • GitHub - joaomdmoura/crewAI: Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
Nicolay Gerold added
multimodal-maestro
👋 hello
Multimodal-Maestro gives you more control over large multimodal models to get the outputs you want. With more effective prompting tactics, you can get multimodal models to do tasks you didn't know (or think!) were possible. Curious how it works? Try our HF space!
roboflow • GitHub - roboflow/multimodal-maestro: Effective prompting for Large Multimodal Models like GPT-4 Vision, LLaVA or CogVLM. 🔥
Nicolay Gerold added
Abhishek Sivaraman and added
First, here is a generalized framework for an autonomous agent:
- Initialize Goal: Define the objective for the AI.
- Task Creation: The AI checks its memory for the last X tasks completed (if any), and then uses its objective, and the context of its recently completed tasks, to generate a list of new tasks.
- Task Execution: The AI executes the ta
Matt Schlicht • The Complete Beginners Guide To Autonomous Agents
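The goal, task-creation, and task-execution steps quoted above can be sketched as a small loop. This is an illustration only, not code from the guide: `run_agent`, `create_tasks`, `execute_task`, `memory_window`, and `max_rounds` are hypothetical names, and the two demo functions are plain stand-ins for what would be model calls in a real agent.

```python
from typing import Callable, List, Tuple

Completed = Tuple[str, str]  # (task, result)


def run_agent(
    objective: str,
    create_tasks: Callable[[str, List[Completed]], List[str]],
    execute_task: Callable[[str], str],
    memory_window: int = 3,   # the "last X tasks" the agent looks back on
    max_rounds: int = 5,      # safety cap so the loop always terminates
) -> List[Completed]:
    """Run the loop: recall recent tasks, create new tasks, execute them."""
    completed: List[Completed] = []
    for _ in range(max_rounds):
        # Task creation: look at the most recent completed tasks.
        recent = completed[-memory_window:]
        new_tasks = create_tasks(objective, recent)
        if not new_tasks:
            break  # nothing left to do for this objective
        # Task execution: run each new task and remember the result.
        for task in new_tasks:
            completed.append((task, execute_task(task)))
    return completed


# Hypothetical stand-ins for model calls, so the sketch runs on its own.
PLAN = ["outline", "draft", "review"]


def demo_create_tasks(objective: str, recent: List[Completed]) -> List[str]:
    """Propose the next step of a fixed plan based on the last finished task."""
    if not recent:
        return [PLAN[0]]
    next_index = PLAN.index(recent[-1][0]) + 1
    return PLAN[next_index:next_index + 1]


def demo_execute_task(task: str) -> str:
    return f"done: {task}"


history = run_agent("write a blog post", demo_create_tasks, demo_execute_task)
for task, result in history:
    print(task, "->", result)
```

The `max_rounds` cap is a design choice of this sketch: an agent that keeps generating its own tasks needs some explicit stopping condition.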
Darren LI added
emerging of LLMs not as a chatbot, but the kernel process of a new Operating System. E.g. today it orchestrates:
- Input & Output across modalities (text, audio, vision)
- Code interpreter, ability to write & run programs
- Browser / internet access
- Embeddings database for files and internal memory storage & retrieval
Andrej Karpathy • Tweet
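One way to picture the "kernel process" idea in the tweet is a dispatcher that routes each request to a registered capability. The sketch below is an assumption-laden toy, not anything from the tweet: the `Kernel` class, the tool names, and `keyword_router` are hypothetical, and a simple keyword rule stands in for the LLM that would make the routing decision.

```python
from typing import Callable, Dict, List

Tool = Callable[[str], str]
Router = Callable[[str, List[str]], str]


class Kernel:
    """Toy 'kernel' that routes requests to registered tools."""

    def __init__(self, choose_tool: Router) -> None:
        self.choose_tool = choose_tool  # stand-in for the LLM's decision
        self.tools: Dict[str, Tool] = {}

    def register(self, name: str, handler: Tool) -> None:
        self.tools[name] = handler

    def handle(self, request: str) -> str:
        name = self.choose_tool(request, sorted(self.tools))
        if name not in self.tools:
            raise KeyError(f"router picked an unknown tool: {name}")
        return self.tools[name](request)


def keyword_router(request: str, tool_names: List[str]) -> str:
    """Hypothetical routing rule; a real system would ask a model instead."""
    if request.startswith("run:"):
        return "code_interpreter"
    if request.startswith("fetch:"):
        return "browser"
    return "memory"


kernel = Kernel(keyword_router)
# Stub tools that only label the request; real ones would run code,
# fetch pages, or query an embeddings store.
for tool_name in ("code_interpreter", "browser", "memory"):
    kernel.register(tool_name, lambda req, tool_name=tool_name: f"[{tool_name}] {req}")

print(kernel.handle("run: print(1)"))
print(kernel.handle("fetch: example.com"))
print(kernel.handle("what did I save?"))
```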
Darren LI added
The two most important problems (at least how I am thinking about them currently), are:
Finding a rigorous scientific framework for how different agent skills, personalities, and instructions combine to be most capable for different problems (think of this as social management science for AI agents).
Figuring out how you formally validate and verify...
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
Nicolay Gerold added