Recently, there's been a lot of buzz around AutoGen, especially if you have been following AI newsfeeds and channels. Experts and innovators have earmarked multi-agent frameworks as the potential next leap in the application of AI. So, let's dive into what this is all about.
WHAT ARE AGENTS?
An Agent is essentially a piece of Python code together with its configuration. That configuration typically includes a source LLM (Large Language Model), which could be a general-purpose chat model or one fine-tuned on programming examples, and a prompt.
The prompt provides initial instructions to the Agent, such as:
"You are a project planner. You receive work requests, break them down into small, manageable tasks, monitor the progress of each task, and ensure they are all completed successfully."
Agents can be equipped with various tools. These tools might include the ability to execute shell scripts, run Python code, or perform specific actions like calling a web service that returns structured data. Tools can help overcome limitations like outdated training data by executing live internet searches.
An Agent can receive a message (from a human or another Agent) and then perform work. Agents can operate individually or in groups, utilising various orchestration modes. They send and receive messages, enabling coordination between the Agent, user, and optionally other Agents.
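The description above can be sketched in plain Python. This is a minimal illustration, not AutoGen's API: the `Agent` and `Message` classes, the `word_count` tool, and the "tool_name: argument" convention are all hypothetical, and no real LLM is called.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Message:
    sender: str
    content: str


@dataclass
class Agent:
    """Hypothetical agent: a name, a model label, a prompt, and optional tools."""
    name: str
    model: str  # label only; this sketch never calls a real LLM
    system_prompt: str
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    inbox: List[Message] = field(default_factory=list)

    def receive(self, message: Message) -> Message:
        """Record the incoming message and return a reply message."""
        self.inbox.append(message)
        # Stand-in for the LLM's decision: "tool_name: argument" triggers a tool.
        tool_name, _, argument = message.content.partition(":")
        tool = self.tools.get(tool_name.strip())
        if tool is not None:
            reply = tool(argument.strip())
        else:
            reply = f"[{self.model}] no tool for: {message.content}"
        return Message(sender=self.name, content=reply)


def word_count(text: str) -> str:
    """Toy tool standing in for a shell script, Python run, or web-service call."""
    return str(len(text.split()))


planner = Agent(
    name="planner",
    model="example-model",  # assumed placeholder name
    system_prompt=(
        "You are a project planner. You receive work requests, break them "
        "down into small, manageable tasks, monitor the progress of each "
        "task, and ensure they are all completed successfully."
    ),
    tools={"word_count": word_count},
)

reply = planner.receive(Message(sender="user", content="word_count: break work into tasks"))
print(reply.sender, reply.content)
```

In a real framework the tool choice would come from the model's output rather than from string parsing, but the shape is the same: a message comes in, the Agent optionally calls a tool, and a message goes out.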
WHY IS THIS BENEFICIAL?
Challenges and Limits of LLMs
No single LLM is perfect. While recent chatbots excel at understanding instructions and newer code models reliably produce usable code, no one model excels at every task. This might change over time, but there are inherent technical challenges.
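An LLM can only draw on what sits inside its attention (context) window, so a larger window helps output quality, but large context windows are expensive and slow, and they run into computational limits. In the standard attention mechanism, computation scales quadratically with context size: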
- If context = 4, then computation = 16
- If context = 6, then computation = 36
- If context = 10, then computation = 100
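The figures above follow from squaring the context size. A minimal sketch, assuming cost is measured in relative units and ignoring constant factors such as model width and layer count (the function name `attention_cost` is made up for illustration):

```python
def attention_cost(context_length: int) -> int:
    """Relative attention cost: every token is compared with every token."""
    if context_length < 0:
        raise ValueError("context_length must be non-negative")
    return context_length ** 2


for n in (4, 6, 10):
    print(f"context = {n}, computation = {attention_cost(n)}")

# Doubling the context quadruples the relative cost.
print(attention_cost(20) // attention_cost(10))
```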
The unit of computation is referred to as a FLOP (floating-point operation). Each GPU variant can perform a certain number of FLOPs per second, which is a key factor in its inference speed.
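As a rough illustration of that relationship, compute time can be estimated by dividing the work by the GPU's throughput. The numbers below are made up for the example, and the estimate deliberately ignores other factors such as memory bandwidth:

```python
def estimated_seconds(total_flops: float, gpu_flops_per_second: float) -> float:
    """Lower-bound compute time: work divided by throughput."""
    if gpu_flops_per_second <= 0:
        raise ValueError("gpu_flops_per_second must be positive")
    return total_flops / gpu_flops_per_second


# Hypothetical figures: a request needing 2e12 FLOPs on a GPU doing 1e12 FLOPs/second.
print(estimated_seconds(2e12, 1e12))
```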
Implementing Agent-Based Architecture for Efficient LLM Usage
To get useful output from LLMs, key questions and data must be within the attention window. Large context models are available but are expensive and slower due to the required FLOPs. Cheaper models might respond faster but have smaller attention windows.
Agents allow us to split these attention requirements. For example, we can define a Planner Agent using GPT-4o (better at following instructions, but at a higher cost) and a Worker Agent using GPT-4o-mini (cheaper and faster). The Planner acts as an interface between the human task and the Worker, breaking down requests into small, self-contained tasks and ensuring their successful completion.
In this setup, the Planner might be asked to "write code to plot stock market performance for stock ticker symbols provided by the user." The Planner could split this into tasks like:
- Write code to identify the stock ticker symbol for a given company.
- Write code to retrieve historical stock prices for a given symbol.
- Write code to plot data on a chart.
- Test the code using the symbol AAPL for 2024.
The Worker Agent is more likely to successfully implement each sub-task without hallucinating because the key requirements will be within its context window during token generation. Additionally, the Planner Agent tests each task, sending any issues back to the Worker Agent. Since each task is small, the Worker Agent can resolve problems with both the code and the Planner's feedback in its context.
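The Planner/Worker feedback loop described above can be sketched as follows. This is an illustration under stated assumptions: `run_plan`, `stub_worker`, and `stub_check` are hypothetical names, the stubs stand in for real model calls, and the retry limit of three is arbitrary.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TaskResult:
    task: str
    output: str
    attempts: int


def run_plan(
    tasks: List[str],
    worker: Callable[[str, str], str],  # (task, feedback) -> output
    check: Callable[[str, str], str],   # (task, output) -> feedback, "" if OK
    max_attempts: int = 3,
) -> List[TaskResult]:
    """Send each small task to the worker, retrying with the planner's feedback."""
    results = []
    for task in tasks:
        feedback = ""
        for attempt in range(1, max_attempts + 1):
            output = worker(task, feedback)
            feedback = check(task, output)
            if not feedback:  # empty feedback means the check passed
                results.append(TaskResult(task, output, attempt))
                break
        else:
            raise RuntimeError(f"not completed after {max_attempts} attempts: {task}")
    return results


def stub_worker(task: str, feedback: str) -> str:
    # Pretend the first draft of the plotting task is missing axis labels.
    if "plot" in task and not feedback:
        return "draft without labels"
    return f"done: {task}"


def stub_check(task: str, output: str) -> str:
    return "" if output.startswith("done:") else "add axis labels"


tasks = [
    "Write code to identify the stock ticker symbol for a given company.",
    "Write code to retrieve historical stock prices for a given symbol.",
    "Write code to plot data on a chart.",
    "Test the code using the symbol AAPL for 2024.",
]

results = run_plan(tasks, stub_worker, stub_check)
for result in results:
    print(result.attempts, result.task)
```

Each call to the worker carries only one task and the latest feedback, which is the point of the design: the worker's context stays small regardless of how large the original request was.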
INTRODUCING AUTOGEN
AutoGen is an open-source framework from Microsoft for programming and managing agent workflows. It provides base classes to define Agents and their behaviors, along with orchestration methods to control Agent-Agent and Agent-Human interactions.
SUMMARY
This is just a brief overview of Agents and AutoGen. Analytium is currently working on an Agent Flow to perform feature identification and extraction, to automatically convert configuration data from one platform to another, validate the conversion (providing corrective feedback if necessary), and then test and document the results. If you are interested, get in touch, and we’ll share more.
September 4, 2024