Optimize Cost and Speed of Agentic Workflows - A Reverse Engineering Approach
Autonomous agents show immense potential for solving complex problems, but they can also be brittle, prone to hallucination, expensive, and slow to run.

Agentic workflows are among the most powerful patterns in modern software. They also have a remarkable ability to run up a $300 bill on a task that should cost $3. The culprit is almost always the same: teams build the happy path first, watch it work, and only look at token counts and latency when they get the invoice.
This article documents a concrete optimization pass on a multi-agent dependency upgrade workflow — moving from ~$2 per run and 3 minutes of wall time down to ~$0.16 and under a minute — and generalizes the techniques that produced those gains.
The Economic Reality of Agentic Loops
The most dangerous trap in agent design is what some call quadratic token growth. LLMs charge for every input token on every turn, and a loop that re-sends its full accumulated context makes cumulative input scale as roughly n(n+1)/2 times the base context over n turns. A Reflexion-style loop that runs 10 cycles can therefore consume roughly 50 times the tokens of a single linear pass (55x by that formula). A crew of four agents collaborating on a single task routinely uses 3–5× more tokens than one agent handling the same task sequentially, because each handoff carries the full conversation history.
Before optimizing anything, instrument everything. You can't cut what you can't see. OpenTelemetry has become the industry standard for agent tracing — wrapping each LLM call, tool invocation, and retrieval step in a span with gen_ai semantic attributes (model name, token counts, finish reason). Pair it with a backend like Langfuse or LangSmith and you'll immediately see which agent steps are burning the most tokens and which are bottlenecks for latency.
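As a concrete starting point, here is a minimal manual-instrumentation sketch, assuming the opentelemetry-sdk package and a hypothetical call_model wrapper around your LLM client; the attribute names follow the gen_ai semantic conventions.

```python
# Minimal tracing sketch with opentelemetry-sdk. `call_model` is a
# hypothetical wrapper around your LLM client, not a real library call.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("dependency-upgrade-workflow")

def traced_llm_call(step_name: str, prompt: str):
    # Wrap each LLM call in a span carrying gen_ai semantic attributes.
    with tracer.start_as_current_span(step_name) as span:
        response = call_model(prompt)  # hypothetical LLM client wrapper
        span.set_attribute("gen_ai.request.model", response.model)
        span.set_attribute("gen_ai.usage.input_tokens", response.usage.input_tokens)
        span.set_attribute("gen_ai.usage.output_tokens", response.usage.output_tokens)
        return response
```

In production you would swap ConsoleSpanExporter for an OTLP exporter pointed at a backend like Langfuse or LangSmith; the span structure stays the same.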
Strategy 1: Flatten the Communication Architecture
The initial version of the dependency upgrade workflow used a bidirectional communication model where agents could query each other, request clarifications, and pass results back upstream. This mirrors how humans collaborate and feels intuitive to build. It is expensive to run.
For automation tasks, bidirectional communication almost always accumulates unnecessary context. Each clarification round trip adds tokens; each backtrack re-sends state that was already processed. The optimization was switching to a strict uni-directional pipeline: each agent receives inputs, does its job, passes structured outputs to the next step, and stops. Think of it like a manufacturing line rather than a meeting.
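A minimal sketch of that shape, with trivial stand-ins for the real agent calls:

```python
# Uni-directional pipeline sketch: each step receives structured input,
# returns structured output, and never queries earlier steps.
# The step bodies are trivial stand-ins for the real agent calls.

def parse_manifest(raw: str) -> dict:
    return {"requests": "2.31.0"}  # stand-in: deterministic parse

def check_registry(deps: dict) -> dict:
    return {name: "2.32.3" for name in deps}  # stand-in: registry lookups

def assess_changes(bumps: dict) -> dict:
    return {name: "low risk" for name in bumps}  # stand-in: LLM analysis

def run_pipeline(raw_manifest: str) -> dict:
    deps = parse_manifest(raw_manifest)
    bumps = check_registry(deps)
    risks = assess_changes(bumps)
    # Each handoff carries only the structured result, never the
    # conversation history that produced it.
    return {"bumps": bumps, "risks": risks}
```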
This change alone cut token usage by roughly 40% because intermediate reasoning no longer circulated back through earlier agents.
Strategy 2: Enable Prompt Caching
Every agent in the workflow carried a system prompt describing its role, constraints, and tool definitions. In the un-optimized version, this static content was re-sent and re-processed on every single LLM call.
Anthropic's prompt caching writes the computed key-value pairs for a prompt prefix to a cache with a 5-minute TTL. The first request pays a 25% premium to write the cache; subsequent requests that share the same prefix pay 10% of the normal input token price instead of 100%. For a 10,000-token system prompt sent across 50 agent turns, that's 490,000 tokens (the 49 turns after the cache write) charged at the cached rate. OpenAI applies caching automatically; Anthropic requires explicit cache_control annotations in the API call — an easy two-line change that most third-party clients skip by default.
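A sketch of that change with the Anthropic Python SDK, where LONG_SYSTEM_PROMPT and the tool schema are placeholders for this workflow's real static content:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

LONG_SYSTEM_PROMPT = "You are a dependency upgrade agent..."  # imagine ~10k tokens
TOOL_DEFINITIONS = [{
    "name": "lookup_registry",  # hypothetical tool for this workflow
    "description": "Fetch the latest published version of a package.",
    "input_schema": {
        "type": "object",
        "properties": {"package": {"type": "string"}},
        "required": ["package"],
    },
}]

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=TOOL_DEFINITIONS,
    system=[{
        "type": "text",
        "text": LONG_SYSTEM_PROMPT,
        # Marks the end of the cacheable prefix (tools + system text).
        "cache_control": {"type": "ephemeral"},
    }],
    messages=[{"role": "user", "content": "Check lodash for a new version."}],
)
# response.usage.cache_creation_input_tokens and cache_read_input_tokens
# show whether the prefix was written to or served from the cache.
```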
In practice, enabling prompt caching on the system prompts and tool definitions in this workflow reduced input token costs by around 70% on those tokens.
Strategy 3: Route Tasks to Smaller Models
Not all agent steps are equal. The dependency upgrade workflow had five distinct sub-tasks:
- Fetching and parsing the current dependency manifest
- Checking each dependency against the registry for new versions
- Assessing whether a version bump is likely to introduce breaking changes
- Generating the updated manifest
- Writing a summary PR description
The first two are deterministic data tasks with no reasoning required. The fourth is template filling. Only the third and fifth steps genuinely need a frontier model's reasoning capability.
Routing the first, second, and fourth steps to a smaller model (Claude Haiku at $1/M input tokens, GPT-4o mini at $0.15/M input tokens) and reserving Sonnet or GPT-4o for the analysis and synthesis steps reduced the average per-run model cost by over 60% with no measurable drop in output quality.
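In code, the routing can be as simple as a lookup table keyed by step name. A sketch using the Anthropic SDK, where the step names mirror the list above and the model IDs are illustrative:

```python
import anthropic

client = anthropic.Anthropic()

# Route cheap deterministic steps to Haiku; keep Sonnet for real reasoning.
MODEL_FOR_STEP = {
    "parse_manifest": "claude-3-5-haiku-20241022",            # data task
    "check_registry": "claude-3-5-haiku-20241022",            # data task
    "assess_breaking_changes": "claude-3-5-sonnet-20241022",  # reasoning
    "generate_manifest": "claude-3-5-haiku-20241022",         # template fill
    "write_pr_description": "claude-3-5-sonnet-20241022",     # synthesis
}

def run_step(step: str, prompt: str) -> str:
    response = client.messages.create(
        model=MODEL_FOR_STEP[step],
        max_tokens=2048,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text
```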
Strategy 4: Parallelize Independent Steps
Sequential execution is the default in most agent frameworks. It is almost never the right choice when steps don't depend on each other.
In the dependency workflow, checking each dependency against the registry was being done one at a time. A project with 40 dependencies took 40 sequential API calls. Parallelizing the registry lookup step with asyncio.gather or equivalent reduced that step from 45 seconds to 4 seconds.
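A sketch of the parallel lookup, assuming aiohttp and the URL pattern of npm's public registry:

```python
# Parallel registry lookups with asyncio.gather. The endpoint follows
# npm's public registry API; adjust for your package ecosystem.
import asyncio
import aiohttp

async def latest_version(session: aiohttp.ClientSession, name: str) -> tuple[str, str]:
    async with session.get(f"https://registry.npmjs.org/{name}/latest") as resp:
        data = await resp.json()
        return name, data["version"]

async def check_all(names: list[str]) -> dict[str, str]:
    async with aiohttp.ClientSession() as session:
        # All lookups run concurrently; total time ~ the slowest single call.
        results = await asyncio.gather(*(latest_version(session, n) for n in names))
    return dict(results)

versions = asyncio.run(check_all(["react", "lodash", "express"]))
```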
The general rule from LangChain's latency analysis: if four tool calls each take 300ms sequentially (1.2 seconds total), those same calls complete in approximately 300ms when run in parallel. The gain scales linearly with the number of independent calls.
For steps that are strictly sequential but where one outcome is more likely than others, speculative execution is worth considering — starting two branches in parallel and discarding the losing branch once you know which path is correct. The losing branch's tokens are wasted, but the wall-clock time improvement can be worth it for latency-sensitive workflows.
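A minimal shape for that pattern with asyncio, where the branch coroutines and the cheap path predictor are hypothetical:

```python
# Speculative execution sketch: start both branches immediately, then
# cancel the loser once the (cheap) path decision arrives.
import asyncio

async def speculative(predict_path, branch_a, branch_b):
    task_a = asyncio.create_task(branch_a())
    task_b = asyncio.create_task(branch_b())
    chosen = await predict_path()  # cheap check that decides the real path
    winner, loser = (task_a, task_b) if chosen == "a" else (task_b, task_a)
    loser.cancel()  # its tokens are sunk cost, but no wall-clock time is lost
    return await winner
```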
Strategy 5: Control Context Explosion
Agent memory is necessary. Unbounded agent memory is a token incinerator.
The workflow was passing the full conversation history — including intermediate tool outputs, registry API responses, and scratchpad reasoning — into every subsequent agent call. Most of this content was not relevant to the downstream step. A step that needed the final list of version bumps was receiving 8,000 tokens of raw registry JSON alongside it.
The fix is explicit context management: each agent receives only the structured outputs it actually needs, not the history that produced them. Raw tool outputs are summarized or truncated before handoff. Intermediate scratchpad reasoning is dropped after the step completes.
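A sketch of a disciplined handoff, with illustrative field names and truncation budget:

```python
# Context-handoff sketch: the downstream step receives a compact,
# structured payload instead of the full history. Field names are
# illustrative for this workflow.
import json

MAX_RAW_CHARS = 2000  # truncation budget for raw tool output

def handoff_payload(version_bumps: dict, raw_registry_json: str) -> str:
    return json.dumps({
        # Only what the next step actually needs:
        "version_bumps": version_bumps,
        # Raw tool output is truncated, never forwarded wholesale:
        "registry_excerpt": raw_registry_json[:MAX_RAW_CHARS],
        # Scratchpad reasoning from this step is dropped entirely.
    })
```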
From Google's production multi-agent framework analysis: effective context engineering means deciding what stays and what's removed at each step, not just accumulating everything the system has ever seen. Reducing context size also directly reduces hallucination risk — irrelevant details in a long context window are a known cause of models fixating on the wrong information.
Strategy 6: Design Tools to Minimize LLM Calls
Anthropic's token-efficient tool use feature reduces the output tokens consumed by tool calls by an average of around 14%, and by up to 70% in some cases, without information loss. Beyond that, the number of distinct tools an agent has access to matters. Tool definitions occupy prompt tokens. An agent with 30 tools available is paying for 30 tool descriptions on every call, even if only 2 are relevant to the current step.
The workflow was refactored to filter the available tool set per step — the registry lookup step only had registry tools; the analysis step only had analysis tools. This reduced average prompt size by around 800 tokens per call, which compounds quickly across a multi-step workflow.
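The filtering itself is a few lines; the tool names here are hypothetical:

```python
# Per-step tool filtering: each step's prompt carries only the tool
# schemas it can actually use. Tool names are hypothetical.
TOOLS_FOR_STEP = {
    "check_registry": {"lookup_registry", "resolve_version_range"},
    "assess_breaking_changes": {"fetch_changelog", "search_migration_guide"},
}

def tools_for(step: str, all_tools: list[dict]) -> list[dict]:
    allowed = TOOLS_FOR_STEP.get(step, set())
    return [t for t in all_tools if t["name"] in allowed]
```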
Meta-tools — bundling multiple sequential actions into a single tool call — can further reduce total LLM invocations. Research from the "Optimizing Agentic Workflows using Meta-tools" paper shows meta-tools reduce the number of LLM calls by up to 11.9% while increasing task success rate, because fewer reasoning steps means fewer opportunities for the agent to drift off course.
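A meta-tool is just an ordinary tool whose implementation performs several actions before returning a single bundled result. A sketch with stand-in helpers:

```python
# Meta-tool sketch: one tool call replaces a fetch-version, fetch-changelog,
# and bundle sequence that previously took three separate LLM turns.
# The helpers are trivial stand-ins for real registry and docs calls.

def fetch_latest_version(package: str) -> str:
    return "2.32.3"  # stand-in for a registry call

def fetch_changelog(package: str, version: str) -> str:
    return "Changelog for 2.32.3: ..."  # stand-in for a docs fetch

def check_dependency(package: str, current_version: str) -> dict:
    latest = fetch_latest_version(package)
    changelog = fetch_changelog(package, latest)
    return {
        "package": package,
        "current": current_version,
        "latest": latest,
        "changelog_excerpt": changelog[:1000],  # bounded, structured output
    }
```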
The Results
After applying these optimizations systematically:
| Metric | Before | After |
|---|---|---|
| Cost per run | ~$2.00 | ~$0.16 |
| Wall-clock time | ~3 minutes | ~1 minute |
| Token reduction | — | ~88% |
Making It Production-Ready
Cost and speed are necessary but not sufficient for production. Three additional concerns require attention:
Evaluation. An agent that is fast and cheap but wrong is worse than a slow, expensive one. Use deterministic checks (schema validation, format assertions) for structured outputs and model-based judges for qualitative ones. Define a golden dataset of input/output pairs for your critical flows and run regressions before every deploy, as in the sketch below.
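A minimal regression sketch, assuming the jsonschema package and a hypothetical run_workflow entry point:

```python
# Golden-dataset regression sketch. GOLDEN_CASES and run_workflow are
# hypothetical stand-ins for this workflow's real fixtures and entry point.
from jsonschema import validate

MANIFEST_SCHEMA = {
    "type": "object",
    "properties": {"dependencies": {"type": "object"}},
    "required": ["dependencies"],
}

GOLDEN_CASES = [
    {"input": "lodash@4.17.20", "expected_bump": "4.17.21"},  # illustrative pair
]

def test_golden_cases():
    for case in GOLDEN_CASES:
        output = run_workflow(case["input"])  # hypothetical workflow entry point
        validate(instance=output, schema=MANIFEST_SCHEMA)  # structural check
        assert output["dependencies"]["lodash"] == case["expected_bump"]
```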
Observability. Every optimization decision in this article was informed by traces. Deploy OpenTelemetry instrumentation from the start — token counts, latency per step, tool call success rates — so you can see exactly where the next bottleneck is without guessing.
Human oversight. For high-stakes steps (merging a PR, deleting files, sending an external request), insert a human approval gate. The agent prepares the action; a human confirms it. This doesn't add significant latency to the happy path and prevents the class of costly mistakes that no amount of evaluation catches reliably.
Agentic systems are powerful enough to be worth building well. The optimization ceiling is higher than most teams expect: with prompt caching, model routing, parallelization, and disciplined context management, 70–90% cost reductions are routinely achievable without compromising output quality.
Full source code for the dependency upgrade workflow is available on GitHub.