Autonomous agents show immense potential in solving complex problems, but they can also be brittle, prone to hallucinations, expensive, and slow to run. Today, let's explore how to automate the "Dependencies Upgrade Process" for your product team using CrewAI and then Langgraph. Typically, a software engineer handling this task needs to visit changelog webpages, review the changes, and coordinate with the product manager to create backlog stories for upcoming sprints. These tasks are often referred to as chores because they are time-consuming, unchallenging, and generally disliked. With agentic workflows, we can streamline and automate these tedious processes, saving time and effort while allowing engineers to focus on more engaging and impactful work.
Large Language Models (LLMs) are equipped with advanced reasoning capabilities and tools for task execution, making them ideal for agentic workflows. These workflows leverage LLMs to plan and execute complex tasks, streamlining automation efforts and ensuring efficient progress.
There are two primary approaches to getting started with agentic workflows. If you are a domain expert, or working with one, it's best to start with workflow automation. Once the flow is defined, you can integrate LLMs into specific workflow steps to enhance intelligence or simplify complex conditional reasoning. In scenarios where projects lack resources or a concrete plan, initiating the process with an autonomous agent can be invaluable. By observing how these agents perform tasks in successful scenarios, you can map out the necessary workflow steps.
After mapping the workflow steps, the next step is to refine and optimize the process. By establishing stricter guidelines and constraints, you can significantly reduce issues such as hallucinations and unnecessary costs, while also improving overall performance. This approach ensures that the workflows are not only efficient but also reliable and cost-effective.
In this article, we will demonstrate how to automate the dependencies upgrade task using the second approach. We start with autonomous agents in CrewAI, observe how they perform, and then use Langgraph to build an efficient, predictable workflow that can be integrated into CI/CD pipelines to streamline engineering work. This "poor-man's Devin" approach will showcase how to effectively implement agentic workflows in practical scenarios.
We will begin with two agents: a Product Manager and a Developer, utilizing the Hierarchical Agents process from CrewAI. The Product Manager will orchestrate tasks and delegate them to the Developer.
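The Developer will employ two tools: one that fetches the latest changes from a changelog page, and one that reads files from the repository.
If you're new to CrewAI, here are the key concepts used in this article: agents, which are defined by a role, a goal, and a backstory; tools, which agents call to gather information; tasks, which describe the work and the expected output; and the process, which determines how the agents coordinate.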
These foundational elements work together to create a robust and efficient workflow automation system.
Creating an agent involves defining its role, goal, and backstory. These attributes allow CrewAI to reuse prompt segments effectively, making the agent more efficient and consistent in its tasks.
    from textwrap import dedent

    from crewai import Agent


    class ProductTeamAgents:
        # self.llm, Changelog, and Repo are defined elsewhere in the project.
        def developer_agent(self):
            return Agent(
                role="Developer",
                goal=dedent("""\
                    Analyze the code base, then plan and decide whether a code
                    change is needed."""),
                backstory=dedent("""\
                    As a senior software developer working for a big tech company,
                    you specialize in understanding the code base and its dependencies,
                    and in making development decisions per product requirements."""),
                tools=[
                    Changelog.latest_changes,
                    Repo.read_repo,
                ],
                # The Developer does the analysis itself and never delegates.
                allow_delegation=False,
                llm=self.llm,
                verbose=True,
            )
Developer Agent Example
To enable agents to perform their tasks, we equip them with tools. For the dependencies upgrade objective, we provide the Developer agent with tools to fetch the latest changelog and read repository files. To simplify tool creation, we use the Langchain tool decorator in this example.
    from bs4 import BeautifulSoup
    from langchain.tools import tool
    from playwright.sync_api import sync_playwright


    class Changelog():
        # agitool and Changelog.get_pr_info are defined elsewhere in the project (not shown).
        @staticmethod
        @tool
        @agitool(name='latest_changes')
        def latest_changes(url: str):
            """
            Get latest changes from changelog url.
            :param str url: The url of the web page
            """
            # The LLM sometimes wraps the argument in quotes; strip them.
            url = url.replace('"', '')
            with sync_playwright() as playwright:
                chromium = playwright.chromium
                browser = chromium.launch(headless=False)
                context = browser.new_context()
                page = context.new_page()
                page.goto(url)  # Navigate to the changelog page
                html = page.content()
                soup = BeautifulSoup(html, 'html.parser')
                releases = []
                # Each <section> is one release titled "package==version".
                for ele in soup.find_all('section'):
                    title = ele.find('h2').text
                    [package, version] = title.split('==')
                    body = ele.find(lambda tag: tag.name=='div' and tag.has_attr('data-test-selector') and tag['data-test-selector']=="body-content")
                    changes = []
                    for mes_ele in body.find_all('p'):
                        changes.append(Changelog.get_pr_info(mes_ele, context))
                    releases.append({
                        'package': package,
                        'version': version,
                        'changes': changes,
                    })
                return releases
Tool Example
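The tool above assumes that every release heading has the form `package==version`; a heading in any other shape would raise an error during unpacking. A minimal sketch of a more defensive parser, assuming the same heading format. `parse_release_title` is a hypothetical helper and not part of the original project:

```python
from typing import Optional, Tuple


def parse_release_title(title: str) -> Optional[Tuple[str, str]]:
    """Split a heading such as 'langchain==0.2.1' into (package, version).

    Returns None when the heading does not match the expected format,
    so the caller can skip that section instead of raising ValueError.
    """
    package, sep, version = title.strip().partition("==")
    if not sep or not package.strip() or not version.strip():
        return None
    return package.strip(), version.strip()
```

Inside the loop, the tool could call this helper and `continue` when it returns `None`, which keeps one malformed section from failing the whole tool call. The version string in the docstring is only an illustration.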
Next, we define the tasks for the agents. The Developer agent has two primary tasks: analyzing the changelog, and analyzing the source code to draft backlog stories.
    from textwrap import dedent

    from crewai import Task


    class ProductManagerTasks:
        def stories_backlog(self, agent):
            return Task(
                description=dedent("""\
                    From the stories drafted by the developer, prioritize the stories in the backlog.
                    Focus on identifying high-impact updates or small wins.
                    Examples of high-priority stories include:
                    - Stories which fix critical bugs
                    - Stories which boost performance
                    - Stories where complexity is 1 and the change is easy, such as a dependency update.
                    If a story does not fit the above criteria, feel free to drop it.
                    """),
                # Plain string: the prompt has no placeholders, so no f-string is needed.
                expected_output=dedent("""\
                    Your final summary must include the list of stories from the original stories. Just add a priority line item to each story:
                    - library: library name, no duplication
                    - version: latest version of library string
                    - implementation_note: given the changelog summary from the tool, add a note
                      for other developers to implement the code change and include an example.
                      Add a comment with a link to pr_commit.
                    - implementation_instruction: include the files which need the change,
                      and clear instructions on how to implement the change.
                    - priority: rank from 0 to 5, with 5 as highest
                    - complexity: using scrum story points
                    """),
                agent=agent,
            )
The Product Manager agent then reviews this information and creates backlog stories accordingly. It's crucial to have clear and explicit task prompts to ensure the agents perform their duties effectively and accurately. This structured approach ensures that each agent operates with a well-defined purpose, enhancing overall workflow efficiency.
Running the workflow with a Langchain release changelog and a Python project utilizing Langchain for a chatbot, we obtain the following output. To get a comprehensive view, we need to run these agents a few times.
Here are the 3 checks we want to perform: whether the output follows the required format, whether the library and version information is accurate, and whether the implementation instructions are clear and relevant.
The agents completed the tasks and produced the required output only occasionally. The output format includes the dependency name, version, implementation note, implementation instruction, priority, and complexity. The library and version information are accurate. However, the instructions are not particularly well-crafted.
Agents Output from running CrewAI hierarchical process
Workflow Visualisation
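The format part of the sanity check described above can be automated. The following is a minimal sketch, assuming the agent output has already been parsed into a list of dictionaries whose keys match the fields named in the task prompt; `check_story` and `check_output` are hypothetical helpers, not part of CrewAI:

```python
from typing import Any, Dict, List

REQUIRED_FIELDS = (
    "library",
    "version",
    "implementation_note",
    "implementation_instruction",
    "priority",
    "complexity",
)


def check_story(story: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one backlog story (empty if none)."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in story]
    priority = story.get("priority")
    if priority is not None and not (isinstance(priority, int) and 0 <= priority <= 5):
        problems.append("priority must be an integer from 0 to 5")
    return problems


def check_output(stories: List[Dict[str, Any]]) -> List[str]:
    """Check every story and flag duplicated library names."""
    problems: List[str] = []
    seen = set()
    for index, story in enumerate(stories):
        problems += [f"story {index}: {p}" for p in check_story(story)]
        library = story.get("library")
        if library is not None:
            if library in seen:
                problems.append(f"story {index}: duplicate library {library}")
            seen.add(library)
    return problems
```

A check like this only covers the format. Whether the version is really the latest one, and whether the instructions are useful, still needs a comparison against the changelog or a human review.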
Let's examine one of the LLM calls to understand CrewAI's approach to agent functionality.
Log Inspection
With this information, you should feel more comfortable with using agents. At the end of the day, they are just LLM calls. However, avoid manually coding LLM calls to build your agent from scratch; there are easy optimizations to make the process more efficient.
Bi-directional to Uni-directional communication
Bi-directional communication is beneficial in scenarios that require creative problem-solving but is less suited for automation tasks. According to CrewAI documentation, the hierarchical process "simulates traditional organizational hierarchies," but this structure is not the most efficient for automation, contrary to what CrewAI's marketing material suggests.
For automation, uni-directional communication is more effective, similar to a manufacturing process. This approach minimizes redundancy, streamlines task execution, and enhances overall efficiency, ensuring that agents perform their tasks with minimal back-and-forth communication.
Instead of using a hierarchical process, we can have each agent perform a specific task, complete it, and then pass the findings to the next agent without looping back. In our case, we can split the Developer agent into a Lead Developer Agent and a Senior Developer Agent. The Lead Developer Agent specializes in analyzing the changelog, summarizing it, and passing the findings to the Senior Developer Agent. The Senior Developer Agent then performs source code analysis and drafts backlog stories before passing them to the Product Manager Agent, who finalizes the stories with priority.
CrewAI Sequential Process
Note: Agent names are conceptual; this demonstration does not imply that the agents have the same capabilities as real humans.
This reduction in communication loops can also help minimize hallucinations. Instead of leaving the coordinator agent to decide which information to pass to other agents, we now explicitly pass the output from one agent to the next. This preserves the integrity of the information and ensures a smoother, more accurate workflow.
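The hand-off described above can be pictured as a plain function pipeline. This is a library-agnostic sketch, not CrewAI code: the three agent functions are placeholders that return strings, where a real workflow would make an LLM call in each step.

```python
from typing import Callable, List

Step = Callable[[str], str]


def run_pipeline(steps: List[Step], initial_input: str) -> str:
    """Run each step once, in order, feeding its output to the next step."""
    result = initial_input
    for step in steps:
        result = step(result)
    return result


# Placeholder agents: in a real workflow each would wrap an LLM call.
def lead_developer(changelog_url: str) -> str:
    return f"changelog summary for {changelog_url}"


def senior_developer(summary: str) -> str:
    return f"draft stories from [{summary}]"


def product_manager(stories: str) -> str:
    return f"prioritized [{stories}]"


final = run_pipeline([lead_developer, senior_developer, product_manager], "URL")
```

Because no step can call back into an earlier one, the number of steps is fixed and each agent sees exactly what the previous agent produced.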
Upon performing a sanity check on the output, the format is correct. The identified dependencies and versions that need updating are accurate, and the instructions provided are clearer and more relevant. Running these agents multiple times resulted in a significantly higher success rate compared to hierarchical agents.
Workflow Visualisation
This change also reduced the cost significantly, to only 22 cents on average, and made the process three times faster. Reviewing the logs revealed that the ReAct agent made extra LLM calls per task, which are unnecessary for our automation. By providing explicit instructions, we can further streamline the process and enhance efficiency.
The logs also show a "Thought - Action - Observation" sequence in the prompts, even though it is not part of our Agent and Task descriptions. This type of prompt, known as a ReAct agent, has the LLM reason about each step before acting and then observe the result.
ReAct (Reason then Act) Agent
There are other types of agents as well, such as Reflexion Agents, which are inspired by the actor-critic method from reinforcement learning. In this method, another LLM evaluates the agent's actions and provides feedback for further improvement.
At this stage, you need to evaluate whether reflection or thought process is necessary for your automation tasks. If it is not, explicitly encoding the task flow and eliminating extra reflection steps can optimize the process, making it more efficient and cost-effective.
Learning from the sequential process, we can create a workflow graph with tasks and tools to optimize the execution. Our goal is to ensure the Agentic Workflow predictably runs exactly 6 LLM calls on each execution. Langgraph provides a low-level toolkit to achieve this precision.
Automation Flow with Langgraph
To reduce prompt effort, we can copy the prompt from CrewAI logs and add it to the agent node. The execution loop is determined by conditional edges. Without using a ReAct Agent, we can explicitly instruct the LLM to provide an end word, indicating that it has enough information to proceed to the next agent node.
    import json

    from langchain_core.messages import SystemMessage, ToolMessage
    from langgraph.graph import MessagesState


    class ProductTeamAgents:
        # The agent decorator and Models are defined elsewhere in the project (not shown).
        @agent(name='Lead Developer Agent')
        def lead_developer_agent(self, state: MessagesState):
            llm = Models.get_latest().bind_tools([Changelog.latest_changes])
            messages = state['messages']
            # Prepend the system prompt (copied from the CrewAI logs) to the conversation.
            messages = [
                SystemMessage(
                    content="""
                    You are Lead Developer from big tech.
                    Your goal is analyzing the code base, then planning and deciding whether a code change is needed.
                    Analyze the given library changelog.
                    Focus on identifying important updates which require
                    code changes in the applications which are consuming
                    this library. Examples of important updates include:
                    - Function arguments and keyword arguments changes
                    - Function return changes
                    Keep in mind, attention to detail is crucial for a comprehensive story.
                    Your final summary must include a list of items and a title. You must explicitly
                    end with "END TURN" once you finish the summary.
                    Title: Changelog Summary.
                    Each item includes.
                    - library: library name, no duplication
                    - version: latest version of library string
                    - implementation_note: given the changelog summary from the tool, add a note
                    for other developers to implement the code change and include an example.
                    Add a comment with a link to pr_commit.
                    """
                ),
            ] + messages
            # Tool results come back as Python objects; serialize them for the LLM.
            if isinstance(messages[-1], ToolMessage):
                messages[-1].content = json.dumps(messages[-1].content)
            response = llm.invoke(messages)
            return {"messages": [response]}
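The end word makes the routing decision after each agent node a simple string check. The following is a minimal sketch of such a decision function, written against plain values rather than Langgraph message objects. The node names `"tools"`, `"next"`, and `"agent"` are hypothetical; a function of this kind could serve as the routing function of a conditional edge.

```python
END_WORD = "END TURN"


def route_after_agent(content: str, has_tool_calls: bool) -> str:
    """Pick the next node after an agent node has produced a message."""
    if has_tool_calls:
        return "tools"  # run the requested tool, then return to the agent
    if content.rstrip().endswith(END_WORD):
        return "next"  # summary finished: hand over to the next agent node
    return "agent"  # no end word yet: let the same agent continue
```

Checking for tool calls first matters: a message that requests a tool has not produced a final summary yet, so it should never be forwarded to the next agent.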
Running the agents in this manner, we see consistent performance with exactly 6 LLM calls each time. The execution time is now consistently within 1 minute, and the cost is reduced to 26 cents.
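One way to keep the call count from drifting is to enforce it in code. This is a sketch under the assumption that every LLM call in the workflow goes through a single callable; `CallBudget` is a hypothetical wrapper and not part of Langgraph:

```python
from typing import Callable


class CallBudget:
    """Wrap an LLM callable, count invocations, and fail past a fixed budget."""

    def __init__(self, llm_call: Callable[[str], str], max_calls: int = 6):
        self._llm_call = llm_call
        self.max_calls = max_calls
        self.calls = 0

    def __call__(self, prompt: str) -> str:
        if self.calls >= self.max_calls:
            raise RuntimeError(f"LLM call budget of {self.max_calls} exceeded")
        self.calls += 1
        return self._llm_call(prompt)
```

In a CI/CD pipeline, failing fast on an exceeded budget is usually preferable to silently paying for a runaway loop, and `calls` can be logged after each run to confirm the expected count.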
Further examination of the logs reveals unnecessary information being included in the prompt context. By eliminating this extraneous information, we can streamline the process even more efficiently.
Let's explore how we can refine the prompt context to optimize performance further.
One aspect of the agentic workflow we haven't touched on is the agent's ability to access information via memory. There are two types of memory in agentic workflows: short-term memory and long-term memory.
Agent Memory
For simplicity, we will leave out memory in CrewAI and long-term memory in Langgraph. In the Langgraph execution case, we have access to short-term memory via State.
To optimize agent context through memory access, there are several notable methods, including Chunking and RAG (Retrieval Augmented Generation). By examining the logs of an LLM call, we can identify unnecessary intermediate outputs from one agent before passing the short-term memory to the next agent.
Adding a new node to remove intermediate messages
We can achieve this by adding an intermediate node in Langgraph to remove the intermediate state messages. With this optimization, we can consistently run the entire workflow at only 16 cents (1/10th of the original cost) and 1/3rd of the original execution time.
This enhancement significantly improves both the cost-efficiency and speed of the workflow, demonstrating the value of efficient memory management in agentic workflows.
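The pruning step can be sketched independently of any framework. The example below assumes messages are plain dictionaries with a `content` key and that the agent marks its final summary with the "END TURN" end word; `keep_final_summary` is a hypothetical helper, and in Langgraph the same idea would operate on the message objects held in the state.

```python
from typing import Dict, List

Message = Dict[str, str]  # e.g. {"role": "ai", "content": "..."}


def keep_final_summary(messages: List[Message], end_word: str = "END TURN") -> List[Message]:
    """Drop intermediate messages, keeping only the latest finished summary.

    If no message ends with the end word, the history is returned unchanged
    so that nothing is lost before the agent has finished.
    """
    for message in reversed(messages):
        if message["content"].rstrip().endswith(end_word):
            return [message]
    return list(messages)
```

The next agent then receives one message instead of the full tool-calling transcript, which is where the token savings come from.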
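To reduce hallucinations, lower costs, and improve speed, consider the following low-hanging fruit strategies: replace bi-directional (hierarchical) communication with a uni-directional, sequential flow; encode the task flow explicitly and remove reflection steps the task does not need; and trim intermediate messages from the context passed between agents.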
These techniques effectively optimize multi-agent workflows and are safe to use in any context.
For additional optimization, consider the following suggestions:
By implementing these strategies, you can achieve a more efficient, cost-effective, and faster agentic workflow.
Before experimenting with different models, optimizing prompts, or even hardware, ensure you have implemented the following components to ensure your optimizations work as expected in production.
Evaluation can be statistical, model-based, or a simple sanity check. In our scenarios, this includes code coverage, unit tests, etc. This helps ensure that your workflow is functioning correctly and efficiently.
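Because agent output varies between runs, a single successful run says little. A simple statistical evaluation is to repeat the workflow and measure how often the output passes a check. A minimal sketch, where `run_workflow` and `is_valid` are placeholders for your own workflow entry point and sanity check:

```python
from typing import Callable


def success_rate(
    run_workflow: Callable[[], object],
    is_valid: Callable[[object], bool],
    runs: int = 10,
) -> float:
    """Run the workflow `runs` times; return the fraction of runs whose output passes."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    passed = 0
    for _ in range(runs):
        try:
            if is_valid(run_workflow()):
                passed += 1
        except Exception:
            pass  # a crashed run counts as a failure
    return passed / runs
```

Tracking this number before and after each optimization shows whether a change that lowers cost also keeps the workflow reliable.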
Observability goes beyond traditional metrics. For LLM-based AI applications, leveraging telemetry can enhance your data pipeline. OpenTelemetry is a great, production-proven tool that provides extensive connectivity to achieve this. Implementing robust observability ensures you can monitor, troubleshoot, and optimize your workflows effectively.
Given the non-deterministic nature of LLMs, having human oversight is crucial. This involves having personnel to monitor and troubleshoot the agentic workflow. Empowering your content team to score and label LLM output can further improve the agentic workflow, ensuring higher accuracy and reliability.
By incorporating these components, you can ensure your optimizations are effective and reliable in a production environment.
Agentic workflows are still in their early stages, and we can expect their capabilities and architectures to keep improving with advancements in LLMs and tools. As these systems evolve, it's crucial to ensure that people view agentic systems as empowering tools rather than replacements.
I recently gave a talk at a local meetup (which this article is based on) that evoked a lot of emotions from the audience.
If we are not careful with the social impact and perception, there could be unintended side effects on the valuable work being done with these agents.
Feel free to reach out and join our community if you are interested in agentic workflow development! Let's work together to build a future where agentic systems enhance our capabilities and foster positive change.
The source code is also available on GitHub.