Building Multi-Agent Systems with LangGraph — A Comprehensive Guide

This content originally appeared on Level Up Coding - Medium and was authored by S Sankar

Introduction

As agentic systems become increasingly common in the modern world, the complexity associated with commercial applications continues to rise. This heightened complexity presents a significant challenge when attempting to scale sophisticated single-agent systems. While a single agent can perform complex tasks, the need for specialization, rigorous error checking, and the combination of diverse models quickly reveals the limitations of this monolithic approach.

The solution to scaling complexity and enhancing decision-making lies in adopting multi-agent systems (MAS). This article starts with the basics of MAS and their architecture, and ends with an implementation of one specific, powerful MAS structure, the hierarchical multi-agent system, using the LangGraph framework in Python.

Why Single Agents Struggle to Scale

As sophisticated single-agent systems become commonplace in commercial settings, the complexity they must handle keeps growing. Attempting to scale a single, highly complicated agent to match it often runs into the limitations described above.

In the same way an organization relies on specialization — having an accountant, a lawyer, and separate technical and commercial teams — agentic systems require specialization. A single generic agent that tries to handle everything lacks the necessary focus, making specialization and decision branching crucial for successful collaboration.

This is where Multi-Agent Systems (MAS) provide the essential solution. MAS offers several critical advantages:

  1. Error Checking and Self-Correction: By having multiple agents, one can supervise and monitor the tasks of another, allowing for correction when mistakes occur, thereby enabling self-correction.
  2. Specialization and Decision Branching: Just as an organization employs specialized roles like accountants, lawyers, and technical staff, agentic systems require specialized agents tailored to specific tasks. Relying on a generic single agent to handle everything hinders efficiency.
  3. Combining Diverse Models: MAS allows you to leverage the strengths of specialized Large Language Models (LLMs). For instance, a coding task might utilize a model that excels at coding, while analysis or requirement gathering might be handled by a different model, like OpenAI’s GPT series.

Classifying Multi-Agent Systems

To move beyond the limitations of a single agent, we must understand how agent teams are structured. According to a survey on multi-agent collaboration mechanisms specific to LLMs, collaboration can be classified in several ways:

Type of Collaboration: Agents can be cooperative (working toward a single shared goal) or competitive (competing to achieve the best performance).
Strategy: Collaboration can be rule-based, model-based, or role-based, where agents are assigned specific duties (e.g., a “reviewer” agent acting as a senior developer, or an “architect” agent).
Structure: Structures can be Centralized, Distributed, or Hierarchical. The crux of our implementation in the second half of this article is structure-based. More specifically, we will implement a hierarchical MAS. But before that, let's see how agents communicate and coordinate.

Dissecting MAS Structures

Several organizational structures dictate how agents communicate and coordinate. They are shown in the middle (Structures) of the figure above.

  1. Network Structure. Agents communicate freely in any direction. While flexible, this structure can easily lead to chaos because the roles of the agents are not clearly defined.
  2. Supervisor Structure. A single, defined supervisor coordinates execution among all sub-agents or teams. While this offers clear control, its major drawback is that the supervisor is a “single point of failure”; if it fails, the entire system breaks down.
  3. Supervisor as a Tool. A slight variation where agents expose their capabilities to the supervisor, and the supervisor treats these capabilities as tools rather than agents.
  4. Hierarchical Structure. This structure mimics an organizational chart, featuring supervisors layered on top of other supervisors and teams. For example, a top-level head (like a CEO or CTO) oversees teams (like the technical and commercial teams), which may themselves be led by team leads who report back to the head of the organization. This hierarchical approach is the structure we implement in the rest of this article.

If you wish to dive deeper into the world of MAS, there is a nice survey paper here.

Visual Explanation

If you are like me and would like a quick video walkthrough, you may check this video on our channel:

Hands-on Implementation with LangGraph

Moving on to a hands-on exercise, let’s build a blog-writing system. We will use a hierarchical MAS architecture to tackle this problem.

This system features a top-level Supervisor Agent that coordinates execution between two primary specialized teams:

  1. The Researcher Agent Team
  2. The Writing Agent Team

The implementation is organized into five steps: preliminary setup, defining the tools, defining a supervisor template, defining the agent teams, and finally assembling the end-to-end graph.

Here is a high-level overview of how that looks:

Here, the web scraper, note taker, writer, and chart generator are all tools. The supervisor agent, research agent, and writing agent form a hierarchical structure. The research agent team and the writing agent team are each, in turn, implemented as a hierarchy of a supervisor and helper agents. These are ReAct agents.
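
To make the layering concrete, the hierarchy can be written down as plain data before any agent code exists. The sketch below is only an illustration: `HIERARCHY`, `leaf_workers`, and `depth` are hypothetical names that are not part of LangGraph, and the node names are borrowed from the graphs built later in this article.

```python
# Illustrative only: the team layout as nested dicts (not a LangGraph API).
# A dict is a supervised group with children; None marks a worker agent.
HIERARCHY = {
    "supervisor": {
        "research_team": {"search": None, "web_scrapper": None},
        "writing_team": {
            "doc_writer": None,
            "note_taker": None,
            "chart_generator": None,
        },
    }
}


def leaf_workers(tree: dict) -> list:
    """Return the names of all worker agents (nodes without children), depth-first."""
    workers = []
    for name, children in tree.items():
        if children is None:
            workers.append(name)
        else:
            workers.extend(leaf_workers(children))
    return workers


def depth(tree) -> int:
    """Count the levels in the hierarchy, from the top supervisor down to the workers."""
    if tree is None:
        return 0
    return 1 + max(depth(child) for child in tree.values())


print(leaf_workers(HIERARCHY))
print(depth(HIERARCHY))
```

Writing the layout down this way makes it easy to see that the system has three levels (top supervisor, teams, workers) and five worker agents in total.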

Preliminary Setup

Let’s begin with a standard preliminary setup, which involves creating a Conda environment (e.g., Python 3.12) and installing the core packages. We will also need an OpenAI API key and a Tavily API key, as these are the two external services we rely on.

import getpass
import os

def _set_if_undefined(var: str):
    # Prompt for the key only if it is not already set in the environment
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"Provide your {var}: ")


_set_if_undefined("OPENAI_API_KEY")
_set_if_undefined("TAVILY_API_KEY")

With the implementation revolving around the LangGraph and LangChain ecosystems, we will need to install the packages and add some imports:

# Run this in a notebook cell (drop the leading "!" in a terminal)
!pip install -U langgraph langchain_community langchain-tavily langchain_experimental langchain_openai

from typing import Annotated, List, Optional, Dict, Literal

import os

from langchain_community.document_loaders import WebBaseLoader
from langchain_core.tools import tool

from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

from langchain_tavily import TavilySearch

from langgraph.graph import StateGraph, MessagesState, START, END
from langgraph.types import Command

from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.messages import HumanMessage, trim_messages
from typing_extensions import TypedDict

Handoff

A fundamental concept in coordinating MAS with LangGraph is the Handoff. Handoffs occur when one agent passes control to another, specifying the destination agent (the target) and the payload (the information to pass). This critical mechanism is implemented using the `Command` object.
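
The idea can be sketched without LangGraph at all. The example below is a plain-Python stand-in, not LangGraph's implementation: `Handoff`, `run`, and the two toy nodes are hypothetical names, with `goto` and `update` chosen to mirror the two arguments passed to `Command` later in this article.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

END = "__end__"  # sentinel that stops the loop in this sketch


@dataclass
class Handoff:
    """Stand-in for a handoff: where to go next and what to merge into the state."""
    goto: str
    update: dict = field(default_factory=dict)


def researcher(state: dict) -> Handoff:
    # Does its work, then hands control to the writer with its findings as payload.
    return Handoff(goto="writer", update={"notes": f"facts about {state['topic']}"})


def writer(state: dict) -> Handoff:
    # Consumes the payload left by the previous node and finishes the run.
    return Handoff(goto=END, update={"draft": f"Report based on {state['notes']}"})


def run(nodes: Dict[str, Callable[[dict], Handoff]], start: str, state: dict) -> dict:
    """Follow handoffs from node to node until one of them targets END."""
    current = start
    while current != END:
        handoff = nodes[current](state)
        state = {**state, **handoff.update}  # apply the payload
        current = handoff.goto               # transfer control
    return state


final_state = run(
    {"researcher": researcher, "writer": writer}, "researcher", {"topic": "gold"}
)
print(final_state["draft"])
```

The point of the pattern is that a node decides both what changes in the shared state and who acts next, in a single return value.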

Defining Specialized Tools

For specialization to be effective, agents need access to domain-specific tools. So, let's first define all the tools to be used by the agents:

from langchain_experimental.utilities import PythonREPL

# Directory where the writing tools save and read their files
WORKING_DIRECTORY = os.path.join(os.getcwd(), "temp")
os.makedirs(WORKING_DIRECTORY, exist_ok=True)

# REPL used by python_repl_tool below. It executes arbitrary code locally,
# so only run it in an environment you are comfortable with.
repl = PythonREPL()


@tool
def scrape_webpages(urls: List[str]) -> str:
    """Use requests and bs4 to scrape the provided web pages for detailed information"""
    loader = WebBaseLoader(urls)
    docs = loader.load()
    return "\n\n".join(
        [
            f'<Document name="{doc.metadata.get("title", "")}">\n{doc.page_content}\n</Document>'
            for doc in docs
        ]
    )


@tool
def create_outline(
    points: Annotated[List[str], "List of main points or sections"],
    file_name: Annotated[str, "File path to save the outline"],
) -> Annotated[str, "Path of the saved outline file"]:
    """Create and save an outline"""
    file_to_use = os.path.join(WORKING_DIRECTORY, file_name)
    with open(file_to_use, "w") as file:
        for i, point in enumerate(points):
            file.write(f"{i + 1}. {point}\n")

    return f"Outline saved to {file_name}"


@tool
def read_document(
    file_name: Annotated[str, "File path to read the document from"],
    start: Annotated[Optional[int], "The start line. Default is 0"] = None,
    end: Annotated[Optional[int], "The end line. Default is None"] = None,
) -> str:
    """Read the specified document"""
    file_to_use = os.path.join(WORKING_DIRECTORY, file_name)
    with open(file_to_use, "r") as file:
        lines = file.readlines()
    if start is None:
        start = 0
    # readlines() keeps the trailing newlines, so join without a separator
    return "".join(lines[start:end])


@tool
def write_document(
    content: Annotated[str, "Text content to be written to the document"],
    file_name: Annotated[str, "File path to save the document"],
) -> str:
    """Create and save a text document"""
    file_to_use = os.path.join(WORKING_DIRECTORY, file_name)
    with open(file_to_use, "w") as file:
        file.write(content)
    return f"Document saved to {file_name}"


@tool
def edit_document(
    file_name: Annotated[str, "File path of the document to edit"],
    inserts: Annotated[Dict[int, str], "Dictionary where key is the line number and value is the text to be inserted at the line number"],
) -> str:
    """Edit a document by inserting text at specified line numbers"""
    file_to_use = os.path.join(WORKING_DIRECTORY, file_name)
    # Open for reading first; opening with "w" here would truncate the file
    with open(file_to_use, "r") as file:
        lines = file.readlines()

    sorted_inserts = sorted(inserts.items())

    for line_number, text in sorted_inserts:
        if 1 <= line_number <= len(lines) + 1:
            lines.insert(line_number - 1, text + "\n")
        else:
            return f"Error: line number {line_number} is out of range"

    with open(file_to_use, "w") as file:
        file.writelines(lines)

    return f"Document edited and saved to {file_name}"


@tool
def python_repl_tool(
    code: Annotated[str, "The python code to execute to generate your chart"],
):
    """Use this to execute python code. If you want to see the output of any value,
    you should print it with `print(...)`. This is visible to the user"""
    try:
        result = repl.run(code)
    except BaseException as e:
        return f"Failed to execute. Error: {repr(e)}"
    return f"Successfully executed: \n ```python \n{code}\n``` \n Stdout: {result}"

The Research Agent Team requires access to the web, necessitating tools like the Tavily search tool and a function such as scrape_webpages (using WebBaseLoader).

The Writing Agent Team needs tools to manage content, including:
- `read_document`
- `write_document`
- `edit_document`
- `create_outline`
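
The line-insertion rule behind edit_document is easy to get wrong, so it can help to check it on an in-memory list first. This is a minimal sketch under the assumption that line numbers are 1-based and that inserts are applied in ascending order; `insert_lines` is a hypothetical helper, not one of the tools above.

```python
from typing import Dict, List


def insert_lines(lines: List[str], inserts: Dict[int, str]) -> List[str]:
    """Return a copy of `lines` with each text inserted before its 1-based line number."""
    result = list(lines)
    for line_number, text in sorted(inserts.items()):
        # Valid positions run from 1 (before the first line) to len + 1 (append).
        if not 1 <= line_number <= len(result) + 1:
            raise ValueError(f"line number {line_number} is out of range")
        result.insert(line_number - 1, text + "\n")
    return result


original = ["Intro\n", "Conclusion\n"]
edited = insert_lines(original, {2: "Body", 1: "Title"})
print("".join(edited), end="")
```

One consequence worth knowing: because inserts are applied one after another, a later line number refers to the document as already changed by the earlier inserts, not to the original numbering.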

Crucially, if the agent needs to generate a report with charts or plots, it must execute code (for example, with matplotlib). This is enabled by a dedicated tool called python_repl_tool. Its docstring tells the agent that the tool executes Python code and that any value it wants to see must be printed with print(...).

The General-Purpose Supervisor Node

Since the hierarchical structure requires supervisors at multiple levels (a top-level supervisor and supervisors within each team), a reusable function, make_supervisor_node, is created.

The supervisor’s core task is to route the conversation to the next worker or declare completion. It uses a system prompt that defines its role: “You are a supervisor tasked with managing a conversation between the following workers…”. By requesting structured output from the LLM, the supervisor decides whether to route execution to a specific member agent or to respond with FINISH and end the task.

class State(MessagesState):
    # Name of the node the supervisor chose to run next
    next: str


def make_supervisor_node(llm: BaseChatModel, members: List[str]):
    """Build a supervisor node function that routes between `members`."""
    options = ["FINISH"] + members
    system_prompt = (
        "You are a supervisor tasked with managing a conversation between the"
        f" following workers: {members}. Given the following user request,"
        " respond with the worker to act next. Each worker will perform a"
        " task and respond with their results and status. When finished,"
        " respond with FINISH."
    )

    # Schema for the structured output: the model must pick one of the options
    class Router(TypedDict):
        next: Literal[*options]

    def supervisor_node(state: State) -> Command[Literal[*members, "__end__"]]:
        messages = [
            {"role": "system", "content": system_prompt}
        ] + state["messages"]

        response = llm.with_structured_output(Router).invoke(messages)
        goto = response["next"]
        if goto == "FINISH":
            goto = END
        return Command(goto=goto, update={"next": goto})

    return supervisor_node
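
The routing step can be exercised without calling a model. In the sketch below, `fake_llm_decision` is a hypothetical stub standing in for the structured-output call, and `route` is an illustrative helper rather than part of LangGraph. It shows the two things the supervisor relies on: the answer must be one of the allowed options, and FINISH is translated into the end of the graph.

```python
from typing import Callable, List

END = "__end__"  # the same string used in the supervisor_node return annotation


def route(decide: Callable[[List[str]], str], members: List[str], messages: List[str]) -> str:
    """Ask `decide` for the next worker and map FINISH to END."""
    options = ["FINISH"] + members
    choice = decide(messages)
    if choice not in options:
        # Structured output is meant to rule this out; fail loudly if it happens anyway.
        raise ValueError(f"unexpected route {choice!r}; expected one of {options}")
    return END if choice == "FINISH" else choice


def fake_llm_decision(messages: List[str]) -> str:
    """Hypothetical stub: search first, scrape next, then finish."""
    plan = ["search", "web_scrapper", "FINISH"]
    return plan[min(len(messages) - 1, len(plan) - 1)]


members = ["search", "web_scrapper"]
messages = ["user: why is gold rising?"]
visited = []
while (target := route(fake_llm_decision, members, messages)) != END:
    visited.append(target)
    messages.append(f"{target}: done")  # each worker reports back to the supervisor
print(visited)
```

Swapping the stub for a real model call changes who makes the decision, but the loop around it stays the same: supervisor decides, worker acts, control returns to the supervisor.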

Building and Testing the Specialized Teams

  • The Research Team
    The Research Team includes a Search Agent and a Web Scraper Agent. Both are implemented as ReAct agents using the create_react_agent function. The Search Agent uses the Tavily tool, and the Web Scraper Agent uses the scrape_webpages tool. After each agent completes its task, its node returns control to the research supervisor node using the Command object, with goto set to "supervisor".

llm = ChatOpenAI(model="gpt-4o")
tavily_tool = TavilySearch(max_results=3)

search_agent = create_react_agent(llm, tools=[tavily_tool])


def search_node(state: State) -> Command[Literal["supervisor"]]:
    result = search_agent.invoke(state)
    # Report only the agent's final message back to the supervisor
    return Command(
        update={
            "messages": [HumanMessage(content=result["messages"][-1].content, name="search")]
        },
        goto="supervisor",
    )


web_scrapper_agent = create_react_agent(llm, tools=[scrape_webpages])


def web_scrapper_node(state: State) -> Command[Literal["supervisor"]]:
    result = web_scrapper_agent.invoke(state)
    return Command(
        update={
            "messages": [HumanMessage(content=result["messages"][-1].content, name="web_scrapper")]
        },
        goto="supervisor",
    )


research_supervisor_node = make_supervisor_node(llm, ["search", "web_scrapper"])

  • The Writing Team
    The Writing Team is composed of specialized agents reporting to their own supervisor:
    Doc Writer Agent: Instructed to read, write, and edit documents based on outlines provided by the note taker.
    Note Taker Agent: Equipped with the create_outline and read_document tools, tasked with generating outlines for the Doc Writer.
    Chart Generating Agent: Uses read_document and the special python_repl_tool to generate plots.

doc_writer_agent = create_react_agent(
    llm,
    tools=[write_document, edit_document, read_document],
    prompt=(
        "You can read, write and edit documents based on note taker's outlines. "
        "Don't ask follow up questions."
    ),
)


def doc_writing_node(state: State) -> Command[Literal["supervisor"]]:
    result = doc_writer_agent.invoke(state)
    return Command(
        update={
            "messages": [
                HumanMessage(content=result["messages"][-1].content, name="doc_writer")
            ]
        },
        goto="supervisor",
    )


note_taking_agent = create_react_agent(
    llm,
    tools=[create_outline, read_document],
    prompt=(
        # Trailing space keeps the two sentences apart when the strings are joined
        "You can read documents and create outlines for the document writer. "
        "Don't ask follow up questions."
    ),
)


def note_taking_node(state: State) -> Command[Literal["supervisor"]]:
    result = note_taking_agent.invoke(state)
    return Command(
        update={
            "messages": [
                HumanMessage(content=result["messages"][-1].content, name="note_taker")
            ]
        },
        goto="supervisor",
    )


chart_generating_agent = create_react_agent(
    llm, tools=[read_document, python_repl_tool]
)


def chart_generating_node(state: State) -> Command[Literal["supervisor"]]:
    result = chart_generating_agent.invoke(state)
    return Command(
        update={
            "messages": [
                HumanMessage(content=result["messages"][-1].content, name="chart_generator")
            ]
        },
        goto="supervisor",
    )


doc_writing_supervisor_node = make_supervisor_node(
    llm, ["doc_writer", "note_taker", "chart_generator"]
)

When any agent within the writing team (e.g., the doc writing node, note taking node, or chart generating node) completes its step, it returns the control flow back to the Writing Supervisor via the Command object.

Let's build the graphs for both the research team and the writing team:

research_builder = StateGraph(State)
research_builder.add_node("supervisor", research_supervisor_node)
research_builder.add_node("search", search_node)
research_builder.add_node("web_scrapper", web_scrapper_node)

research_builder.add_edge(START, "supervisor")
research_graph = research_builder.compile()

writing_builder = StateGraph(State)
writing_builder.add_node("supervisor", doc_writing_supervisor_node)
writing_builder.add_node("doc_writer", doc_writing_node)
writing_builder.add_node("note_taker", note_taking_node)
writing_builder.add_node("chart_generator", chart_generating_node)

writing_builder.add_edge(START, "supervisor")
writing_graph = writing_builder.compile()

Orchestrating the End-to-End Super Graph

Now that we have the research team and the writing team, we need to define the end-to-end graph that orchestrates the multi-agent system.

teams_supervisor_node = make_supervisor_node(llm, ["research_team", "writing_team"])


def call_research_team(state: State) -> Command[Literal["supervisor"]]:
    # Pass only the latest message into the team's own graph
    response = research_graph.invoke({"messages": state["messages"][-1]})
    return Command(
        update={
            "messages": [
                HumanMessage(
                    content=response["messages"][-1].content, name="research_team"
                )
            ]
        },
        goto="supervisor",
    )


def call_writing_team(state: State) -> Command[Literal["supervisor"]]:
    response = writing_graph.invoke({"messages": state["messages"][-1]})
    return Command(
        update={
            "messages": [
                HumanMessage(
                    content=response["messages"][-1].content, name="writing_team"
                )
            ]
        },
        goto="supervisor",
    )


super_builder = StateGraph(State)
super_builder.add_node("supervisor", teams_supervisor_node)
super_builder.add_node("research_team", call_research_team)
super_builder.add_node("writing_team", call_writing_team)

super_builder.add_edge(START, "supervisor")
super_graph = super_builder.compile()
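
Note how each wrapper passes only the latest message into the team's graph and returns only the team's final message. The sketch below reproduces that boundary with plain functions so the effect on the shared history is visible; `fake_team` and `call_team` are hypothetical stand-ins, not LangGraph objects.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def fake_team(messages: List[Message]) -> List[Message]:
    """Hypothetical team: produces several internal messages, then a final answer."""
    task = messages[-1]["content"]
    return messages + [
        {"name": "search", "content": f"raw results for: {task}"},
        {"name": "web_scrapper", "content": "scraped page text"},
        {"name": "supervisor", "content": f"summary of findings on: {task}"},
    ]


def call_team(
    team: Callable[[List[Message]], List[Message]], name: str, history: List[Message]
) -> List[Message]:
    """Run a team on the last message only and append just its final message."""
    team_messages = team([history[-1]])   # the team never sees the full history
    final = team_messages[-1]["content"]  # internal chatter stays inside the team
    return history + [{"name": name, "content": final}]


history = [
    {"name": "user", "content": "earlier small talk"},
    {"name": "user", "content": "gold prices in 2025"},
]
history = call_team(fake_team, "research_team", history)
print(len(history), history[-1])
```

The trade-off is worth keeping in mind: the top-level history stays short, but a team only knows what the last message tells it, so that message has to carry enough context for the team to act on.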

Optionally, we can visualize the graph to see how it looks:

from IPython.display import Image, display

display(Image(super_graph.get_graph().draw_mermaid_png()))

We will get a graph similar to the one above.

Demonstration and Results

When given a complex user input, such as “research why the gold price has been increasing crazily in 2025. Come up with the reasons and compile a report”, the hierarchical system begins its execution.

Here it is in the code:

for s in super_graph.stream(
    {
        "messages": [
            ("user", "Research why the gold price has been increasing crazily in 2025. Come up with the reasons and compile a report. Do not try to generate any charts or plots. Just a report with text and numbers will do.")
        ]
    },
    {"recursion_limit": 1000},
):
    print(s)
    print("....")

The system demonstrates a dynamic flow, cycling between the Research Team (to gather data) and the Writing Team (to process and compile that data). The entire collaboration successfully resulted in two key outputs being saved locally:

  1. An outline (e.g., covering economic uncertainty, geopolitical tensions, and investment speculation).
  2. A final report that elaborates on the points established in the outline, discussing, for example, prices exceeding $4,000 per ounce due to intervening factors.
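
The chunks printed by the streaming loop above can be noisy. Assuming each chunk is a dict keyed by the node that just ran, with that node's state update as the value (this is the shape we would expect from the default streaming mode, but verify it against your LangGraph version), a small helper can reduce the stream to a readable trace. `summarize_chunk` and the sample chunks are hypothetical, and the samples use plain dicts in place of message objects.

```python
from typing import Any, Dict, List


def summarize_chunk(chunk: Dict[str, Any], width: int = 60) -> List[str]:
    """Turn one streamed chunk into short 'node: detail' lines."""
    lines = []
    for node, update in chunk.items():
        update = update or {}
        if "next" in update:
            lines.append(f"{node}: route -> {update['next']}")
        for message in update.get("messages", []):
            # Accept plain dicts as well as objects exposing a .content attribute
            if isinstance(message, dict):
                text = message.get("content", "")
            else:
                text = getattr(message, "content", str(message))
            lines.append(f"{node}: {text[:width]}")
    return lines


# Hypothetical chunks in the assumed shape, for illustration only
sample_stream = [
    {"supervisor": {"next": "research_team"}},
    {"research_team": {"messages": [{"content": "findings go here"}]}},
    {"supervisor": {"next": "__end__"}},
]

trace = [line for chunk in sample_stream for line in summarize_chunk(chunk)]
print("\n".join(trace))
```

In the real loop, the same helper would be called on each `s` in place of `print(s)`.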

If you truly wish to see the agent in action, please switch over to the video and take a look. Thanks.

Conclusion

By implementing a hierarchical multi-agent system using the LangGraph framework, we achieve a powerful architecture where specialized teams collaborate under coordinated supervision. This structure allows for complex tasks, like research and report writing, to be executed robustly, efficiently, and with inherent specialization, moving far beyond the capabilities of even the most sophisticated single agent.


Building Multi-Agent Systems with LangGraph — A Comprehensive Guide was originally published in Level Up Coding on Medium.


S Sankar | Sciencx (2025-11-21T15:06:24+00:00) Building Multi-Agent Systems with LangGraph — A Comprehensive Guide. Retrieved from https://www.scien.cx/2025/11/21/building-multi-agent-systems-with-langgraph-a-comprehensive-guide/
