Mastering LangGraph: Build a Dynamic Text Analysis Pipeline with AI
This tutorial walks through building a modular text analysis pipeline with LangGraph, incorporating classification, entity extraction, summarization, sentiment analysis, and advanced conditional flow control.
Introduction to LangGraph
LangGraph is a robust framework from LangChain designed to create stateful, multi-actor applications using large language models (LLMs). It enables developers to architect sophisticated AI agents by structuring workflows as graphs, similar to blueprints an architect uses to design a building. This graph-based approach allows seamless connection and coordination of various AI capabilities.
Key Features
- State Management: Maintain persistent information throughout interactions.
- Flexible Routing: Define complex flows between components.
- Persistence: Save and resume workflows as needed.
- Visualization: Visualize the agent's architecture for better understanding.
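To make the state-management idea concrete before diving into LangGraph itself: each node in a LangGraph workflow returns only the state keys it updates, and the graph merges those partial updates into the shared state. The following is a minimal pure-Python sketch of that merge semantics, with a hypothetical `summarize_stub` node standing in for an LLM call:

```python
from typing import TypedDict

class State(TypedDict):
    text: str
    summary: str

# A stub node: returns only the keys it updates, like a LangGraph node.
def summarize_stub(state: State) -> dict:
    return {"summary": state["text"][:20]}

def run_sequence(state: dict, nodes) -> dict:
    """Apply each node in order, merging its partial update into the state."""
    for node in nodes:
        state = {**state, **node(state)}
    return state

final = run_sequence(
    {"text": "LangGraph builds stateful agent workflows.", "summary": ""},
    [summarize_stub],
)
print(final["summary"])
```

This is only an illustration of the update model; LangGraph handles the merging (and routing between nodes) for you, as the sections below show.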
Setting Up the Environment
To get started, install the necessary packages:
!pip install langgraph langchain langchain-openai python-dotenv
Obtain and configure your OpenAI API key:
import os
from dotenv import load_dotenv
load_dotenv()
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')
Test your setup by invoking a simple prompt:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o-mini")
response = llm.invoke("Hello! Are you working?")
print(response.content)
Building the Text Analysis Pipeline
This tutorial demonstrates a pipeline with three stages: text classification, entity extraction, and text summarization.
Defining Agent Memory
The agent's state keeps track of the text and analysis results:
from typing import TypedDict, List
class State(TypedDict):
    text: str
    classification: str
    entities: List[str]
    summary: str
Implementing Core Capabilities
Each capability is a function that processes the state:
1. Classification Node
from langchain_core.prompts import PromptTemplate
from langchain_core.messages import HumanMessage

def classification_node(state: State):
    # Classify the text into News, Blog, Research, or Other
    prompt = PromptTemplate(
        input_variables=["text"],
        template="Classify the following text into one of the categories: News, Blog, Research, or Other.\n\nText:{text}\n\nCategory:"
    )
    message = HumanMessage(content=prompt.format(text=state["text"]))
    classification = llm.invoke([message]).content.strip()
    return {"classification": classification}
2. Entity Extraction Node
def entity_extraction_node(state: State):
    prompt = PromptTemplate(
        input_variables=["text"],
        template="Extract all the entities (Person, Organization, Location) from the following text. Provide the result as a comma-separated list.\n\nText:{text}\n\nEntities:"
    )
    message = HumanMessage(content=prompt.format(text=state["text"]))
    entities = llm.invoke([message]).content.strip().split(", ")
    return {"entities": entities}
3. Summarization Node
def summarization_node(state: State):
    prompt = PromptTemplate(
        input_variables=["text"],
        template="Summarize the following text in one short sentence.\n\nText:{text}\n\nSummary:"
    )
    message = HumanMessage(content=prompt.format(text=state["text"]))
    summary = llm.invoke([message]).content.strip()
    return {"summary": summary}
Constructing the Workflow
Connect the nodes in sequence:
from langgraph.graph import StateGraph, END

workflow = StateGraph(State)
workflow.add_node("classification_node", classification_node)
workflow.add_node("entity_extraction", entity_extraction_node)
workflow.add_node("summarization", summarization_node)
workflow.set_entry_point("classification_node")
workflow.add_edge("classification_node", "entity_extraction")
workflow.add_edge("entity_extraction", "summarization")
workflow.add_edge("summarization", END)
app = workflow.compile()
Testing the Pipeline
Analyze sample text:
sample_text = """ OpenAI has announced the GPT-4 model, which is a large multimodal model that exhibits human-level performance on various professional benchmarks. It is developed to improve the alignment and safety of AI systems. Additionally, the model is designed to be more efficient and scalable than its predecessor, GPT-3. The GPT-4 model is expected to be released in the coming months and will be available to the public for research and development purposes. """
state_input = {"text": sample_text}
result = app.invoke(state_input)
print("Classification:", result["classification"])
print("\nEntities:", result["entities"])
print("\nSummary:", result["summary"])
Extending the Pipeline with Sentiment Analysis
Add sentiment analysis by expanding the state and including a new node:
class EnhancedState(TypedDict):
    text: str
    classification: str
    entities: List[str]
    summary: str
    sentiment: str
def sentiment_node(state: EnhancedState):
    prompt = PromptTemplate(
        input_variables=["text"],
        template="Analyze the sentiment of the following text. Is it Positive, Negative, or Neutral?\n\nText:{text}\n\nSentiment:"
    )
    message = HumanMessage(content=prompt.format(text=state["text"]))
    sentiment = llm.invoke([message]).content.strip()
    return {"sentiment": sentiment}
enhanced_workflow = StateGraph(EnhancedState)
enhanced_workflow.add_node("classification_node", classification_node)
enhanced_workflow.add_node("entity_extraction", entity_extraction_node)
enhanced_workflow.add_node("summarization", summarization_node)
enhanced_workflow.add_node("sentiment_analysis", sentiment_node)
enhanced_workflow.set_entry_point("classification_node")
enhanced_workflow.add_edge("classification_node", "entity_extraction")
enhanced_workflow.add_edge("entity_extraction", "summarization")
enhanced_workflow.add_edge("summarization", "sentiment_analysis")
enhanced_workflow.add_edge("sentiment_analysis", END)
enhanced_app = enhanced_workflow.compile()
enhanced_result = enhanced_app.invoke({"text": sample_text})
print("Classification:", enhanced_result["classification"])
print("\nEntities:", enhanced_result["entities"])
print("\nSummary:", enhanced_result["summary"])
print("\nSentiment:", enhanced_result["sentiment"])
Using Conditional Edges for Dynamic Routing
LangGraph supports conditional routing to execute nodes based on the data state.
def route_after_classification(state: EnhancedState) -> bool:
    category = state["classification"].lower()
    # Only News and Research texts go through entity extraction
    return category in ["news", "research"]
conditional_workflow = StateGraph(EnhancedState)
conditional_workflow.add_node("classification_node", classification_node)
conditional_workflow.add_node("entity_extraction", entity_extraction_node)
conditional_workflow.add_node("summarization", summarization_node)
conditional_workflow.add_node("sentiment_analysis", sentiment_node)
conditional_workflow.set_entry_point("classification_node")
conditional_workflow.add_conditional_edges("classification_node", route_after_classification, path_map={
    True: "entity_extraction",
    False: "summarization"
})
conditional_workflow.add_edge("entity_extraction", "summarization")
conditional_workflow.add_edge("summarization", "sentiment_analysis")
conditional_workflow.add_edge("sentiment_analysis", END)
conditional_app = conditional_workflow.compile()
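Before spending API calls on the full pipeline, the routing decision itself can be exercised in isolation. The sketch below replays the router and `path_map` lookup against stub classification labels (no LLM involved), showing which node each category would be sent to:

```python
def route_after_classification(state: dict) -> bool:
    # Same decision as the pipeline's router: News/Research branch to entity extraction
    return state["classification"].lower() in ["news", "research"]

path_map = {True: "entity_extraction", False: "summarization"}

# Stub states standing in for the classifier's output
for label in ["News", "Blog", "Research", "Other"]:
    next_node = path_map[route_after_classification({"classification": label})]
    print(label, "->", next_node)
```

This mirrors what `add_conditional_edges` does internally: the router's return value is looked up in `path_map` to pick the next node.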
# Test with news text
news_text = """
OpenAI released the GPT-4 model with enhanced performance on academic and professional tasks. It's seen as a major breakthrough in alignment and reasoning capabilities.
"""
result = conditional_app.invoke({"text": news_text})
print("Classification:", result["classification"])
print("Entities:", result.get("entities", "Skipped"))
print("Summary:", result["summary"])
print("Sentiment:", result["sentiment"])
# Test with blog text
blog_text = """
Here's what I learned from a week of meditating in silence. No phones, no talking—just me, my breath, and some deep realizations.
"""
result = conditional_app.invoke({"text": blog_text})
print("Classification:", result["classification"])
print("Entities:", result.get("entities", "Skipped (not applicable)"))
print("Summary:", result["summary"])
print("Sentiment:", result["sentiment"])
This conditional logic allows the agent to skip unnecessary steps, improving efficiency and cost-effectiveness while adapting to the input context.
LangGraph's graph-based approach provides a flexible, modular, and extensible way to build intelligent AI agents for complex text processing tasks.