1. Prompt Chaining - Building Step-by-Step AI Workflows
Prompt Chaining is the Assembly Line for AI. Build complex results from simple, specialized steps.
Prompt Chaining — Agentic Design Pattern Series
Prompt Chaining is a foundational pattern where you link multiple LLM calls together, using the output of one step as the input for the next, to create sophisticated, multi-step workflows.
This pattern is your starting point for moving beyond single-prompt toys to building reliable AI-powered automations. By breaking down a complex task (like "research and write a report") into a sequence of smaller, more manageable sub-tasks ("find sources," "extract key points," "draft the report," "format the output"), you dramatically increase the quality and reliability of the final result. For a business, this can mean turning a 50%-reliable AI feature into a 95%-reliable one.
📺 Diagram and Videos
Sequential Chain with LangChain (This video provides a great conceptual overview and code examples for sequential chains).
No Code Implementation of Chains (A clear, concise explanation of the core concept).
🚩 What Is Prompt Chaining?
"The art of advanced prompting isn't about crafting one perfect, monolithic prompt. It's about knowing how to break a problem down and build a 'prompt assembly line' where each station does one thing perfectly."
Prompt Chaining is the technique of creating a sequence of LLM calls where the output of one call becomes the direct input for the next. This creates a logical, step-by-step workflow. The intended outcome is to achieve a complex or high-quality result that would be difficult or unreliable to obtain with a single, massive prompt.
🏗 Use Cases
Scenario: A marketing team needs to repurpose a long, technical whitepaper into a short, engaging Twitter thread. Using a single prompt like "Turn this 10-page paper into a Twitter thread" often yields generic or inaccurate results.
Applying the Pattern:
Step 1 (Summarize): An initial prompt extracts the 3-5 most critical takeaways from the whitepaper.
Step 2 (Re-Angle for Audience): The output (the key takeaways) is fed into a second prompt that rewrites them in a punchy, non-technical tone suitable for a general audience on Twitter.
Step 3 (Format as Thread): The rewritten points are then passed to a final prompt that formats them into a numbered Twitter thread, adding relevant hashtags and a call-to-action.
Outcome: The final Twitter thread is high-quality, accurate, and properly formatted, a result the chain delivers far more consistently than a single prompt (a minimal sketch of the three steps follows below).
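As a minimal sketch, the scenario above maps to three chained calls. It assumes a call_llm(prompt) helper that wraps your LLM provider's API; a simulated version of such a helper appears in the sample code section below.
Python
def whitepaper_to_thread(whitepaper_text):
    # Step 1 (Summarize): pull out the 3-5 most critical takeaways.
    takeaways = call_llm(
        f"Extract the 3-5 most critical takeaways from this whitepaper:\n{whitepaper_text}"
    )
    # Step 2 (Re-Angle): rewrite them for a general, non-technical audience.
    punchy_points = call_llm(
        f"Rewrite these points in a punchy, non-technical tone for a general audience:\n{takeaways}"
    )
    # Step 3 (Format): turn them into a numbered thread with hashtags and a CTA.
    return call_llm(
        f"Format these points as a numbered Twitter thread with relevant hashtags and a call-to-action:\n{punchy_points}"
    )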
General Use: This pattern is perfect for any multi-step process that must be executed in a specific order.
Summarize-then-Translate: The first prompt summarizes a long article, the second translates that summary into another language.
Extract-then-Format: The first pulls out key data points (names, dates, locations); the second formats them into JSON or a table.
Brainstorm-then-Elaborate: The first creates a list of ideas; the next expands on a selected one.
💻 Sample Code / Pseudocode
This Python example (with a simulated LLM call) demonstrates a simple chain for extracting a topic and then writing an explanation.
Python
def call_llm(prompt):
    # In a real application, this would be an API call to an LLM provider.
    # For this example, we simulate the response.
    print(f"--- Calling LLM with prompt: ---\n{prompt[:100]}...\n")
    if "Extract the key topic" in prompt:
        return "Quantum Computing"
    elif "Write a 3-paragraph explanation" in prompt:
        return "Quantum computing is a revolutionary field... [full explanation here] ..."
    return "Error: Unknown prompt."

def run_extract_and_explain_chain(long_text):
    """
    A simple chain with two steps:
    1. Extract the main topic from a long text.
    2. Write an explanation of that topic.
    """
    # Step 1: First LLM call
    prompt_1 = f"Extract the key topic from this text: {long_text}"
    topic = call_llm(prompt_1)
    print(f"--- Output of Step 1: ---\n{topic}\n")

    # Step 2: Second LLM call, using the output from Step 1 as input
    prompt_2 = f"Write a 3-paragraph explanation of the topic: {topic}"
    explanation = call_llm(prompt_2)
    print(f"--- Output of Step 2 (Final Result): ---\n{explanation}\n")
    return explanation

# --- Execute the chain ---
initial_input = "A long article discussing the principles of superposition and entanglement..."
run_extract_and_explain_chain(initial_input)
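For reference, a real call_llm can be a thin wrapper over your provider's SDK. Here is a hedged sketch using the OpenAI Python client; the model name is an assumption, and any chat-completion model or provider would work the same way.
Python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def call_llm(prompt):
    # Each step in the chain is one self-contained completion request.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; substitute your own
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content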
🟢 Pros
Simplicity: Easy to implement and understand.
Reliability: Breaking tasks into smaller, focused sub-tasks increases the chances of success.
Specialization: Each prompt can be finely optimized for its immediate purpose, improving overall quality.
🔴 Cons
Latency: Sequential steps may lead to slower total response time as each step must wait for the previous one.
Error Propagation: Early mistakes negatively affect all following outputs.
Rigidity: Fixed flows cannot dynamically adapt based on the input.
Token Usage: Context and outputs accumulate, which can result in high token consumption for long chains.
🛑 Anti-Patterns (Mistakes to Avoid)
Overly Long Chains: Avoid chaining more than 4-5 steps without an intermediate summarization or data reduction step. This can lead to a loss of focus and excessive token costs.
Ignoring Step Validation: Never assume the output of a step is correct. If one step fails to produce a valid output (e.g., malformed JSON), the entire chain breaks.
Monolithic Design: Don't build one massive, rigid chain for everything. Design smaller, reusable chains that can be combined.
Unrelated Task Chaining: Don't chain sub-tasks that are not logically dependent. If tasks can be run independently, use the Parallelization pattern instead.
🛠 Best Practices
Validate Between Steps: Always parse and validate the output of each step before passing it to the next. For structured data, use a validation library like Pydantic (see the sketch after this list).
Summarize for Long Chains: If a chain has many steps, include a "summarize context" step periodically to keep the core information without bloating the context window.
Modularize Prompts: Store each prompt as a separate template. This makes them easier to test, version, and reuse across different chains.
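As a concrete illustration of between-step validation, here is a minimal sketch using Pydantic. The ExtractedTopic schema and the extraction prompt are hypothetical, and call_llm is the helper from the sample code above.
Python
from pydantic import BaseModel, ValidationError

class ExtractedTopic(BaseModel):
    # Hypothetical schema for one step's structured output.
    topic: str
    key_points: list[str]

def validated_extract_step(text):
    raw_output = call_llm(
        "Extract the key topic and key points from this text as JSON "
        f"with fields 'topic' and 'key_points': {text}"
    )
    try:
        # Parse and validate before the next step ever sees the data.
        return ExtractedTopic.model_validate_json(raw_output)
    except ValidationError as err:
        # Fail fast (or retry) instead of silently breaking the chain.
        raise ValueError(f"Step produced invalid output: {err}")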
🧪 Sample Test Plan
Unit Tests: Test each prompt in the chain individually. Mock the LLM call and provide a known input to the prompt template to ensure it formats correctly.
Python
# Example using pytest for a single prompt template
def test_summarize_prompt():
    prompt_template = "Summarize this text: {text}"
    formatted_prompt = prompt_template.format(text="This is a test.")
    assert formatted_prompt == "Summarize this text: This is a test."
End-to-End (Integration) Tests: Test the entire chain with a set of golden "input/output" pairs. Provide a real input and assert that the final output contains the expected keywords, structure, or information.
Robustness Tests: Actively try to break the chain. Feed it edge-case inputs like empty strings, very long documents, text in a different language, or irrelevant content to see how it handles failures (a parametrized sketch follows this list).
Performance Tests: Measure the two most important metrics: latency and token cost. Run the chain 50-100 times with representative inputs and log the average time and tokens consumed to identify bottlenecks.
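One way to structure those robustness tests is a parametrized pytest suite. This is a sketch under the assumption that the chain under test is the run_extract_and_explain_chain function from the sample code; the edge cases mirror the list above.
Python
import pytest

@pytest.mark.parametrize("edge_case", [
    "",                              # empty string
    "word " * 50_000,                # very long document
    "Ceci n'est pas une pipe.",      # text in a different language
    "Lorem ipsum dolor sit amet.",   # irrelevant content
])
def test_chain_survives_edge_cases(edge_case):
    # The chain should return a string (or raise a clear, typed error),
    # never crash with an unhandled exception.
    result = run_extract_and_explain_chain(edge_case)
    assert isinstance(result, str)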
🤖 LLM as Judge/Evaluator
Recommendation: Use a powerful, separate LLM (like GPT-4 or Gemini 1.5 Pro) as an impartial "judge" to evaluate the quality of your chain's final output.
How to Apply: Create a "scoring prompt" that defines a rubric. Feed it the initial query and the final output of your chain, and ask it to score the result from 1-10 on criteria like Accuracy, Coherence, Format Adherence, and Helpfulness. This is a powerful way to automate A/B testing between two versions of your chain. A minimal sketch follows.
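Here is a minimal sketch of such a scoring prompt, assuming the same call_llm helper from the sample code; the rubric wording and JSON shape are illustrative.
Python
JUDGE_PROMPT = """You are an impartial evaluator.
Original request: {query}
Chain output: {output}

Score the output from 1 to 10 on each criterion:
- Accuracy
- Coherence
- Format Adherence
- Helpfulness
Return JSON like {{"accuracy": 8, "coherence": 9, "format_adherence": 7, "helpfulness": 8}}."""

def judge_output(query, output):
    # Use a separate, strong model as the judge to avoid self-grading bias.
    return call_llm(JUDGE_PROMPT.format(query=query, output=output))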
🗂 Cheatsheet
Variant: Summarize-Translate
When to Use: Creating multilingual content from a single source.
Key Tip: Keep the intermediate summary concise and fact-focused to ensure accurate translation.
Variant: Extract-Format
When to Use: When you need structured data (like JSON or CSV) from unstructured text.
Key Tip: Always validate the fields and data types in the final output to catch errors early.
Variant: Brainstorm-Elaborate
When to Use: Ideation, creative writing, and strategic planning.
Key Tip: Use a separate step to rank or filter the brainstormed ideas before elaborating on the best ones.
Relevant Content
LangChain Documentation on Chains: https://python.langchain.com/docs/modules/chains/ (The canonical open-source implementation of this pattern).
Foundational Concept: This pattern is a direct application of the pipeline design pattern in software engineering, adapted for LLM-based workflows.
📅 Next Pattern
Stay tuned for our next article in the series: Design Pattern: Routing — Building Smart AI Workflows That Can Make Decisions.