Technical Documentation

SEO Article Generation
System Architecture

A production-ready, multi-agent LangGraph pipeline for generating SEO-optimized articles with real-time monitoring, crash recovery, and comprehensive quality assurance — deployed on AWS Lambda with Bedrock-powered Claude models.

LangGraph FastAPI AWS Bedrock Lambda PostgreSQL Claude Sonnet DynamoDB SSE Streaming
Contents
  1. Project Overview
  2. System Architecture & Pipeline
  3. Agent Node Catalogue
  4. Conditional Routing Logic
  5. Database Architecture
  6. AWS Infrastructure
  7. Real-Time Log Streaming (SSE)
  8. QA Scoring Engine
  9. Configuration System
  10. Tech Stack
  11. Design Decisions

Project Overview

The SEO Article Generation System automatically generates high-quality, SEO-optimized articles through a 7-node multi-agent AI pipeline. It combines real-time research via SerpAPI, content generation via Claude or GPT-4, comprehensive quality assurance, and crash recovery.

What the pipeline does

Input
Research
Outline
Write
QA Score
Article
Production-grade: Every component is async, all data flows through a strongly typed Pydantic state model, PostgreSQL checkpointing enables crash recovery, and real-time log streaming keeps every run observable.

System Architecture & Pipeline

The system uses a LangGraph StateGraph with conditional routing, per-node retry logic, and persistent PostgreSQL checkpointing.

INPUT { topic, word_count, language }
🚪 FastAPI Gateway POST /jobs/
🎭 ORCHESTRATOR ReAct
🔍 RESEARCH ReAct · 3 retries
📐 OUTLINE ReAct · 3 retries
✍️ WRITER Sequential · 3 retries
QA (score ≥ 80?) ReAct
PASS→ Output
FAIL < 80↺ Writer (max 3)
MAX REVISIONS→ Best-effort
📦 OUTPUT BUILDER Plain fn
OUTPUT { article, seo_metadata, score, links }

Every node that fails past its retry budget routes to an ERROR HANDLER terminal node. The entire state object (ArticleGenerationState) is the single source of truth — no side channels, no globals.
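A minimal sketch of that state object, assuming a TypedDict-style declaration — the exact fields and types of the real ArticleGenerationState may differ; only the field names mentioned elsewhere in this document are taken from the source, and note the append-only error reducer described under Design Decisions:

```python
import operator
from typing import Annotated, Any, Dict, List, Optional, TypedDict

class ArticleGenerationState(TypedDict, total=False):
    # Job input
    topic: str
    word_count: int
    language: str
    # Intermediate artifacts produced by the pipeline nodes
    serp_results: Optional[List[Dict[str, Any]]]
    outline: Optional[Dict[str, Any]]
    article_draft: Optional[str]
    qa_result: Optional[Dict[str, Any]]
    # Control fields driving conditional routing
    status: str
    retry_counts: Dict[str, int]
    revision_count: int
    # Append-only error log: operator.add merges lists instead of overwriting
    errors: Annotated[List[str], operator.add]
```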

Agent Node Catalogue

Six operational nodes plus one terminal error handler. Each has an independent retry budget.

Node Type Tools Max Iter
ORCHESTRATOR ReAct validate_input_tool, job_init_tool 10
RESEARCH ReAct serp_fetch_tool, theme_extractor_tool, faq_extractor_tool 10
OUTLINE ReAct outline_builder_tool, keyword_mapper_tool 10
WRITER Sequential article_writer_tool, linking_tool, metadata_generator_tool N/A
QA ReAct seo_validator_tool, score_calculator_tool 8
OUTPUT BUILDER Plain fn — (assembly only) N/A
ERROR HANDLER Plain fn — (terminal state) N/A
Why is the Writer Sequential, not ReAct? The article writer calls tools in a deterministic order: article_writer → linking_tool → metadata_generator. A ReAct agent loop adds no value here and risks non-determinism in section ordering. Direct sequential calls give precise control over section-by-section generation.
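A sketch of what that sequential Writer node might look like. The tool names come from the catalogue above, but their signatures and the stub bodies here are illustrative stand-ins, not the real implementations:

```python
# Stand-ins for the real tools (illustrative only).
def article_writer_tool(outline: dict, word_count: int) -> str:
    return f"# {outline['h1']}\n...draft targeting {word_count} words..."

def linking_tool(draft: str) -> str:
    # Resolve [LINK] placeholders left by the writer
    return draft.replace("[LINK]", "https://example.com")

def metadata_generator_tool(draft: str) -> dict:
    # Derive a title tag from the H1, capped at 60 chars
    return {"title_tag": draft.splitlines()[0].lstrip("# ")[:60]}

def writer_node(state: dict) -> dict:
    """Deterministic tool order — no ReAct loop needed."""
    draft = article_writer_tool(state["outline"], state["word_count"])
    draft = linking_tool(draft)
    metadata = metadata_generator_tool(draft)
    return {"article_draft": draft, "seo_metadata": metadata}
```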

Conditional Routing Logic

LangGraph conditional edges route the graph after each node using simple boolean checks on the shared state object.

After Node Condition Destination
RESEARCH retry_counts["research"] ≥ 3 → error_handler
serp_results present → outline
else ↺ research (retry)
OUTLINE retry_counts["outline"] ≥ 3 → error_handler
outline present → writer
WRITER retry_counts["writer"] ≥ 3 → error_handler
article_draft present → qa
QA status == "done" → output_builder
revision_count ≥ 3 → output_builder (best-effort)
score < 80 ↺ writer (revision loop)

The QA → Writer loop passes accumulated qa_result.issues and qa_result.suggestions directly in state — the Writer receives first-class feedback on each revision cycle.
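The QA row of the table above can be sketched as a plain routing function for a LangGraph conditional edge. Thresholds are hardcoded here for illustration; in the real system they come from hyperparams.yaml:

```python
def route_after_qa(state: dict) -> str:
    """Conditional edge after the QA node (sketch)."""
    if state.get("status") == "done":
        return "output_builder"
    if state.get("revision_count", 0) >= 3:
        return "output_builder"  # hard ceiling: publish best-effort
    if state["qa_result"]["score"] < 80:
        return "writer"          # revision loop with qa_result feedback
    return "output_builder"
```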

Database Architecture

PostgreSQL is used for two distinct purposes with completely separate code paths — both in the same database instance.

SQLAlchemy Async ORM
generation_jobs
job_id UUID PK
topic TEXT
word_count INT
status VARCHAR(20)
thread_id TEXT UNIQUE
seo_score INT
error_message TEXT
timestamps

generated_articles
id UUID PK
job_id UUID FK
final_article TEXT
seo_metadata JSON
keywords JSON
internal_links JSON
external_refs JSON
seo_score INT
LangGraph PostgresSaver
checkpoints
thread_id TEXT
checkpoint_ns TEXT
checkpoint_id TEXT
parent_checkpoint_id

checkpoint_writes
task writes per node

checkpoint_blobs
serialized state blobs

Key: seo-job:{job_id}
Links back to generation_jobs.thread_id

Crash Resume Pattern

# Pipeline: Research ✅ → Outline ✅ → Writer 💥 CRASH
# On restart — pass None as input to signal "resume":
graph.invoke(None, config={"configurable": {"thread_id": "seo-job:abc123"}})
# LangGraph finds the last checkpoint → skips Research + Outline → retries Writer only.
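A small helper, assuming the `seo-job:{job_id}` key convention shown in the schema above, makes the resume call explicit:

```python
def resume_config(job_id: str) -> dict:
    """Build the LangGraph config for resuming a crashed job.
    The 'seo-job:{job_id}' key links checkpoints back to generation_jobs.thread_id."""
    return {"configurable": {"thread_id": f"seo-job:{job_id}"}}

# Usage (sketch): graph.invoke(None, config=resume_config("abc123"))
# Passing None as input signals "resume from last checkpoint".
```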

AWS Infrastructure

The system is deployable as a serverless application using AWS SAM, Docker containers, and Lambda with API Gateway.

🌐
API Gateway (REST)
Routes all HTTP traffic. Endpoints: /jobs/*, /health, /docs
λ
Lambda Function
Docker container · Python 3.12 · FastAPI + Mangum · 3008 MB RAM · 15 min timeout
🗄️
DynamoDB
Table: SEO-article-write. Stores job metadata and generated articles. On-demand pricing.
🧠
AWS Bedrock
Model: anthropic.claude-sonnet-4-6. Used for all LLM calls via ChatBedrockConverse. Falls back to OpenAI if configured.
📋
CloudWatch Logs
Log group: /aws/lambda/SEOArticleGenerationAPI. 7-day retention. Real-time tailing via SAM CLI.
🔐
IAM Role
Least-privilege. Grants Lambda access to DynamoDB, Bedrock, and CloudWatch only.

LLM Provider Factory

The system uses a factory pattern to support both OpenAI and AWS Bedrock:

# .env
LLM_PROVIDER=bedrock

# Bedrock config
AWS_PROFILE=personal-dev-dev
BEDROCK_MODEL=anthropic.claude-sonnet-4-6
BEDROCK_WRITER_MODEL=anthropic.claude-sonnet-4-6

# Or switch to OpenAI with a single line
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-proj-...
Why Bedrock? AWS Bedrock provides access to Claude Sonnet models without direct Anthropic API calls — traffic stays within the AWS network, billing is unified, and IAM controls access rather than API key management.
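A hedged sketch of such a factory — the environment variable names follow the .env example above, while the constructor keyword arguments and the lazy-import structure are assumptions about the implementation:

```python
import os

def make_llm(role: str = "default"):
    """Return a chat model for the configured provider (sketch)."""
    provider = os.getenv("LLM_PROVIDER", "bedrock").lower()
    if provider == "bedrock":
        # Lazy import: only required when this branch is taken
        from langchain_aws import ChatBedrockConverse
        env = "BEDROCK_WRITER_MODEL" if role == "writer" else "BEDROCK_MODEL"
        model = os.getenv(env, "anthropic.claude-sonnet-4-6")
        return ChatBedrockConverse(model=model)
    if provider == "openai":
        from langchain_openai import ChatOpenAI
        return ChatOpenAI(model="gpt-4", api_key=os.environ["OPENAI_API_KEY"])
    raise ValueError(f"Unknown LLM_PROVIDER: {provider}")
```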

Serverless Deployment

Deployed via AWS SAM + Docker. UV package manager cuts build time from ~3 min (pip) to ~15 seconds.

File               Purpose
template.yaml      SAM/CloudFormation — Lambda, API Gateway, IAM roles
lambda_handler.py  Entry point wrapping FastAPI with Mangum
Dockerfile.lambda  Production image using UV · Python 3.12 base
samconfig.toml     SAM CLI config for dev/prod environments

Real-Time Log Streaming (SSE)

Every log record generated by the pipeline is broadcast to connected browsers via Server-Sent Events, giving you a live window into agent execution.

_BroadcastHandler (logging.Handler)
    │ attaches to root logger — captures every record
    ▼
_buffer: deque(maxlen=400) ← ring buffer, newest evicts oldest
    │
    ├──► _queues: List[asyncio.Queue] ← one Queue per SSE client
    │
    └──► loop.call_soon_threadsafe() ← safe cross-thread delivery

GET /logs/stream
   1. Replay entire _buffer → new client catches up instantly
   2. Listen on personal Queue (20s timeout)
   3. Emit: data: {"level":"INFO","name":"...","text":"..."}
   4. Send ": keep-alive" ping every 20s
   5. Remove queue on client disconnect
Why asyncio.to_thread? The LangGraph graph is synchronous. Running it directly would block the asyncio event loop, preventing SSE clients from receiving live log updates during article generation. asyncio.to_thread offloads the graph to a thread pool, keeping the event loop free for streaming.

QA Scoring Engine

The QA agent starts every article at a score of 100 and applies penalties for SEO issues. Pass threshold is 80.

keyword_in_h1
−15 pts · Primary keyword absent from H1
keyword_in_intro
−15 pts · Absent from first 500 chars
word_count_severe
−30 pts · <60% or >200% of target
word_count_significant
−20 pts · 60–75% of target
keyword_density_range
−10 pts · Density <0.5% or >3.0%
h2_keyword_coverage
−10 pts · Fewer than 2 H2s have keyword
title_tag_length
−10 pts · title_tag > 60 chars
meta_description_length
−10 pts · meta_desc > 160 chars
heading_hierarchy
−10 pts · H3 without parent H2
short_section
−5 pts · H2 section < 100 words
link_placeholders
−5 pts · Unresolved [LINK] markers
content_truncated
−5 pts · Article ends mid-sentence

If score < 80 and revision_count < 3, the QA agent routes back to the Writer with a structured qa_result containing issues and improvement suggestions. After 3 revisions, the best-effort article is published regardless.
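The penalty table above can be sketched as a simple scoring function. The check names and point values come from the table; the function shape and clamping are assumptions:

```python
PENALTIES = {
    "keyword_in_h1": 15, "keyword_in_intro": 15,
    "word_count_severe": 30, "word_count_significant": 20,
    "keyword_density_range": 10, "h2_keyword_coverage": 10,
    "title_tag_length": 10, "meta_description_length": 10,
    "heading_hierarchy": 10, "short_section": 5,
    "link_placeholders": 5, "content_truncated": 5,
}

def score_article(failed_checks: list[str], pass_score: int = 80) -> tuple[int, bool]:
    """Start at 100, subtract one penalty per failed check, clamp at 0."""
    score = max(0, 100 - sum(PENALTIES[c] for c in failed_checks))
    return score, score >= pass_score
```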

Configuration System

All tunable parameters live in three YAML files under config/ — no hardcoded values in agent code.

File             Contains                Key params
hyperparams.yaml Numerical thresholds max_retries=3, max_revisions=3, qa.pass_score=80, word_count_min=500
settings.yaml App settings LLM model, temperature, CORS origins, logging format
prompts.yaml Agent system prompts One prompt per node: research, outline, writer, qa
from config.config import cfg

cfg.hyperparams.pipeline.max_retries        # 3
cfg.hyperparams.qa.pass_score               # 80
cfg.hyperparams.article.word_count_default  # 1500
cfg.prompts.agents.research                 # system prompt string
cfg.settings.app.title                      # "SEO Article Generation API"
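The dot-access style shown above can be approximated with a small recursive wrapper over parsed YAML — a sketch only; the real loader reportedly uses dataclasses:

```python
from types import SimpleNamespace

def to_ns(obj):
    """Recursively wrap nested dicts (e.g. parsed YAML) for cfg.a.b.c access."""
    if isinstance(obj, dict):
        return SimpleNamespace(**{k: to_ns(v) for k, v in obj.items()})
    return obj

# Illustrative values matching hyperparams.yaml above
hyperparams = to_ns({"pipeline": {"max_retries": 3}, "qa": {"pass_score": 80}})
```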

Tech Stack

Agent Framework
LangGraph
Stateful graph, conditional edges, checkpoint persistence
LLM
Claude Sonnet 4.6
Via AWS Bedrock (or OpenAI GPT-4 fallback)
API
FastAPI
Async REST endpoints, CORS, SSE log streaming
Database
PostgreSQL 16
ORM tables + LangGraph checkpoint tables
ORM
SQLAlchemy 2.0
Async CRUD with asyncpg driver
Checkpointing
PostgresSaver
LangGraph crash recovery and pipeline resume
Validation
Pydantic v2
All agent I/O models, auto-truncation validators
Search Data
SerpAPI
Real SERP results with mock fallback for offline tests
AWS Runtime
Lambda + Mangum
Serverless FastAPI via ASGI adapter
Build
UV + Docker
10–100× faster builds; reproducible container images
Testing
Pytest + asyncio
Unit, integration, and end-to-end graph tests
Config
YAML + dataclasses
Decoupled hyperparams, prompts, and app settings

Design Decisions

Eight core principles that guided every architectural choice in this system.

Principle 01
State as Single Source of Truth
Every node reads from and writes to ArticleGenerationState only. No global variables, no side-channel communication between agents.
Principle 02
Append-Only Error Reducer
errors: Annotated[List[str], operator.add] — any node can log failures without erasing previous errors from other nodes.
Principle 03
Independent Per-Node Retry Budgets
retry_counts["research/outline/writer/qa"] — each agent fails independently. One node retrying doesn't penalise another node's budget.
Principle 04
QA Revision Loop with Hard Ceiling
max_revisions = 3. After that, publish best-effort. Prevents infinite writer↔QA cycles that could run forever on marginal content.
Principle 05
Pydantic Validators as Safety Nets
SeoMetadata auto-truncates over-length strings. QAResult score is range-clamped. No crashes on LLM output drift or hallucination.
Principle 06
Sequential Writer, Not ReAct
Direct tool calls: article_writer → linking_tool → metadata_generator. Deterministic section generation — agent loop adds no value and risks ordering bugs.
Principle 07
SSE + asyncio.to_thread
Graph runs in a thread pool. Event loop stays free. Connected browsers see live logs while pipeline runs — zero blocking.
Principle 08
Two DB Layers, One Database
SQLAlchemy ORM owns job/article records. PostgresSaver owns checkpoint blobs. Same Postgres instance, cleanly separated concerns.

Full Architecture Specification

Download the complete technical document — all diagrams, decision tables, and API specifications in Markdown format.
