Architecture¶

md2pydantic follows a Seek, Clean, Validate pipeline. Each stage is handled by a dedicated module.

Pipeline¶

Markdown input
    │
    ▼
┌──────────┐
│ Scanner  │  Identify candidate blocks
└────┬─────┘
     │
     ▼
┌──────────────┐
│ Transformer  │  Convert to Python dicts
└────┬─────────┘
     │
     ▼
┌───────────┐
│ Validator │  Validate against Pydantic model
└────┬──────┘
     │
     ▼
Typed model instances

Stage 1: Scanner¶

Module: parser.py

Uses regex and heuristics to identify candidate blocks within the Markdown input:

Fenced code blocks (```json, ```yaml)
Inline JSON objects ({...})
Pipe-delimited Markdown tables

Handles LLM-specific quirks like unclosed fences, trailing prose, extra backticks, tilde fences (~~~), and nested structures.

Output: CodeBlock or TableBlock objects with content and source location metadata.

Stage 2: Transformer¶

Module: transformers.py

Converts raw extracted content into Python dictionaries:

JSON blocks -- parses JSON with recovery for trailing commas, single quotes, unquoted keys, and truncated output
YAML blocks -- parses YAML via pyyaml
Tables -- converts rows to dicts using headers as keys

Also performs pre-processing: boolean word mapping (Yes/No to True/False) and null sentinel replacement.

Output: Python dict objects ready for Pydantic.

Stage 3: Validator¶

Module: validators.py

Passes dictionaries to the user-defined Pydantic v2 model. Leverages Pydantic's native type coercion (str to int, str to float, str to datetime, etc.).

Returns either a validated model instance or structured error details with field-level information.

Output: ValidationResult with typed data or errors.

Orchestrator¶

Module: converter.py

MDConverter is the public API that orchestrates the full pipeline. It is the only class users interact with directly.

Module Map¶

Module	Role
`converter.py`	Public API (`MDConverter`)
`parser.py`	Block detection and scanning
`transformers.py`	Raw content to Python dicts
`validators.py`	Pydantic model validation
`models.py`	Internal types and exceptions