codex-pdf
Structured PDF extraction API that turns complex files into consistent JSON.
codexPDF is Print with Synergy's authoritative, read-only PDF facts reference. Extract once, validate against published schemas, and let downstream tools consume stable document facts.
codexPDF centralizes extraction into one authoritative facts engine so every downstream system can operate from the same versioned truth.
CodexDocument is the stable root contract for extracted PDF facts, independent of consumer product logic.
Published schemas under versioned paths let every payload be validated in CI and runtime workflows.
codexPDF extracts facts only; no edits, no rendering mutations, no hidden policy side effects.
Use the extract, probe, validate, and parity commands to wire codexPDF into local tooling and automation.
Projection-based parity checks compare codex output against baseline systems and highlight drift.
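As a rough illustration of what a projection-based parity check can look like: each system's output is reduced to a small shared view, and only that view is compared. This is a minimal sketch, not codexPDF's actual parity command. Every name and field below (`project_codex`, `project_baseline`, `find_drift`, `pages`, `num_pages`, `sizes`) is a hypothetical stand-in, since the real payload shapes are defined by the published schemas.

```python
from typing import Any

# Hypothetical field names throughout: real codexPDF and baseline payloads
# will differ. Only the compare-after-projection pattern is the point here.


def project_codex(doc: dict[str, Any]) -> dict[str, Any]:
    """Reduce an (assumed) codex-style document to the shared view."""
    pages = doc.get("pages", [])
    return {
        "page_count": len(pages),
        "page_sizes": [(p["width"], p["height"]) for p in pages],
    }


def project_baseline(report: dict[str, Any]) -> dict[str, Any]:
    """Reduce an (assumed) baseline report to the same shared view."""
    return {
        "page_count": report["num_pages"],
        "page_sizes": [tuple(size) for size in report["sizes"]],
    }


def find_drift(left: dict[str, Any], right: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return every projected key whose values differ, as (left, right) pairs."""
    keys = sorted(left.keys() | right.keys())
    return {k: (left.get(k), right.get(k)) for k in keys if left.get(k) != right.get(k)}


# Made-up sample data: the second page size disagrees between the two systems.
codex_doc = {"pages": [{"width": 612.0, "height": 792.0}, {"width": 612.0, "height": 792.0}]}
baseline_report = {"num_pages": 2, "sizes": [[612.0, 792.0], [595.0, 842.0]]}

drift = find_drift(project_codex(codex_doc), project_baseline(baseline_report))
for key, (ours, theirs) in drift.items():
    print(f"drift in {key}: codex={ours} baseline={theirs}")
```

Comparing projections rather than raw payloads keeps the check independent of how each system structures its output, so only the facts both sides claim to report are held to parity.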
Downstream tools can adapt codex output through thin adapters instead of re-parsing every document.
The Python package ships typed models and contract-aware primitives for reliable integrations.
Full source under AGPL-3.0-or-later with no closed black-box extraction layer.
Captures the effective printed resolution of every embedded image, accounting for placement scale rather than declared pixel dimensions alone. A 300 DPI image enlarged 2× reports 150 DPI. Resolution is gathered per placement across all pages.
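The arithmetic behind effective resolution is pixels divided by printed inches. The sketch below assumes the default PDF user space of 72 points per inch and an unrotated, unskewed placement. `ImagePlacement` and `effective_dpi` are hypothetical names for illustration, not part of the codexPDF package.

```python
from dataclasses import dataclass

POINTS_PER_INCH = 72.0  # default PDF user space: 1 point = 1/72 inch


@dataclass(frozen=True)
class ImagePlacement:
    """Hypothetical record for one placement of an image on a page."""

    pixel_width: int
    pixel_height: int
    placed_width_pt: float   # printed width on the page, in points
    placed_height_pt: float  # printed height on the page, in points


def effective_dpi(p: ImagePlacement) -> tuple[float, float]:
    """Return (horizontal, vertical) effective DPI: pixels per printed inch."""
    x_dpi = p.pixel_width / (p.placed_width_pt / POINTS_PER_INCH)
    y_dpi = p.pixel_height / (p.placed_height_pt / POINTS_PER_INCH)
    return x_dpi, y_dpi


# A 600 x 600 px image is 2 in wide at 300 DPI. Enlarged 2x, it is placed
# at 4 in (288 pt), so the same pixels are spread over twice the distance.
placement = ImagePlacement(600, 600, 288.0, 288.0)
print(effective_dpi(placement))  # → (150.0, 150.0)
```

Because the result depends on the placed size, the same image reused at two different scales yields two different effective resolutions, which is why the value has to be computed per placement rather than per image.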
Open source · managed hosting
A toolkit of focused, standalone PDF utilities — extraction, preflight, viewing, assembly, imposition planning, and an asset store. Each one plugs into the prepress workflow you already run. Use the open source yourself, or let us host any single tool for you on work.withsynergy.io.
Structured PDF extraction API that turns complex files into consistent JSON.
Programmatic PDF assembly — a deterministic API build step for rewriting and generating print-ready PDFs.
Detection-only PDF preflight engine — 500+ checks plus the PDF/X-4 conformance suite.
Embeddable PDF viewer with separations, total area coverage (TAC), layers, and annotation overlays.
PDF assay and metadata reporting — surface what's actually inside the file.
WYSIWYG canvas editor for label and packaging artwork — PDF/X-4 output, flexo support, and a full create-to-RIP workflow.
Stateless imposition-planning solver — step-and-repeat, gang, and true-shape nesting.
Content-addressed digital-asset plane — versioned blobs, a presigned data plane, and on-prem agent recall.