Object analysis
Before anything is reduced, the engine walks the PDF object graph: pages, content streams, resource dictionaries, font tables, image XObjects, structure trees. The pour can't shrink what it doesn't first understand, and skipping this step is what produces the broken outputs cheap compressors leave behind.
- Outline tree, named destinations, internal links
- Tagged structure (PDF/UA accessibility tree)
- Annotations, redactions, form fields
- Embedded color profiles (ICC) and OutputIntents
- Builds an internal map of every reusable resource
- Detects images and fonts duplicated across pages
- Identifies orphan objects safe to drop later
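
To make the last three points concrete, here is a minimal sketch of that kind of pass. It is not the engine's actual code. It assumes a toy object table (`OBJECTS`, a hypothetical stand-in for a parsed PDF) in which each entry holds a kind, a payload, and the ids it references. It walks from the catalog, hashes image and font payloads to find duplicates, and reports anything unreachable as an orphan.

```python
import hashlib
from collections import defaultdict

# Hypothetical toy model, not a real PDF parser:
# object id -> (kind, payload bytes, ids of referenced objects).
OBJECTS = {
    1: ("catalog", b"", [2]),
    2: ("pages", b"", [3, 4]),
    3: ("page", b"", [5, 7]),
    4: ("page", b"", [6, 7]),
    5: ("image", b"logo-bytes", []),
    6: ("image", b"logo-bytes", []),   # same bytes as object 5
    7: ("font", b"font-program", []),  # shared by both pages
    8: ("image", b"old-draft", []),    # nothing references this
}


def reachable(objects, root):
    """Return the ids reachable from root by following references."""
    seen = set()
    stack = [root]
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue  # reference graphs can share nodes or contain cycles
        seen.add(oid)
        stack.extend(objects[oid][2])
    return seen


def analyze(objects, root=1):
    """Map reusable resources, group byte-identical ones, list orphans."""
    live = reachable(objects, root)
    by_digest = defaultdict(list)
    for oid in sorted(live):
        kind, payload, _ = objects[oid]
        if kind in ("image", "font"):
            # Hash kind and payload together so an image never matches a font.
            digest = hashlib.sha256(kind.encode() + b"\0" + payload).hexdigest()
            by_digest[digest].append(oid)
    duplicates = sorted(ids for ids in by_digest.values() if len(ids) > 1)
    orphans = sorted(set(objects) - live)
    return {"live": live, "duplicates": duplicates, "orphans": orphans}


report = analyze(OBJECTS)
print(report["duplicates"])  # [[5, 6]]
print(report["orphans"])     # [8]
```

The sketch only reports what it finds. Merging the duplicates and dropping the orphan happen later, once the whole graph is known. A real pass would also have to compare stream dictionaries and decoded data, since two images can hold identical pixels behind different encodings.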