An Overview of ByteDance’s Document Parsing Model, Dolphin

Melani Maheswaran

ByteDance's Dolphin is a new multimodal document image parsing model that uses Heterogeneous Anchor Prompting to improve on existing OCR-based and VLM-based solutions.