Leaderboard
Our extraction benchmark for frontier models
Extract API
| Rank | Model | Platform | Accuracy |
|---|---|---|---|
| 🥇 1 | retab-large | Retab | 97.2% |
| 🥈 2 | Extend | Extend | 91.4% |
| 🥉 3 | Landing | Landing | 89.2% |
| 4 | LlamaParse | LlamaIndex | 87.8% |
| 5 | retab-small | Retab | 79.3% |
| 6 | Reducto | Reducto | 63.5% |
| 7 | retab-micro | Retab | 58.2% |
| 8 | Mistral Large 3 | Mistral | 52.7% |
| 9 | Mistral Medium 3.2 | Mistral | 41.8% |
Methodology
Our benchmark is built on a curated dataset of proprietary documents provided by partner customers. These documents span a wide range of industries and use cases, including invoices, contracts, financial reports, and technical documentation.
Each document has been manually annotated by domain experts to establish ground truth for structured data extraction. We evaluate each model by running it through the same extraction tasks with identical prompts, then comparing its output against these human-verified annotations.
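To make the comparison step concrete, here is a minimal sketch of how field-level accuracy could be computed against human-verified annotations. The directory layout, field names, and exact-match normalization are illustrative assumptions, not the benchmark's actual scoring code.

```python
# Illustrative sketch of field-level accuracy scoring.
# File layout and normalization rules are assumptions for illustration only.
import json
from pathlib import Path


def normalize(value) -> str:
    """Normalize a value so trivial formatting differences don't count as errors."""
    return str(value).strip().lower()


def field_accuracy(prediction: dict, ground_truth: dict) -> float:
    """Fraction of annotated fields the model extracted correctly."""
    if not ground_truth:
        return 0.0
    correct = sum(
        1
        for field, expected in ground_truth.items()
        if normalize(prediction.get(field, "")) == normalize(expected)
    )
    return correct / len(ground_truth)


def benchmark_accuracy(predictions_dir: Path, annotations_dir: Path) -> float:
    """Average field-level accuracy over every annotated document."""
    scores = []
    for annotation_path in sorted(annotations_dir.glob("*.json")):
        ground_truth = json.loads(annotation_path.read_text())
        prediction_path = predictions_dir / annotation_path.name
        prediction = (
            json.loads(prediction_path.read_text()) if prediction_path.exists() else {}
        )
        scores.append(field_accuracy(prediction, ground_truth))
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Example: score one model's outputs against the shared annotation set.
    accuracy = benchmark_accuracy(Path("outputs/retab-large"), Path("annotations"))
    print(f"Accuracy: {accuracy:.1%}")
```

In practice, scoring can be stricter or looser than exact string matching (for example, numeric tolerance on amounts or date parsing), but the overall shape of the evaluation is the same: identical inputs for every model, one human-verified answer key.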
Private Benchmark
The dataset is not publicly available to prevent model overfitting and ensure fair evaluation of real-world performance.
Real-World Data
Documents reflect actual production scenarios, providing meaningful accuracy metrics for enterprise extraction tasks.
Run evaluations on frontier AI capabilities
If you'd like to add your model to this leaderboard or a future version, please contact support@retab.com.
Evaluate your Model
