docTR: Document Text Recognition#
State-of-the-art Optical Character Recognition made seamless & accessible to anyone, powered by TensorFlow 2 & PyTorch

docTR provides an easy and powerful way to extract valuable information from your documents:
🧾 for automation: seamlessly process documents for Natural Language Understanding tasks. docTR provides OCR predictors that parse the textual information in your documents by localizing and identifying each word.
👩‍🔬 for research: quickly compare the speed and performance of your own architectures against state-of-the-art models on public datasets.
Main Features#
🤖 Robust 2-stage (detection + recognition) OCR predictors with pretrained parameters
⚡ User-friendly, 3 lines of code to load a document and extract text with a predictor
🚀 State-of-the-art performance on public document datasets, comparable with Google Vision and AWS Textract
⚡ Optimized for inference speed on both CPU & GPU
🐦 Light package, minimal dependencies
🛠️ Actively maintained by Mindee
🏭 Easy integration (available templates for browser demo & API deployment)
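The "3 lines of code" workflow mentioned above can be sketched roughly as follows. This is a minimal illustration, not a definitive recipe: it assumes docTR is installed with one of its backends, and that the exported result is a nested dictionary laid out as pages → blocks → lines → words, with `value` and `confidence` keys on each word. The helper names `run_ocr` and `words_from_export` are hypothetical and not part of docTR.

```python
from typing import Any, Dict, List, Tuple


def run_ocr(pdf_path: str) -> Dict[str, Any]:
    """Run a pretrained docTR predictor on a PDF and return the exported result."""
    # Imported lazily so that words_from_export() stays usable without docTR.
    from doctr.io import DocumentFile
    from doctr.models import ocr_predictor

    model = ocr_predictor(pretrained=True)   # 2-stage: detection + recognition
    doc = DocumentFile.from_pdf(pdf_path)    # load the document pages
    result = model(doc)                      # run the predictor on every page
    return result.export()                   # nested dict of pages, blocks, lines, words


def words_from_export(export: Dict[str, Any]) -> List[Tuple[str, float]]:
    """Flatten an exported result into (text, confidence) pairs.

    Assumes the pages -> blocks -> lines -> words layout described above;
    adjust the keys if your docTR version exports a different structure.
    """
    words: List[Tuple[str, float]] = []
    for page in export.get("pages", []):
        for block in page.get("blocks", []):
            for line in block.get("lines", []):
                for word in line.get("words", []):
                    words.append((word["value"], word["confidence"]))
    return words
```

A typical call would be `words_from_export(run_ocr("path/to/your/doc.pdf"))`, where the path is a placeholder for your own file. Keeping the flattening step separate from the model call makes it possible to post-process saved exports without reloading the model.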
Model zoo#
Text detection models#
Text recognition models#
Supported datasets#
FUNSD from “FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents”.
CORD from “CORD: A Consolidated Receipt Dataset for Post-OCR Parsing”.
SROIE from ICDAR 2019.
IIIT-5k from CVIT.
Street View Text from “End-to-End Scene Text Recognition”.
SynthText from Visual Geometry Group.
SVHN from “Reading Digits in Natural Images with Unsupervised Feature Learning”.
IC03 from ICDAR 2003.
IC13 from ICDAR 2013.
IMGUR5K from “TextStyleBrush: Transfer of Text Aesthetics from a Single Example”.
MJSynth from “Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition”.