********************************
docTR: Document Text Recognition
********************************

State-of-the-art Optical Character Recognition made seamless & accessible to anyone, powered by TensorFlow 2 & PyTorch

.. image:: https://github.com/mindee/doctr/releases/download/v0.2.0/ocr.png
   :align: center


docTR provides an easy and powerful way to extract valuable information from your documents:

* |:receipt:| **for automation**: seamlessly process documents for Natural Language Understanding tasks: we provide OCR predictors to parse textual information (localize and identify each word) from your documents.
* |:woman_scientist:| **for research**: quickly compare your own architectures' speed & performance with state-of-the-art models on public datasets.


Main Features
-------------

* |:robot:| Robust 2-stage (detection + recognition) OCR predictors with pretrained parameters
* |:zap:| User-friendly, 3 lines of code to load a document and extract text with a predictor
* |:rocket:| State-of-the-art performance on public document datasets, comparable with GoogleVision/AWS Textract
* |:zap:| Optimized for inference speed on both CPU & GPU
* |:bird:| Lightweight package with minimal dependencies
* |:tools:| Actively maintained by Mindee
* |:factory:| Easy integration (available templates for browser demo & API deployment)


.. toctree::
   :maxdepth: 2
   :caption: Getting started
   :hidden:

   getting_started/installing
   notebooks


Model zoo
^^^^^^^^^

Text detection models
"""""""""""""""""""""

* DBNet from `"Real-time Scene Text Detection with Differentiable Binarization" `_
* LinkNet from `"LinkNet: Exploiting Encoder Representations for Efficient Semantic Segmentation" `_
* FAST from `"FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation" `_

Text recognition models
"""""""""""""""""""""""

* SAR from `"Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition" `_
* CRNN from `"An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition" `_
* MASTER from `"MASTER: Multi-Aspect Non-local Network for Scene Text Recognition" `_
* ViTSTR from `"Vision Transformer for Fast and Efficient Scene Text Recognition" `_
* PARSeq from `"Scene Text Recognition with Permuted Autoregressive Sequence Models" `_


Supported datasets
^^^^^^^^^^^^^^^^^^

* FUNSD from `"FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents" `_.
* CORD from `"CORD: A Consolidated Receipt Dataset for Post-OCR Parsing" `_.
* SROIE from `ICDAR 2019 `_.
* IIIT-5k from `CVIT `_.
* Street View Text from `"End-to-End Scene Text Recognition" `_.
* SynthText from `Visual Geometry Group `_.
* SVHN from `"Reading Digits in Natural Images with Unsupervised Feature Learning" `_.
* IC03 from `ICDAR 2003 `_.
* IC13 from `ICDAR 2013 `_.
* IMGUR5K from `"TextStyleBrush: Transfer of Text Aesthetics from a Single Example" `_.
* MJSynth from `"Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition" `_.
* IIITHWS from `"Generating Synthetic Data for Text Recognition" `_.
* WILDRECEIPT from `"Spatial Dual-Modality Graph Reasoning for Key Information Extraction" `_.


.. toctree::
   :maxdepth: 2
   :caption: Using docTR
   :hidden:

   using_doctr/using_models
   using_doctr/using_datasets
   using_doctr/sharing_models
   using_doctr/using_model_export
   using_doctr/custom_models_training
   using_doctr/running_on_aws


.. toctree::
   :maxdepth: 2
   :caption: Package Reference
   :hidden:

   modules/datasets
   modules/io
   modules/models
   modules/transforms
   modules/utils


.. toctree::
   :maxdepth: 2
   :caption: Contributing
   :hidden:

   contributing/code_of_conduct
   contributing/contributing


.. toctree::
   :maxdepth: 2
   :caption: Notes
   :hidden:

   changelog