Standard Fields

Base Field

class BaseField(raw_prediction, value_key='value', reconstructed=False, page_id=None)

Base class for most fields.

Parameters:
  • raw_prediction (Dict[str, Any]) – Prediction object from the HTTP response

  • value_key (str) – Key of the value in the raw_prediction dict

  • reconstructed (bool) – Whether the object is reconstructed (not extracted by the API)

  • page_id (Optional[int]) – Page number for a multi-page PDF

confidence: float

The confidence score of the prediction.

page_id: Optional[int]

The document page on which the information was found.

reconstructed: bool

Whether the field was reconstructed from other fields.

value: Optional[Any]

The raw field value.
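To illustrate how the constructor arguments relate to the attributes, the sketch below uses a hypothetical stand-in class, SketchField, which is not part of the library. It assumes the raw prediction is a dict that carries the value under value_key and a score under a "confidence" key; the real BaseField may handle more cases.

```python
from typing import Any, Dict, Optional


class SketchField:
    """Hypothetical stand-in for illustration; not the library's BaseField."""

    def __init__(
        self,
        raw_prediction: Dict[str, Any],
        value_key: str = "value",
        reconstructed: bool = False,
        page_id: Optional[int] = None,
    ) -> None:
        # Assumption: the value sits under `value_key`; a missing key gives None.
        self.value: Optional[Any] = raw_prediction.get(value_key)
        # Assumption: the score sits under "confidence" and defaults to 0.0.
        self.confidence: float = float(raw_prediction.get("confidence", 0.0))
        self.reconstructed = reconstructed
        self.page_id = page_id


field = SketchField({"value": "ACME Corp", "confidence": 0.92}, page_id=0)
print(field.value, field.confidence, field.page_id)
```

Reading with dict.get keeps value as None when the key is absent, which matches the Optional[Any] type documented for value.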

Text Field

class StringField(raw_prediction, value_key='value', reconstructed=False, page_id=None)

A field containing a text value.

Parameters:
  • raw_prediction (Dict[str, Any]) – Prediction object from the HTTP response

  • value_key (str) – Key of the value in the raw_prediction dict

  • reconstructed (bool) – Whether the object is reconstructed (not extracted by the API)

  • page_id (Optional[int]) – Page number for a multi-page PDF

bounding_box: Optional[Quadrilateral]

A straight rectangle containing the word in the document.

confidence: float

The confidence score of the prediction.

page_id: Optional[int]

The document page on which the information was found.

polygon: Polygon

A polygon containing the word in the document.

raw_value: Optional[str]

The value as it appears on the document.

reconstructed: bool

Whether the field was reconstructed from other fields.

value: Optional[str]

The raw field value.

Classification Field

class ClassificationField(raw_prediction, value_key='value', reconstructed=False, page_id=None)

Represents a classifier value.

Parameters:
  • raw_prediction (Dict[str, Any]) – Prediction object from the HTTP response

  • value_key (str) – Key of the value in the raw_prediction dict

  • reconstructed (bool) – Whether the object is reconstructed (not extracted by the API)

  • page_id (Optional[int]) – Page number for a multi-page PDF

confidence: float

The confidence score of the prediction.

page_id: Optional[int]

The document page on which the information was found.

reconstructed: bool

Whether the field was reconstructed from other fields.

value: str

The value as a string.
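A classifier value is often used together with its confidence. The following is a minimal sketch, assuming the caller has already read value and confidence from a ClassificationField; the function name and the 0.8 threshold are hypothetical choices for illustration, not library defaults.

```python
from typing import Optional


def accept_classification(
    value: str, confidence: float, threshold: float = 0.8
) -> Optional[str]:
    """Return the class label only when the score reaches the threshold."""
    # Hypothetical policy: anything below the threshold is treated as unknown.
    return value if confidence >= threshold else None


print(accept_classification("invoice", 0.95))
print(accept_classification("receipt", 0.40))
```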

Company Registration Field

class CompanyRegistrationField(raw_prediction, value_key='value', reconstructed=False, page_id=None)

A company registration item.

Parameters:
  • raw_prediction (Dict[str, Any]) – Prediction object from the HTTP response

  • value_key (str) – Key of the value in the raw_prediction dict

  • reconstructed (bool) – Whether the object is reconstructed (not extracted by the API)

  • page_id (Optional[int]) – Page number for a multi-page PDF

print()

Additional print function that does not override __str__().

Return type:

str

bounding_box: Optional[Quadrilateral]

A straight rectangle containing the word in the document.

confidence: float

The confidence score of the prediction.

page_id: Optional[int]

The document page on which the information was found.

polygon: Polygon

A polygon containing the word in the document.

reconstructed: bool

Whether the field was reconstructed from other fields.

type: str

The type of registration.

value: Optional[Any]

The raw field value.

Amount Field

class AmountField(raw_prediction, reconstructed=False, page_id=None)

A field containing an amount value.

Parameters:
  • raw_prediction (Dict[str, Any]) – Prediction object from the HTTP response

  • reconstructed (bool) – Whether the object is reconstructed (not extracted by the API)

  • page_id (Optional[int]) – Page number for a multi-page PDF

bounding_box: Optional[Quadrilateral]

A straight rectangle containing the word in the document.

confidence: float

The confidence score of the prediction.

page_id: Optional[int]

The document page on which the information was found.

polygon: Polygon

A polygon containing the word in the document.

reconstructed: bool

Whether the field was reconstructed from other fields.

value: Optional[float]

The amount value as a float.
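Because value is Optional[float], code that aggregates amounts has to cope with missing values. The sketch below assumes the caller has collected the value attributes of several amount fields into a list; total_amount is a hypothetical helper, and rounding to two decimals is an illustrative choice.

```python
from typing import Iterable, Optional


def total_amount(values: Iterable[Optional[float]]) -> float:
    """Sum the amounts that are present, skipping missing (None) values."""
    present = [value for value in values if value is not None]
    # Rounding to 2 decimals is an assumption suited to most currencies.
    return round(sum(present), 2)


print(total_amount([10.5, None, 4.25]))
```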

Date Field

class DateField(raw_prediction, reconstructed=False, page_id=None)

A field containing a date value.

Parameters:
  • raw_prediction (Dict[str, Any]) – Prediction object from the HTTP response

  • reconstructed (bool) – Whether the object is reconstructed (not extracted by the API)

  • page_id (Optional[int]) – Page number for a multi-page PDF

bounding_box: Optional[Quadrilateral]

A straight rectangle containing the word in the document.

confidence: float

The confidence score of the prediction.

date_object: Optional[date]

Date as a standard Python datetime.date object.

page_id: Optional[int]

The document page on which the information was found.

polygon: Polygon

A polygon containing the word in the document.

reconstructed: bool

Whether the field was reconstructed from other fields.

value: Optional[str]

The raw field value.
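The relationship between value (a string) and date_object (a datetime.date) can be sketched as follows. This assumes the raw value is an ISO 8601 date such as "2023-04-15"; to_date_object is a hypothetical helper, and the library's own parsing may differ.

```python
from datetime import date
from typing import Optional


def to_date_object(value: Optional[str]) -> Optional[date]:
    """Convert a raw date string to a date, or None when it cannot be parsed."""
    if value is None:
        return None
    try:
        # Assumption: the raw value uses the ISO 8601 form YYYY-MM-DD.
        return date.fromisoformat(value)
    except ValueError:
        return None


parsed = to_date_object("2023-04-15")
print(parsed)
```

Returning None on a parse failure mirrors the Optional[date] type of date_object, so callers only need a single None check.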

List of Taxes

class Taxes(api_prediction, page_id)

List of tax lines information.

Parameters:
  • api_prediction (List[Dict[str, Any]]) –

  • page_id (Optional[int]) –

Tax Line

class TaxField(raw_prediction, value_key='value', reconstructed=False, page_id=None)

Tax line information.

Parameters:
  • raw_prediction (Dict[str, Any]) – Prediction object from the HTTP response

  • value_key (str) – Key of the value in the raw_prediction dict

  • reconstructed (bool) – Whether the object is reconstructed (not extracted by the API)

  • page_id (Optional[int]) – Page number for a multi-page PDF

to_table_line()

Output in a format suitable for inclusion in an rST table.

Return type:

str

basis: Optional[float]

The tax base.

bounding_box: Optional[Quadrilateral]

A straight rectangle containing the word in the document.

code: Optional[str]

The tax code (for example HST or GST in Canada; City Tax or State Tax in the US).

confidence: float

The confidence score of the prediction.

page_id: Optional[int]

The document page on which the information was found.

polygon: Polygon

A polygon containing the word in the document.

rate: Optional[float]

The tax rate.

reconstructed: bool

Whether the field was reconstructed from other fields.

value: Optional[float]

The amount of the tax line.
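The basis, rate and value attributes can be cross-checked against each other. The sketch below assumes rate is expressed in percent (for example 20.0 for 20 %), which this reference does not state; tax_line_is_consistent and its tolerance are hypothetical.

```python
import math
from typing import Optional


def tax_line_is_consistent(
    basis: Optional[float],
    rate: Optional[float],
    value: Optional[float],
    tolerance: float = 0.01,
) -> bool:
    """Check that value is close to basis * rate / 100."""
    if basis is None or rate is None or value is None:
        # A missing component cannot be verified.
        return False
    # Assumption: `rate` is a percentage, not a fraction.
    expected = basis * rate / 100.0
    return math.isclose(expected, value, abs_tol=tolerance)


print(tax_line_is_consistent(100.0, 20.0, 20.0))
```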

Locale Field

class LocaleField(raw_prediction, reconstructed=False, page_id=None)

The locale detected on the document.

Parameters:
  • raw_prediction (Dict[str, Any]) – Prediction object from the HTTP response

  • reconstructed (bool) – Whether the object is reconstructed (not extracted by the API)

  • page_id (Optional[int]) – Page number for a multi-page PDF

confidence: float

The confidence score of the prediction.

country: Optional[str]

The ISO 3166-1 alpha-2 code of the country.

currency: Optional[str]

The ISO 4217 code of the currency.

language: Optional[str]

The ISO 639-1 code of the language.

page_id: Optional[int]

The document page on which the information was found.

reconstructed: bool

Whether the field was reconstructed from other fields.

value: Optional[Any]

The raw field value.
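Since language, country and currency are all optional, display code has to handle any combination of missing parts. A minimal sketch, assuming the three attributes have been read from a LocaleField; format_locale and its output format are hypothetical.

```python
from typing import Optional


def format_locale(
    language: Optional[str], country: Optional[str], currency: Optional[str]
) -> str:
    """Build a short label such as 'en-US; USD' from the available parts."""
    # Join language and country into a tag, skipping whichever is missing.
    tag = "-".join(part for part in (language, country) if part)
    pieces = [piece for piece in (tag, currency) if piece]
    return "; ".join(pieces)


print(format_locale("en", "US", "USD"))
```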

Payment Details

class PaymentDetailsField(raw_prediction, value_key='iban', account_number_key='account_number', iban_key='iban', routing_number_key='routing_number', swift_key='swift', reconstructed=False, page_id=None)

Information on a single payment.

Parameters:
  • raw_prediction (Dict[str, Any]) – Prediction object from the HTTP response

  • value_key (str) – Key of the value in the raw_prediction dict

  • account_number_key (str) – Key of the account number in the raw_prediction dict

  • iban_key (str) – Key of the IBAN in the raw_prediction dict

  • routing_number_key (str) – Key of the routing number in the raw_prediction dict

  • swift_key (str) – Key of the SWIFT code in the raw_prediction dict

  • reconstructed (bool) – Whether the object is reconstructed (not extracted by the API)

  • page_id (Optional[int]) – Page number for a multi-page PDF

account_number: Optional[str]

Account number

bounding_box: Optional[Quadrilateral]

A straight rectangle containing the word in the document.

confidence: float

The confidence score of the prediction.

iban: Optional[str]

Account IBAN

page_id: Optional[int]

The document page on which the information was found.

polygon: Polygon

A polygon containing the word in the document.

reconstructed: bool

Whether the field was reconstructed from other fields.

routing_number: Optional[str]

Account routing number

swift: Optional[str]

Bank’s SWIFT code

value: Optional[Any]

The raw field value.
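The *_key constructor arguments suggest that each attribute is looked up in the raw prediction under a configurable key. The sketch below illustrates that idea with a hypothetical function, read_payment_details; treating empty or non-string entries as missing is an assumption, not documented behaviour.

```python
from typing import Any, Dict, Optional


def read_payment_details(
    raw_prediction: Dict[str, Any],
    account_number_key: str = "account_number",
    iban_key: str = "iban",
    routing_number_key: str = "routing_number",
    swift_key: str = "swift",
) -> Dict[str, Optional[str]]:
    """Map configurable keys of a raw prediction to fixed attribute names."""

    def clean(key: str) -> Optional[str]:
        item = raw_prediction.get(key)
        # Assumption: empty or non-string entries count as missing.
        return item if isinstance(item, str) and item else None

    return {
        "account_number": clean(account_number_key),
        "iban": clean(iban_key),
        "routing_number": clean(routing_number_key),
        "swift": clean(swift_key),
    }


details = read_payment_details({"iban": "FR7612345678", "swift": ""})
print(details)
```

Passing a different key name, for example swift_key="bic", lets the same code read payloads that label the field differently.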

Position

class PositionField(raw_prediction, value_key='polygon', reconstructed=False, page_id=None)

A field indicating a position or area on the document.

Parameters:
  • raw_prediction (Dict[str, Any]) – Prediction object from the HTTP response

  • value_key (str) – Key of the value in the raw_prediction dict

  • reconstructed (bool) – Whether the object is reconstructed (not extracted by the API)

  • page_id (Optional[int]) – Page number for a multi-page PDF

bounding_box: Optional[Quadrilateral]

Straight rectangle of cropped area (does not exceed the canvas)

confidence: float

The confidence score of the prediction.

page_id: Optional[int]

The document page on which the information was found.

polygon: Optional[Polygon]

Polygon of cropped area

quadrangle: Optional[Quadrilateral]

Quadrangle of cropped area (does not exceed the canvas)

reconstructed: bool

Whether the field was reconstructed from other fields.

rectangle: Optional[Quadrilateral]

Oriented rectangle of cropped area (may exceed the canvas)

value: Optional[Polygon]

Polygon of cropped area, identical to the polygon property.
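The difference between a polygon and a straight bounding box can be made concrete with a small computation. This is a sketch under stated assumptions: points are plain (x, y) tuples rather than the library's Polygon and Quadrilateral types, and the corner order (starting top-left, then top-right, bottom-right, bottom-left) is chosen for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def straight_bounding_box(polygon: List[Point]) -> List[Point]:
    """Return the four corners of the axis-aligned box enclosing a polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Assumed corner order: top-left, top-right, bottom-right, bottom-left.
    return [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]


tilted = [(0.2, 0.1), (0.6, 0.15), (0.55, 0.4), (0.15, 0.35)]
box = straight_bounding_box(tilted)
print(box)
```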