V1 Standard Fields

Address Field

class AddressField(raw_prediction, reconstructed=False, page_id=None)

A field containing an address value.

Parameters:
  • raw_prediction (dict[str, Any])

  • reconstructed (bool)

  • page_id (int | None)

address_complement: str | None

Address complement.

bounding_box: Quadrilateral | None

A right rectangle containing the word in the document.

city: str | None

City name.

confidence: float

The confidence score of the prediction.

Constructor parameters of the base field object:
  • raw_prediction – Prediction object from the HTTP response

  • value_key – Key to use in the abstract_prediction dict

  • reconstructed – Whether the object was reconstructed rather than extracted by the API

  • page_id – Page number, for multi-page PDFs

country: str | None

Country name.

page_id: int | None

The document page on which the information was found.

po_box: str | None

PO Box number.

polygon: Polygon

A polygon containing the word in the document.

postal_code: str | None

Postal code.

raw_value: str | None

The value as it appears on the document.

reconstructed: bool

Whether the field was reconstructed from other fields.

state: str | None

State name.

street_name: str | None

Street name.

street_number: str | None

Street number.

value: str | None

Value of the string.
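Because every address component may be None, code that displays an address usually needs to skip the missing parts. The following is a minimal sketch under stated assumptions: `AddressParts` is a hypothetical stand-in that only mirrors the attribute names documented above, and the ordering of components in the output is an illustrative choice, not something the library prescribes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AddressParts:
    """Hypothetical stand-in exposing the same attribute names as AddressField."""

    street_number: Optional[str] = None
    street_name: Optional[str] = None
    address_complement: Optional[str] = None
    po_box: Optional[str] = None
    postal_code: Optional[str] = None
    city: Optional[str] = None
    state: Optional[str] = None
    country: Optional[str] = None


def format_address(addr: AddressParts) -> str:
    """Join the components that are present, skipping None and empty strings."""
    street = " ".join(p for p in (addr.street_number, addr.street_name) if p)
    parts = [street, addr.address_complement, addr.po_box,
             addr.postal_code, addr.city, addr.state, addr.country]
    return ", ".join(p for p in parts if p)


sample = AddressParts(street_number="12", street_name="Main St",
                      city="Springfield", country="US")
formatted = format_address(sample)
empty = format_address(AddressParts())
```

With a real field, the same function could be called on any object that exposes these attributes.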

Amount Field

class AmountField(raw_prediction, reconstructed=False, page_id=None)

A field containing an amount value.

Parameters:
  • raw_prediction (dict[str, Any])

  • reconstructed (bool)

  • page_id (int | None)

bounding_box: Quadrilateral | None

A right rectangle containing the word in the document.

confidence: float

The confidence score of the prediction.

page_id: int | None

The document page on which the information was found.

polygon: Polygon

A polygon containing the word in the document.

reconstructed: bool

Whether the field was reconstructed from other fields.

value: float | None

The amount value as a float.
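Since `value` is typed `float | None`, arithmetic over several amounts has to account for fields that were not found. A small sketch, assuming the values have already been read from the fields; `sum_amounts` is a hypothetical helper and rounding to two decimals is an illustrative choice.

```python
from typing import Iterable, Optional


def sum_amounts(values: Iterable[Optional[float]]) -> float:
    """Add up the amounts that were found, ignoring None entries."""
    total = sum((v for v in values if v is not None), 0.0)
    # Round to two decimals to limit floating-point noise in the total.
    return round(total, 2)


total = sum_amounts([10.10, None, 5.25])
nothing_found = sum_amounts([None, None])
```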

Base Field

class BaseField(raw_prediction, value_key='value', reconstructed=False, page_id=None)

Base class for most fields.

Parameters:
  • raw_prediction (dict[str, Any])

  • value_key (str)

  • reconstructed (bool)

  • page_id (int | None)

confidence: float

The confidence score of the prediction.

Constructor parameters of the base field object:
  • raw_prediction – Prediction object from the HTTP response

  • value_key – Key to use in the abstract_prediction dict

  • reconstructed – Whether the object was reconstructed rather than extracted by the API

  • page_id – Page number, for multi-page PDFs

page_id: int | None

The document page on which the information was found.

reconstructed: bool

Whether the field was reconstructed from other fields.

value: Any | None

Raw field value
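To illustrate how the constructor parameters relate to the attributes, here is a minimal sketch using a stand-in class. It is not the library's implementation: `SketchField` is hypothetical, and the `"value"` and `"confidence"` keys in the sample dict are assumptions about the shape of a raw prediction.

```python
from typing import Any, Optional


class SketchField:
    """Hypothetical stand-in mirroring the documented BaseField attributes."""

    def __init__(
        self,
        raw_prediction: dict[str, Any],
        value_key: str = "value",
        reconstructed: bool = False,
        page_id: Optional[int] = None,
    ) -> None:
        # Read the value from the key named by value_key; None if absent.
        self.value: Optional[Any] = raw_prediction.get(value_key)
        # Assumption: the raw prediction carries a "confidence" entry.
        self.confidence: float = float(raw_prediction.get("confidence") or 0.0)
        self.reconstructed = reconstructed
        self.page_id = page_id


found = SketchField({"value": "ACME", "confidence": 0.92}, page_id=0)
missing = SketchField({}, reconstructed=True)
```

The `value_key` argument is what lets subclasses such as PaymentDetailsField or PositionField read their main value from a different key (`'iban'` and `'polygon'` in the signatures documented on this page).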

Boolean Field

class BooleanField(raw_prediction, value_key='value', reconstructed=False, page_id=None)

A field containing a boolean value.

Parameters:
  • raw_prediction (dict[str, Any])

  • value_key (str)

  • reconstructed (bool)

  • page_id (int | None)

bounding_box: Quadrilateral | None

A right rectangle containing the word in the document.

confidence: float

The confidence score of the prediction.

page_id: int | None

The document page on which the information was found.

polygon: Polygon

A polygon containing the word in the document.

reconstructed: bool

Whether the field was reconstructed from other fields.

value: bool | None

The value as it appears on the document.

Text Field

class StringField(raw_prediction, value_key='value', reconstructed=False, page_id=None)

A field containing a text value.

Parameters:
  • raw_prediction (dict[str, Any])

  • value_key (str)

  • reconstructed (bool)

  • page_id (int | None)

bounding_box: Quadrilateral | None

A right rectangle containing the word in the document.

confidence: float

The confidence score of the prediction.

page_id: int | None

The document page on which the information was found.

polygon: Polygon

A polygon containing the word in the document.

raw_value: str | None

The value as it appears on the document.

reconstructed: bool

Whether the field was reconstructed from other fields.

value: str | None

Value of the string.

Classification Field

class ClassificationField(raw_prediction, value_key='value', reconstructed=False, page_id=None)

Represents a classifier value.

Parameters:
  • raw_prediction (dict[str, Any])

  • value_key (str)

  • reconstructed (bool)

  • page_id (int | None)

confidence: float

The confidence score of the prediction.

page_id: int | None

The document page on which the information was found.

reconstructed: bool

Whether the field was reconstructed from other fields.

value: str

The value as a string.
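A classifier result is often used only when its confidence is high enough. The sketch below assumes `confidence` lies between 0 and 1; `accept_label` and the default threshold of 0.8 are hypothetical and should be tuned to the use case.

```python
from typing import Optional


def accept_label(value: str, confidence: float, threshold: float = 0.8) -> Optional[str]:
    """Return the label when confidence meets the threshold, else None."""
    return value if confidence >= threshold else None


accepted = accept_label("invoice", 0.95)
rejected = accept_label("receipt", 0.40)
```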

Company Registration Field

class CompanyRegistrationField(raw_prediction, value_key='value', reconstructed=False, page_id=None)

A company registration item.

Parameters:
  • raw_prediction (dict[str, Any])

  • value_key (str)

  • reconstructed (bool)

  • page_id (int | None)

print()

Additional print function that doesn’t overwrite __str__().

Return type:

str

printable_values()

Printable representation of the field’s value & type.

to_table_line()

Return a table line for RST display.

bounding_box: Quadrilateral | None

A right rectangle containing the word in the document.

confidence: float

The confidence score of the prediction.

page_id: int | None

The document page on which the information was found.

polygon: Polygon

A polygon containing the word in the document.

reconstructed: bool

Whether the field was reconstructed from other fields.

type: str

The type of registration.

value: Any | None

Raw field value

Date Field

class DateField(raw_prediction, reconstructed=False, page_id=None)

A field containing a date value.

Parameters:
  • raw_prediction (dict[str, Any])

  • reconstructed (bool)

  • page_id (int | None)

bounding_box: Quadrilateral | None

A right rectangle containing the word in the document.

confidence: float

The confidence score of the prediction.

date_object: date | None

Date as a standard Python datetime.date object.

is_computed: bool | None

Whether the field was computed or retrieved directly from the document.

page_id: int | None

The document page on which the information was found.

polygon: Polygon

A polygon containing the word in the document.

reconstructed: bool

Whether the field was reconstructed from other fields.

value: str | None

The raw field value.
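The relationship between the string `value` and `date_object` can be sketched as a parsing step. This is an illustration only, assuming the raw value is an ISO 8601 date string such as `2024-02-29`; `parse_date_value` is a hypothetical helper, not the library's code.

```python
from datetime import date
from typing import Optional


def parse_date_value(value: Optional[str]) -> Optional[date]:
    """Return a datetime.date for an ISO 8601 string, or None if unusable."""
    if not value:
        return None
    try:
        return date.fromisoformat(value)
    except ValueError:
        # The string is not a valid ISO 8601 date.
        return None


parsed = parse_date_value("2024-02-29")
unparsed = parse_date_value("not a date")
absent = parse_date_value(None)
```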

List of Taxes

class Taxes(api_prediction, page_id)

List of tax lines information.

Parameters:
  • api_prediction (list[dict[str, Any]])

  • page_id (int | None)

Tax Line

class TaxField(raw_prediction, value_key='value', reconstructed=False, page_id=None)

Tax line information.

Parameters:
  • raw_prediction (dict[str, Any])

  • value_key (str)

  • reconstructed (bool)

  • page_id (int | None)

to_table_line()

Output in a format suitable for inclusion in an rST table.

Return type:

str

basis: float | None

The tax base.

bounding_box: Quadrilateral | None

A right rectangle containing the word in the document.

code: str | None

The tax code (for example, HST or GST in Canada; City Tax or State Tax in the US).

confidence: float

The confidence score of the prediction.

page_id: int | None

The document page on which the information was found.

polygon: Polygon

A polygon containing the word in the document.

rate: float | None

The tax rate.

reconstructed: bool

Whether the field was reconstructed from other fields.

value: float | None

The amount of the tax line.
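The `basis`, `rate` and `value` attributes can be cross-checked against each other. The sketch below assumes that `rate` is expressed as a percentage (for example, 20.0 for 20%); this is an assumption, as are the helper names and the tolerance of 0.01.

```python
from typing import Optional


def expected_tax(basis: Optional[float], rate: Optional[float]) -> Optional[float]:
    """Compute basis * rate / 100, or None when either input is missing."""
    if basis is None or rate is None:
        return None
    return round(basis * rate / 100, 2)


def is_consistent(
    value: Optional[float],
    basis: Optional[float],
    rate: Optional[float],
    tolerance: float = 0.01,
) -> bool:
    """Check that the tax amount matches basis and rate within a tolerance."""
    expected = expected_tax(basis, rate)
    if value is None or expected is None:
        return False
    return abs(value - expected) <= tolerance


matches = is_consistent(20.0, 100.0, 20.0)
mismatch = is_consistent(25.0, 100.0, 20.0)
```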

Locale Field

class LocaleField(raw_prediction, reconstructed=False, page_id=None)

The locale detected on the document.

Parameters:
  • raw_prediction (dict[str, Any])

  • reconstructed (bool)

  • page_id (int | None)

confidence: float

The confidence score of the prediction.

country: str | None

The ISO 3166-1 alpha-2 code of the country.

currency: str | None

The ISO 4217 code of the currency.

language: str | None

The ISO 639-1 code of the language.

page_id: int | None

The document page on which the information was found.

reconstructed: bool

Whether the field was reconstructed from other fields.

value: Any | None

Raw field value
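The `language` and `country` codes can be combined into a conventional `language-COUNTRY` tag. A minimal sketch; `locale_tag` is a hypothetical helper and the output format is an illustrative choice.

```python
from typing import Optional


def locale_tag(language: Optional[str], country: Optional[str]) -> Optional[str]:
    """Build a 'language-COUNTRY' tag such as 'en-US' from the ISO codes."""
    if not language:
        return None
    if not country:
        # Fall back to the language alone when no country was detected.
        return language.lower()
    return f"{language.lower()}-{country.upper()}"


full_tag = locale_tag("en", "us")
language_only = locale_tag("fr", None)
no_tag = locale_tag(None, "FR")
```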

Payment Details

class PaymentDetailsField(raw_prediction, value_key='iban', account_number_key='account_number', iban_key='iban', routing_number_key='routing_number', swift_key='swift', reconstructed=False, page_id=None)

Information on a single payment.

Parameters:
  • raw_prediction (dict[str, Any])

  • value_key (str)

  • account_number_key (str)

  • iban_key (str)

  • routing_number_key (str)

  • swift_key (str)

  • reconstructed (bool)

  • page_id (int | None)

account_number: str | None

Account number

bounding_box: Quadrilateral | None

A right rectangle containing the word in the document.

confidence: float

The confidence score of the prediction.

iban: str | None

Account IBAN

page_id: int | None

The document page on which the information was found.

polygon: Polygon

A polygon containing the word in the document.

reconstructed: bool

Whether the field was reconstructed from other fields.

routing_number: str | None

Account routing number

swift: str | None

Bank’s SWIFT code

value: Any | None

Raw field value
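Any of the four identifiers may be None, so calling code often needs to pick whichever one is available. The sketch below is illustrative only: `primary_identifier` is a hypothetical helper, and preferring the IBAN first is an assumption that simply follows the default `value_key='iban'` in the constructor signature.

```python
from typing import Optional, Tuple


def primary_identifier(
    iban: Optional[str] = None,
    account_number: Optional[str] = None,
    routing_number: Optional[str] = None,
    swift: Optional[str] = None,
) -> Optional[Tuple[str, str]]:
    """Return (attribute name, value) for the first identifier that is set."""
    candidates = (
        ("iban", iban),
        ("account_number", account_number),
        ("routing_number", routing_number),
        ("swift", swift),
    )
    for name, candidate in candidates:
        if candidate:
            return name, candidate
    return None


picked = primary_identifier(account_number="12345", swift="ABCDUS33")
nothing = primary_identifier()
```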

Position

class PositionField(raw_prediction, value_key='polygon', reconstructed=False, page_id=None)

A field indicating a position or area on the document.

Parameters:
  • raw_prediction (dict[str, Any])

  • value_key (str)

  • reconstructed (bool)

  • page_id (int | None)

bounding_box: Quadrilateral | None

Straight rectangle of cropped area (does not exceed the canvas)

confidence: float

The confidence score of the prediction.

page_id: int | None

The document page on which the information was found.

polygon: Polygon | None

Polygon of cropped area

quadrangle: Quadrilateral | None

Quadrangle of cropped area (does not exceed the canvas)

reconstructed: bool

Whether the field was reconstructed from other fields.

rectangle: Quadrilateral | None

Oriented rectangle of cropped area (may exceed the canvas)

value: Polygon | None

Polygon of cropped area, identical to the polygon property.
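The difference between a polygon and a straight bounding box can be shown with a few lines of geometry. This sketch assumes a polygon is available as a sequence of `(x, y)` pairs; that representation, the `Point` alias and the `straight_bounding_box` helper are all hypothetical and do not describe the library's Polygon or Quadrilateral types.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def straight_bounding_box(polygon: Sequence[Point]) -> Tuple[Point, Point, Point, Point]:
    """Return the axis-aligned rectangle enclosing the polygon.

    Corner order: (x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max).
    """
    if not polygon:
        raise ValueError("polygon must contain at least one point")
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return ((x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max))


box = straight_bounding_box([(0.2, 0.1), (0.8, 0.15), (0.75, 0.6), (0.25, 0.55)])
```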