Common Fields

Documents

Document

class Document(inference_type, raw_response)

Base class for all predictions.

extras: Optional[Extras]

Optional Extras fields returned alongside the prediction

filename: str

Name of the input document

id: str

ID of the document as returned by the server

inference: Inference[TypeVar(TypePrediction, bound= Prediction), TypeVar(TypePage, bound= Page)]

Result of the base inference

n_pages: int

Number of pages in the document

ocr: Optional[Ocr]

Raw text results read by the OCR, when present (limited feature)
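As a minimal sketch of reading these attributes: `SketchDocument` below is a hypothetical stand-in that only mirrors the attribute names listed above, so the snippet runs without a server response. It is not the library class, and the `describe` helper is likewise illustrative.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SketchDocument:
    """Hypothetical stand-in mirroring the documented Document attributes."""

    id: str
    filename: str
    n_pages: int
    inference: Any = None
    extras: Optional[Any] = None
    ocr: Optional[Any] = None


def describe(doc: SketchDocument) -> str:
    """Summarize a document, guarding the two Optional fields."""
    parts = [f"{doc.filename} ({doc.id}): {doc.n_pages} page(s)"]
    # extras and ocr are Optional, so test for None before using them.
    parts.append("ocr: yes" if doc.ocr is not None else "ocr: no")
    parts.append("extras: yes" if doc.extras is not None else "extras: no")
    return ", ".join(parts)


doc = SketchDocument(id="abc123", filename="invoice.pdf", n_pages=2)
summary = describe(doc)
```

The point of the sketch is the `None` checks: `extras` and `ocr` are typed `Optional`, so code that reads them should not assume they are set.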

Page

class Page(prediction_type, raw_prediction)

Base Page object for predictions.

id: int

ID of the current page.

orientation: Optional[OrientationField]

Orientation of the page

prediction: TypeVar(TypePrediction, bound= Prediction)

Prediction for the page, of the type bound to TypePrediction.

Page Fields

Orientation

class OrientationField(raw_prediction, value_key='value', reconstructed=False, page_id=None)

The clockwise rotation to apply (in degrees) to make the image upright.

Parameters:
  • raw_prediction (Dict[str, Any]) –

  • value_key (str) –

  • reconstructed (bool) –

  • page_id (Optional[int]) –

value: int

Degrees as an integer.
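To illustrate what a clockwise rotation in degrees implies for the page, here is a small sketch that computes the upright image size. The helper name `upright_size` is hypothetical, and the restriction to multiples of 90 degrees is an assumption of this sketch rather than something stated above.

```python
from typing import Tuple


def upright_size(width: int, height: int, degrees: int) -> Tuple[int, int]:
    """Size of the image after applying a clockwise rotation of `degrees`."""
    degrees %= 360
    if degrees % 90 != 0:
        # Assumption of this sketch: only quarter turns are handled.
        raise ValueError("this sketch only handles multiples of 90 degrees")
    # Quarter turns (90, 270) swap width and height; 0 and 180 keep them.
    return (height, width) if degrees in (90, 270) else (width, height)


landscape = upright_size(800, 600, 90)
unchanged = upright_size(800, 600, 180)
```

When applying the rotation with an imaging library, check the direction convention first: Pillow's `Image.rotate`, for example, takes counter-clockwise degrees, so a clockwise value would be negated there.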

API

ApiRequest

class ApiRequest(json_response)

Information on the API request made to the server.

Parameters:

json_response (dict) –

status_code: int

HTTP status code.

ApiResponse

class ApiResponse(raw_response)

Base class for responses sent by the server.

Serves as a base class for responses to both synchronous and asynchronous calls.

Parameters:

raw_response (Dict[str, Any]) –

api_request: ApiRequest

Results of the request sent to the API.

property raw_http: str

Returns the raw response as a JSON string.
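Since raw_http is a string, it has to be parsed before individual fields can be read. The sketch below uses a hypothetical `SketchResponse` stand-in, and the payload shape (`api_request` containing `status_code`) is an illustrative assumption, not a documented server format.

```python
import json
from typing import Any, Dict


class SketchResponse:
    """Hypothetical stand-in: keeps the raw payload and exposes it as JSON."""

    def __init__(self, raw_response: Dict[str, Any]) -> None:
        self._raw = raw_response

    @property
    def raw_http(self) -> str:
        # Serialize the stored dictionary to a JSON string.
        return json.dumps(self._raw)


# Illustrative payload only; real responses carry more fields.
response = SketchResponse({"api_request": {"status_code": 200}})
payload = json.loads(response.raw_http)
status_code = payload["api_request"]["status_code"]
```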

Prediction

class Prediction(raw_prediction, page_id=None)

Base Prediction class.

Parameters:
  • raw_prediction (Dict[str, Any]) –

  • page_id (Optional[int]) –

Asynchronous Parsing

AsyncPredictResponse

class AsyncPredictResponse(inference_type, raw_response)

Async Response Wrapper class for a Predict response.

Links a Job to a future PredictResponse.

job: Job

Job object linked to the prediction. Until the job is complete, the prediction does not exist.

Job

class Job(json_response)

Job class for asynchronous requests.

Holds information on the queue a document has been submitted to.

Parameters:

json_response (dict) –

available_at: Optional[datetime]

Timestamp at which the request was completed.

error: Optional[Dict[str, Any]]

Information about an error that occurred during the job processing.

id: str

ID of the job sent by the API in response to an enqueue request.

issued_at: datetime

Timestamp at which the API received the request.

millisecs_taken: int

Time (ms) taken for the request to be processed by the API.

status: str

Status of the request, as seen by the API.
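The relationship between the two timestamps and the processing time can be sketched as follows. `SketchJob` is a hypothetical stand-in, the status strings `"waiting"` and `"completed"` are assumed values used for illustration, and deriving the duration from `available_at - issued_at` is an assumption about how such a figure could be computed, not a statement of what the library does.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Any, Dict, Optional


@dataclass
class SketchJob:
    """Hypothetical stand-in mirroring the documented Job attributes."""

    id: str
    issued_at: datetime
    status: str
    available_at: Optional[datetime] = None
    error: Optional[Dict[str, Any]] = None


def elapsed_ms(job: SketchJob) -> Optional[int]:
    """Milliseconds between reception and completion, or None if unfinished."""
    if job.available_at is None:
        return None
    return int((job.available_at - job.issued_at).total_seconds() * 1000)


start = datetime(2024, 1, 1, 12, 0, 0)
# "waiting" and "completed" are assumed status values for this sketch.
pending = SketchJob(id="job-1", issued_at=start, status="waiting")
done = SketchJob(
    id="job-1",
    issued_at=start,
    status="completed",
    available_at=start + timedelta(milliseconds=1500),
)
```

Because `available_at` is `Optional`, the helper returns `None` for a job that has not finished instead of raising on the subtraction.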

Miscellaneous Parsing

FeedbackResponse

class FeedbackResponse(server_response)

Wrapper for feedback response.

Parameters:

server_response (Dict[str, Any]) –

OCR Extraction

OCR

class Ocr(raw_prediction)

OCR extraction from the entire document.

Parameters:

raw_prediction (Dict[str, Any]) –

mvision_v1: MVisionV1

Mindee Vision v1 results.

MVisionV1

class MVisionV1(raw_prediction)

Mindee Vision V1.

Parameters:

raw_prediction (Dict[str, Any]) –

pages: List[OcrPage]

List of pages.

OcrPage

class OcrPage(raw_prediction)

OCR extraction for a single page.

Parameters:

raw_prediction (Dict[str, Any]) –

property all_lines: List[OcrLine]

All the words on the page, ordered in lines.

property all_words: List[OcrWord]

All the words on the page, in semi-random order.

OcrLine

class OcrLine(iterable=(), /)

A list of words which are on the same line.

sort_on_x()

Sort the words on the line from left to right.

Return type:

None
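The signature `OcrLine(iterable=(), /)` suggests a list subclass, and `sort_on_x()` returns `None`, which points to an in-place sort. A minimal sketch under those assumptions follows; `SketchWord` and `SketchLine` are hypothetical, and the sort key (the leftmost x coordinate of each word's polygon) is a guess at a reasonable key, not the library's confirmed behavior.

```python
from typing import List, Tuple

Point = Tuple[float, float]


class SketchWord:
    """Hypothetical word holding text and a polygon of (x, y) points."""

    def __init__(self, text: str, polygon: List[Point]) -> None:
        self.text = text
        self.polygon = polygon


class SketchLine(list):
    """Hypothetical list of words that can be ordered left to right."""

    def sort_on_x(self) -> None:
        # Sort in place by the leftmost x coordinate of each word's polygon.
        self.sort(key=lambda word: min(x for x, _ in word.polygon))


line = SketchLine([
    SketchWord("world", [(0.5, 0.1), (0.7, 0.1), (0.7, 0.2), (0.5, 0.2)]),
    SketchWord("hello", [(0.1, 0.1), (0.4, 0.1), (0.4, 0.2), (0.1, 0.2)]),
])
line.sort_on_x()
texts = [word.text for word in line]
```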

OcrWord

class OcrWord(raw_prediction)

A single word.

Parameters:

raw_prediction (Dict[str, Any]) –

bounding_box: Optional[Quadrilateral]

A right rectangle containing the word in the document.

confidence: float

The confidence score.

polygon: Polygon

A polygon containing the word in the document.

text: str

The extracted text.
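One common use of the confidence and text attributes is to discard low-confidence words before joining the text. The sketch below assumes a hypothetical `ScoredWord` stand-in, a hypothetical helper `confident_text`, and an arbitrary threshold of 0.8 chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ScoredWord:
    """Hypothetical stand-in with the documented text and confidence fields."""

    text: str
    confidence: float


def confident_text(words: Iterable[ScoredWord], threshold: float = 0.8) -> str:
    """Join the text of words whose confidence meets the threshold."""
    kept: List[str] = [w.text for w in words if w.confidence >= threshold]
    return " ".join(kept)


words = [
    ScoredWord("Total", 0.99),
    ScoredWord("l2.5O", 0.41),  # likely misread, dropped by the threshold
    ScoredWord("EUR", 0.93),
]
result = confident_text(words)
```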