V1 Common Fields

Documents

Document

class Document(inference_type, raw_response)

Base class for all predictions.

Parameters:
  • inference_type (type[Inference])

  • raw_response (dict[str, Any])

extras: Extras | None = None

Extras fields that may be sent back along with the prediction

filename: str

Name of the input document

id: str

ID of the document, as sent back by the server

inference: Inference[TypePrediction, TypePage]

Result of the base inference

n_pages: int

Number of pages in the document

ocr: OCR | None = None

Raw text results read by the OCR, if available (limited feature)
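Because extras and ocr default to None, calling code should guard against their absence before reading them. The following is a minimal sketch, not the library's own code: DocumentStub and summarize are hypothetical names that only mirror the attributes listed above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class DocumentStub:
    """Hypothetical stand-in exposing the Document attributes listed above."""

    id: str
    filename: str
    n_pages: int
    ocr: Optional[Any] = None
    extras: Optional[Any] = None


def summarize(document: DocumentStub) -> str:
    """Describe a document, checking the optional attributes before use."""
    parts = [f"{document.filename} (id={document.id}, pages={document.n_pages})"]
    # ocr and extras are only populated when the server sends them back.
    parts.append("ocr: yes" if document.ocr is not None else "ocr: no")
    parts.append("extras: yes" if document.extras is not None else "extras: no")
    return ", ".join(parts)


summary = summarize(DocumentStub(id="abc", filename="invoice.pdf", n_pages=2))
```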

Page

class Page(prediction_type, raw_prediction)

Base Page object for predictions.

Parameters:
  • prediction_type (type[TypePrediction])

  • raw_prediction (dict[str, Any])

id: int

ID of the current page.

orientation: OrientationField | None = None

Orientation of the page

prediction: TypePrediction

Prediction for the page.

Page Fields

Orientation

class OrientationField(raw_prediction, value_key='value', reconstructed=False, page_id=None)

The clockwise rotation to apply (in degrees) to make the image upright.

Parameters:
  • raw_prediction (dict[str, Any])

  • value_key (str)

  • reconstructed (bool)

  • page_id (int | None)

value: int

Degrees as an integer.
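To illustrate what applying a clockwise rotation means, the sketch below rotates a small grid of pixel values. It assumes the value is a multiple of 90, which the reference above does not state; rotate_clockwise is a hypothetical helper, not part of the library.

```python
def rotate_clockwise(grid: list[list[int]], degrees: int) -> list[list[int]]:
    """Return a copy of a 2D grid rotated clockwise by a multiple of 90 degrees."""
    if degrees % 90 != 0:
        raise ValueError("this sketch only handles multiples of 90 degrees")
    turns = (degrees // 90) % 4
    rotated = [list(row) for row in grid]  # work on a copy
    for _ in range(turns):
        # Reversing the rows and then transposing is one 90-degree clockwise turn.
        rotated = [list(row) for row in zip(*rotated[::-1])]
    return rotated


page_pixels = [[1, 2], [3, 4]]
upright = rotate_clockwise(page_pixels, 90)
```

With an image library, check the direction convention before passing the value along: some libraries rotate counter-clockwise for positive angles.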

API

ApiRequest

class ApiRequest(raw_response)

Information on the API request made to the server.

Parameters:

raw_response (dict[str, Any])

status_code: int

HTTP status code.

ApiResponse

class ApiResponse(raw_response)

Base class for responses sent by the server.

Serves as a base class for responses to both synchronous and asynchronous calls.

Parameters:

raw_response (dict[str, Any])

api_request: ApiRequest

Results of the request sent to the API.

property raw_http: str

Returns the raw response as a JSON string.
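The relationship between the raw response dict, api_request, and raw_http can be sketched with stand-in classes. This is an illustration under assumptions, not the library's implementation: the stub class names and the payload layout (an "api_request" key holding "status_code") are assumed here.

```python
import json
from typing import Any


class ApiRequestStub:
    """Hypothetical stand-in for ApiRequest."""

    def __init__(self, raw_response: dict[str, Any]) -> None:
        self.status_code: int = raw_response["status_code"]


class ApiResponseStub:
    """Hypothetical stand-in for ApiResponse."""

    def __init__(self, raw_response: dict[str, Any]) -> None:
        self._raw = raw_response
        # Assumed layout: request details live under the "api_request" key.
        self.api_request = ApiRequestStub(raw_response["api_request"])

    @property
    def raw_http(self) -> str:
        """Serialize the untouched server payload as a JSON string."""
        return json.dumps(self._raw, indent=2)


payload = {"api_request": {"status_code": 201}}
response = ApiResponseStub(payload)
```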

Prediction

class Prediction(raw_prediction, page_id=None)

Base Prediction class.

Parameters:
  • raw_prediction (dict[str, Any])

  • page_id (int | None)

Asynchronous Parsing

AsyncPredictResponse

class AsyncPredictResponse(inference_type, raw_response)

Async Response Wrapper class for a Predict response.

Links a Job to a future PredictResponse.

Parameters:
  • inference_type (type[TypeInference])

  • raw_response (dict[str, Any])

job: Job

Job object linked to the prediction. Until the job is complete, the prediction does not exist.

Job

class Job(json_response)

Job class for asynchronous requests.

Will hold information on the queue a document has been submitted to.

Parameters:

json_response (dict)

available_at: datetime | None = None

Timestamp at which the request was completed.

error: dict[str, Any] | None = None

Information about an error that occurred during the job processing.

id: str

ID of the job sent by the API in response to an enqueue request.

issued_at: datetime

Timestamp of the request reception by the API.

millisecs_taken: int

Time (ms) taken for the request to be processed by the API.

status: str

Status of the request, as seen by the API.
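Since the prediction only exists once the job is complete, a caller typically polls the job until it finishes. The sketch below shows one way to do that with a stand-in class. JobStub, wait_for_job, and the fetch_job callable are hypothetical, and the "completed" status string is an assumption, not something the reference above specifies.

```python
import time
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Callable, Optional


@dataclass
class JobStub:
    """Hypothetical stand-in mirroring the Job attributes listed above."""

    id: str
    issued_at: datetime
    status: str
    available_at: Optional[datetime] = None
    error: Optional[dict[str, Any]] = None
    millisecs_taken: int = 0


def wait_for_job(
    fetch_job: Callable[[], JobStub],
    delay_sec: float = 0.0,
    max_retries: int = 10,
) -> JobStub:
    """Poll fetch_job until the job completes, fails, or retries run out."""
    for _ in range(max_retries):
        job = fetch_job()
        if job.error:
            raise RuntimeError(f"job {job.id} failed: {job.error}")
        if job.status == "completed":  # assumed status value
            return job
        time.sleep(delay_sec)
    raise TimeoutError(f"job not completed after {max_retries} attempts")


# Usage with canned states standing in for successive server replies.
now = datetime(2024, 1, 1, 12, 0, 0)
states = iter(
    [
        JobStub("j1", now, "waiting"),
        JobStub("j1", now, "processing"),
        JobStub("j1", now, "completed", available_at=now),
    ]
)
finished = wait_for_job(lambda: next(states))
```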

Miscellaneous Parsing

PredictResponse

class PredictResponse(inference_type, raw_response)

Response of a prediction request.

This is a generic class, so certain class properties depend on the document type.

Parameters:
  • inference_type (type[TypeInference])

  • raw_response (dict[str, Any])

document: Document

The document object, properly parsed after being retrieved from the server.

Product

class Product(raw_prediction)

Class for keeping track of a product’s info.

Parameters:

raw_prediction (dict[str, Any])

FeedbackResponse

class FeedbackResponse(server_response)

Wrapper for feedback response.

Parameters:

server_response (dict[str, Any])

Workflow Parsing

Execution

class Execution(inference_type, json_response)

Workflow execution class.

Parameters:
  • inference_type (type[Inference])

  • json_response (dict[str, Any])

static parse_date(date_string)

Shorthand to parse the date, if present.

Return type:

datetime | None

Parameters:

date_string (str | None)
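A None-tolerant date helper of this kind can be sketched as follows. This is an assumption-laden illustration rather than the library's implementation: it presumes the server sends ISO 8601 strings that datetime.fromisoformat accepts.

```python
from datetime import datetime
from typing import Optional


def parse_date(date_string: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 string, or return None when no date is present."""
    if not date_string:
        return None
    return datetime.fromisoformat(date_string)


uploaded = parse_date("2024-05-17T09:30:00")
missing = parse_date(None)
```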

available_at: datetime | None

The time at which the file was uploaded to a workflow.

batch_name: str

Identifier for the batch to which the execution belongs.

created_at: datetime | None = None

The time at which the execution started.

file: ExecutionFile

File representation within a workflow execution.

id: str

Identifier for the execution.

inference: Inference[TypePrediction, Page[TypePrediction]] | None

Deserialized inference object.

priority: ExecutionPriority | None = None

Priority of the execution.

reviewed_at: datetime | None

The time at which the file was tagged as reviewed.

reviewed_prediction: GeneratedV1Document | None = None

Reviewed fields and values.

status: str

Execution status.

type: str | None

Execution type.

uploaded_at: datetime | None = None

The time at which the file was uploaded to a workflow.

workflow_id: str

Identifier for the workflow.

Execution File

class ExecutionFile(raw_response)

Execution File class.

Parameters:

raw_response (dict[str, Any])

alias: str | None

File alias.

name: str | None

File name.

WorkflowResponse

class WorkflowResponse(inference_type, raw_response)

Base wrapper for API requests.

Parameters:
  • inference_type (type[Inference])

  • raw_response (dict[str, Any])

execution: Execution

The workflow execution parsed from the response. Its inference is deserialized according to the inference_type parameter.

OCR Extraction

OCR

class OCR(raw_prediction)

OCR extraction from the entire document.

Parameters:

raw_prediction (dict[str, Any])

mvision_v1: MVisionV1

Mindee Vision v1 results.

MVisionV1

class MVisionV1(raw_prediction)

Mindee Vision V1.

Parameters:

raw_prediction (dict[str, Any])

pages: list[OCRPage]

List of pages.

OCRPage

class OCRPage(raw_prediction)

OCR extraction for a single page.

Parameters:

raw_prediction (dict[str, Any])

property all_lines: list[OCRLine]

All the words on the page, ordered in lines.

property all_words: list[OCRWord]

All the words on the page, in semi-random order.

OCRLine

class OCRLine(iterable=(), /)

A list of words which are on the same line.

sort_on_x()

Sort the words on the line from left to right.

Return type:

None
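Because OCRLine is a list of words, sorting it in place mirrors list.sort, which is why sort_on_x returns None. The sketch below uses stand-in classes to show the idea. WordStub and LineStub are hypothetical, and sorting on the smallest x coordinate of each word's polygon is an assumption about the ordering key.

```python
from dataclasses import dataclass


@dataclass
class WordStub:
    """Hypothetical stand-in for a word with its text and polygon."""

    text: str
    polygon: list[tuple[float, float]]


class LineStub(list):
    """Hypothetical stand-in for a line: a list of words."""

    def sort_on_x(self) -> None:
        # Sort in place on the leftmost x coordinate of each word's polygon.
        self.sort(key=lambda word: min(x for x, _ in word.polygon))


line = LineStub(
    [
        WordStub("world", [(0.5, 0.1), (0.7, 0.1), (0.7, 0.2), (0.5, 0.2)]),
        WordStub("hello", [(0.1, 0.1), (0.4, 0.1), (0.4, 0.2), (0.1, 0.2)]),
    ]
)
line.sort_on_x()
text = " ".join(word.text for word in line)
```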

OCRWord

class OCRWord(raw_prediction)

A single word.

Parameters:

raw_prediction (dict[str, Any])

bounding_box: Quadrilateral | None

A right rectangle containing the word in the document.

confidence: float

The confidence score.

polygon: Polygon

A polygon containing the word in the document.

text: str

The extracted text.
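The difference between polygon and bounding_box can be illustrated by deriving an upright rectangle from a polygon's extreme coordinates. This is a sketch assuming the bounding box is the smallest axis-aligned rectangle around the polygon; the function below is hypothetical and the coordinates are made up for the example.

```python
Point = tuple[float, float]


def bounding_box(polygon: list[Point]) -> tuple[Point, Point, Point, Point]:
    """Return the corners of the smallest upright rectangle around a polygon.

    Corners are ordered top-left, top-right, bottom-right, bottom-left,
    with y growing downward as is usual for image coordinates.
    """
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return ((x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max))


# A slightly skewed word outline.
word_polygon = [(0.1, 0.2), (0.4, 0.25), (0.38, 0.5), (0.12, 0.45)]
box = bounding_box(word_polygon)
```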

Extras

Extras

class Extras(raw_prediction)

Extras collection wrapper class.

Roughly equivalent to a dict of Extras, with some added utility.

Parameters:

raw_prediction (dict[str, Any])

add_artificial_extra(raw_prediction)

Adds artificial extra data for reconstructed extras. Currently only used for full_text_ocr.

Parameters:

raw_prediction (dict[str, Any]) – Raw prediction used by the document.

Cropper Extra

class CropperExtra(raw_prediction, page_id=None)

Contains information on the cropping of a prediction.

Parameters:
  • raw_prediction (dict[str, Any])

  • page_id (int | None)

croppings: list[PositionField]

List of all cropping coordinates.

Full-Text OCR Extra

class FullTextOCRExtra(raw_prediction)

Full Text OCR result.

Parameters:

raw_prediction (dict[str, Any])

RAG Extra

class RAGExtra(raw_prediction)

Contains information on the Retrieval-Augmented Generation (RAG) of a prediction.

Parameters:

raw_prediction (dict[str, Any])