Mindee Client
Client
- class Client(api_key='')
Mindee API Client.
See: https://developers.mindee.com/docs/
- Parameters:
api_key (str) –
- create_endpoint(endpoint_name, account_name='mindee', version=None)
Add a custom endpoint, created using the Mindee API Builder.
- Parameters:
endpoint_name (str) – The “API name” field in the “Settings” page of the API Builder.
account_name (str, default: 'mindee') – Your organization’s username on the API Builder.
version (Optional[str], default: None) – If set, locks the version of the model to use. If not set, the latest version of the model is used.
- Return type:
- enqueue(product_class, input_source, include_words=False, close_file=True, page_options=None, cropper=False, endpoint=None, full_text=False)
Enqueues a document to an asynchronous endpoint.
- Parameters:
product_class (Type[Inference]) – The document class to use. The response object will be instantiated based on this parameter.
input_source (Union[LocalInputSource, UrlInputSource]) – The document/source file to use. Has to be created beforehand.
include_words (bool, default: False) – Whether to include the full text for each page. This performs a full OCR operation on the server and will increase response time.
close_file (bool, default: True) – Whether to close() the file after parsing it. Set to False if you need to access the file after this operation.
page_options (Optional[PageOptions], default: None) – If set, remove pages from the document as specified. This is done before sending the file to the server. It is useful to avoid page limitations.
cropper (bool, default: False) – Whether to include cropper results for each page. This performs a cropping operation on the server and will increase response time.
endpoint (Optional[Endpoint], default: None) – For custom endpoints, an endpoint has to be given.
full_text (bool, default: False) – Whether to include the full OCR text response in compatible APIs.
- Return type:
- enqueue_and_parse(product_class, input_source, include_words=False, close_file=True, page_options=None, cropper=False, endpoint=None, initial_delay_sec=2, delay_sec=1.5, max_retries=30, full_text=False)
Enqueues to an asynchronous endpoint and automatically polls for a response.
- Parameters:
product_class (Type[Inference]) – The document class to use. The response object will be instantiated based on this parameter.
input_source (Union[LocalInputSource, UrlInputSource]) – The document/source file to use. Has to be created beforehand.
include_words (bool, default: False) – Whether to include the full text for each page. This performs a full OCR operation on the server and will increase response time.
close_file (bool, default: True) – Whether to close() the file after parsing it. Set to False if you need to access the file after this operation.
page_options (Optional[PageOptions], default: None) – If set, remove pages from the document as specified. This is done before sending the file to the server. It is useful to avoid page limitations.
cropper (bool, default: False) – Whether to include cropper results for each page. This performs a cropping operation on the server and will increase response time.
endpoint (Optional[Endpoint], default: None) – For custom endpoints, an endpoint has to be given.
initial_delay_sec (float, default: 2) – Delay before the first polling attempt. This should not be shorter than 1 second.
delay_sec (float, default: 1.5) – Delay between polling attempts. This should not be shorter than 1 second.
max_retries (int, default: 30) – Total number of polling attempts.
full_text (bool, default: False) – Whether to include the full OCR text response in compatible APIs.
- Return type:
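The interplay of initial_delay_sec, delay_sec and max_retries can be illustrated with a small standalone sketch. It is not the library's implementation: poll_until_done and its fetch and sleep arguments are hypothetical names introduced here, and the order of operations (wait once, then poll up to max_retries times with a fixed pause in between) is an assumption based on the parameter descriptions above.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until_done(
    fetch: Callable[[], Optional[T]],
    initial_delay_sec: float = 2,
    delay_sec: float = 1.5,
    max_retries: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Poll `fetch` until it returns something other than None.

    `fetch` is a hypothetical callable standing in for "ask the server
    whether the queued document is ready"; it returns None while pending.
    """
    # Give the server some time before the first attempt.
    sleep(initial_delay_sec)
    for _ in range(max_retries):
        result = fetch()
        if result is not None:
            return result
        # Not ready yet: pause before the next attempt.
        sleep(delay_sec)
    raise TimeoutError(f"no result after {max_retries} polling attempts")
```

Under these assumptions, the default values allow for roughly 2 + 30 × 1.5 = 47 seconds of waiting before the sketch gives up.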
- load_prediction(product_class, local_response)
Load a prediction.
- Parameters:
product_class (Type[Inference]) – Class of the product to use.
local_response (LocalResponse) – Local response to load.
- Return type:
Union[AsyncPredictResponse, PredictResponse]
- Returns:
A valid prediction.
- parse(product_class, input_source, include_words=False, close_file=True, page_options=None, cropper=False, endpoint=None, full_text=False)
Call prediction API on the document and parse the results.
- Parameters:
product_class (Type[Inference]) – The document class to use. The response object will be instantiated based on this parameter.
input_source (Union[LocalInputSource, UrlInputSource]) – The document/source file to use. Has to be created beforehand.
include_words (bool, default: False) – Whether to include the full text for each page. This performs a full OCR operation on the server and will increase response time. Only available on financial document APIs.
close_file (bool, default: True) – Whether to close() the file after parsing it. Set to False if you need to access the file after this operation.
page_options (Optional[PageOptions], default: None) – If set, remove pages from the document as specified. This is done before sending the file to the server. It is useful to avoid page limitations.
cropper (bool, default: False) – Whether to include cropper results for each page. This performs a cropping operation on the server and will increase response time.
endpoint (Optional[Endpoint], default: None) – For custom endpoints, an endpoint has to be given.
full_text (bool, default: False) – Whether to include the full OCR text response in compatible APIs.
- Return type:
PredictResponse
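Since input_source "has to be created beforehand", a synchronous call is a two-step affair: build the source, then parse it. The helper below is a minimal sketch of that sequence. parse_from_path is a hypothetical name, and the client argument is assumed to be any object exposing source_from_path() and parse() with the signatures documented on this page.

```python
from typing import Any


def parse_from_path(client: Any, product_class: Any, path: str) -> Any:
    """Create an input source from `path`, then run a synchronous parse.

    `client` is assumed to behave like the Client documented here;
    `product_class` is whichever Inference subclass matches the document.
    """
    # Step 1: the input source must exist before parse() is called.
    input_source = client.source_from_path(path)
    # Step 2: parse. include_words and cropper both add server-side work
    # and increase response time, so this sketch leaves them off.
    return client.parse(
        product_class,
        input_source,
        include_words=False,
        close_file=True,
        cropper=False,
    )
```

Passing the client in as an argument keeps the helper independent of how the client was constructed, which also makes it easy to exercise with a stand-in object.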
- parse_queued(product_class, queue_id, endpoint=None)
Parses a queued document.
- Parameters:
product_class (Type[Inference]) – The document class to use. The response object will be instantiated based on this parameter.
queue_id (str) – queue_id received from the API.
endpoint (Optional[Endpoint], default: None) – For custom endpoints, an endpoint has to be given.
- Return type:
- send_feedback(product_class, document_id, feedback, endpoint=None)
Send feedback for a document.
- Parameters:
product_class (Type[Inference]) – The document class to use. The response object will be instantiated based on this parameter.
document_id (str) – The id of the document to send feedback to.
feedback (Dict[str, Any]) – Feedback to send.
endpoint (Optional[Endpoint], default: None) – For custom endpoints, an endpoint has to be given.
- Return type:
- source_from_b64string(input_string, filename, fix_pdf=False)
Load a document from a base64 encoded string.
- Parameters:
input_string (str) – Input to parse, as a base64 string.
filename (str) – The name of the file (without the path).
fix_pdf (bool, default: False) – Whether to attempt fixing PDF files before sending. Setting this to True can modify the data sent to Mindee.
- Return type:
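Because input_string is documented as a str, raw file bytes need to be base64-encoded and then decoded to text before being handed over. The following sketch uses only the Python standard library; to_b64_string is a hypothetical helper name, not part of the Mindee client.

```python
import base64


def to_b64_string(raw: bytes) -> str:
    """Encode raw bytes as an ASCII base64 string."""
    # b64encode() returns bytes; decode so the result is a str,
    # which is the type documented for input_string.
    return base64.b64encode(raw).decode("ascii")


# Round-trip check on a few bytes resembling the start of a PDF file.
sample = b"%PDF-1.4"
encoded = to_b64_string(sample)
assert isinstance(encoded, str)
assert base64.b64decode(encoded) == sample
```

The resulting string would then be passed as input_string, together with a filename, to source_from_b64string.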
- source_from_bytes(input_bytes, filename, fix_pdf=False)
Load a document from raw bytes.
- Parameters:
input_bytes (bytes) – Raw byte input.
filename (str) – The name of the file (without the path).
fix_pdf (bool, default: False) – Whether to attempt fixing PDF files before sending. Setting this to True can modify the data sent to Mindee.
- Return type:
- source_from_file(input_file, fix_pdf=False)
Load a document from a normal Python file object/handle.
- Parameters:
input_file (BinaryIO) – Input file handle.
fix_pdf (bool, default: False) – Whether to attempt fixing PDF files before sending. Setting this to True can modify the data sent to Mindee.
- Return type:
- source_from_path(input_path, fix_pdf=False)
Load a document from an absolute path, as a string.
- Parameters:
input_path (Union[Path, str]) – Path of the file to open.
fix_pdf (bool, default: False) – Whether to attempt fixing PDF files before sending. Setting this to True can modify the data sent to Mindee.
- Return type:
- source_from_url(url)
Load a document from a URL.
- Parameters:
url (str) – URL of the document to load.
- Return type: