Class: Mindee::Client
- Inherits: Object (Object → Mindee::Client)
- Defined in: lib/mindee/client.rb
Overview
Mindee API Client. See: developers.mindee.com/docs
Instance Method Summary
-
#create_endpoint(endpoint_name: '', account_name: '', version: '') ⇒ Mindee::HTTP::Endpoint
Creates a custom endpoint with the given values.
-
#enqueue(input_source, product_class, endpoint: nil, options: {}) ⇒ Mindee::Parsing::Common::ApiResponse
Enqueue a document for async parsing.
-
#enqueue_and_parse(input_source, product_class, endpoint, options) ⇒ Mindee::Parsing::Common::ApiResponse
Enqueue a document for async parsing and automatically try to retrieve it.
-
#execute_workflow(input_source, workflow_id, options: {}) ⇒ Mindee::Parsing::Common::WorkflowResponse
Sends a document to a workflow.
-
#initialize(api_key: '') ⇒ Client
constructor
A new instance of Client.
-
#load_prediction(product_class, local_response) ⇒ Mindee::Parsing::Common::ApiResponse
Load a prediction.
-
#parse(input_source, product_class, endpoint: nil, options: {}, enqueue: true) ⇒ Mindee::Parsing::Common::ApiResponse
Enqueue a document for parsing and automatically try to retrieve it if needed.
-
#parse_queued(job_id, product_class, endpoint: nil) ⇒ Mindee::Parsing::Common::ApiResponse
Parses a queued document.
-
#source_from_b64string(base64_string, filename, repair_pdf: false) ⇒ Mindee::Input::Source::Base64InputSource
Load a document from a base64 encoded string.
-
#source_from_bytes(input_bytes, filename, repair_pdf: false) ⇒ Mindee::Input::Source::BytesInputSource
Load a document from raw bytes.
-
#source_from_file(input_file, filename, repair_pdf: false) ⇒ Mindee::Input::Source::FileInputSource
Load a document from a normal Ruby File.
-
#source_from_path(input_path, repair_pdf: false) ⇒ Mindee::Input::Source::PathInputSource
Load a document from an absolute path, as a string.
-
#source_from_url(url) ⇒ Mindee::Input::Source::URLInputSource
Load a document from a secure remote source (HTTPS).
Constructor Details
#initialize(api_key: '') ⇒ Client
Returns a new instance of Client.
# File 'lib/mindee/client.rb', line 110
def initialize(api_key: '')
  @api_key = api_key
end
Instance Method Details
#create_endpoint(endpoint_name: '', account_name: '', version: '') ⇒ Mindee::HTTP::Endpoint
Creates a custom endpoint with the given values. Do not set for standard (off the shelf) endpoints.
# File 'lib/mindee/client.rb', line 410
def create_endpoint(endpoint_name: '', account_name: '', version: '')
  initialize_endpoint(
    Mindee::Product::Universal::Universal,
    endpoint_name: endpoint_name,
    account_name: account_name,
    version: version
  )
end
#enqueue(input_source, product_class, endpoint: nil, options: {}) ⇒ Mindee::Parsing::Common::ApiResponse
Enqueue a document for async parsing.
# File 'lib/mindee/client.rb', line 212
def enqueue(input_source, product_class, endpoint: nil, options: {})
  opts = ()
  endpoint ||= initialize_endpoint(product_class)
  logger.debug("Enqueueing document as '#{endpoint.url_root}'")

  prediction, raw_http = endpoint.predict_async(
    input_source,
    opts.all_words,
    opts.full_text,
    opts.close_file,
    opts.cropper
  )
  Mindee::Parsing::Common::ApiResponse.new(product_class, prediction, raw_http)
end
#enqueue_and_parse(input_source, product_class, endpoint, options) ⇒ Mindee::Parsing::Common::ApiResponse
Enqueue a document for async parsing and automatically try to retrieve it.
# File 'lib/mindee/client.rb', line 269
def enqueue_and_parse(input_source, product_class, endpoint, options)
  validate_async_params(options.initial_delay_sec, options.delay_sec, options.max_retries)
  enqueue_res = enqueue(input_source, product_class, endpoint: endpoint, options: options)
  job = enqueue_res.job or raise Errors::MindeeAPIError, 'Expected job to be present'
  job_id = job.id

  sleep(options.initial_delay_sec)
  polling_attempts = 1
  logger.debug("Successfully enqueued document with job id: '#{job_id}'")

  queue_res = parse_queued(job_id, product_class, endpoint: endpoint)
  queue_res_job = queue_res.job or raise Errors::MindeeAPIError, 'Expected job to be present'
  valid_statuses = [
    Mindee::Parsing::Common::JobStatus::WAITING,
    Mindee::Parsing::Common::JobStatus::PROCESSING,
  ]
  # @type var valid_statuses: Array[(:waiting | :processing | :completed | :failed)]
  while valid_statuses.include?(queue_res_job.status) && polling_attempts < options.max_retries
    logger.debug("Polling server for parsing result with job id: '#{job_id}'. Attempt #{polling_attempts}")
    sleep(options.delay_sec)
    queue_res = parse_queued(job_id, product_class, endpoint: endpoint)
    queue_res_job = queue_res.job or raise Errors::MindeeAPIError, 'Expected job to be present'
    polling_attempts += 1
  end

  if queue_res_job.status != Mindee::Parsing::Common::JobStatus::COMPLETED
    elapsed = options.initial_delay_sec + (polling_attempts * options.delay_sec.to_f)
    raise Errors::MindeeAPIError,
          "Asynchronous parsing request timed out after #{elapsed} seconds (#{polling_attempts} tries)"
  end

  queue_res
end
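The polling pattern used by enqueue_and_parse can be tried in isolation. The following is a minimal, self-contained sketch, not the library's code: `poll_until_done` and the `fetcher` lambda are hypothetical stand-ins for the method and for `parse_queued`, the statuses are plain symbols, and the delays are set to zero so the example runs instantly.

```ruby
# Hypothetical stand-alone sketch of the polling pattern; nothing here calls the Mindee API.
# fetch_status stands in for parse_queued and returns a status symbol.
def poll_until_done(fetch_status, initial_delay_sec:, delay_sec:, max_retries:)
  pending = %i[waiting processing]
  sleep(initial_delay_sec)
  attempts = 1
  status = fetch_status.call
  # Keep polling while the job is still pending and the retry budget is not spent.
  while pending.include?(status) && attempts < max_retries
    sleep(delay_sec)
    status = fetch_status.call
    attempts += 1
  end
  raise "timed out after #{attempts} tries" unless status == :completed

  attempts
end

# Simulated job that completes on the third status check.
statuses = %i[waiting processing completed]
fetcher = -> { statuses.shift }
tries = poll_until_done(fetcher, initial_delay_sec: 0, delay_sec: 0, max_retries: 5)
```

As in the method above, the first status check counts as attempt 1, so `max_retries` bounds the total number of status checks rather than the number of extra retries.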
#execute_workflow(input_source, workflow_id, options: {}) ⇒ Mindee::Parsing::Common::WorkflowResponse
Sends a document to a workflow.
Accepts options either as a Hash or as a WorkflowOptions struct.
… requiring authentication.
* page_options [Hash, nil] Page cutting/merge options:
  * :page_indexes Zero-based list of page indexes.
  * :operation Operation to apply on the document, given the page_indexes specified:
    * :KEEP_ONLY - keep only the specified pages, and remove all others.
    * :REMOVE - remove the specified pages, and keep all others.
  * :on_min_pages Apply the operation only if the document has at least this many pages.
# File 'lib/mindee/client.rb', line 323
def execute_workflow(input_source, workflow_id, options: {})
  opts = options.is_a?(WorkflowOptions) ? options : WorkflowOptions.new(params: options)
  if opts.respond_to?(:page_options) && input_source.is_a?(Input::Source::LocalInputSource)
    process_pdf_if_required(input_source, opts)
  end

  workflow_endpoint = Mindee::HTTP::WorkflowEndpoint.new(workflow_id, api_key: @api_key)
  logger.debug("Sending document to workflow '#{workflow_id}'")

  prediction, raw_http = workflow_endpoint.execute_workflow(
    input_source,
    opts
  )

  Mindee::Parsing::Common::WorkflowResponse.new(Product::Universal::Universal, prediction, raw_http)
end
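The page options documented for execute_workflow can be illustrated with plain arrays. This is a sketch under stated assumptions, not the library's PDF processing: `remaining_pages` is a hypothetical helper that only computes which zero-based page indexes would survive, following the descriptions of :KEEP_ONLY, :REMOVE and :on_min_pages given above.

```ruby
# Hypothetical helper: returns the page indexes left after applying the
# described page options to a document with page_count pages.
def remaining_pages(page_count, page_indexes:, operation:, on_min_pages: 0)
  all_pages = (0...page_count).to_a
  # The operation is skipped when the document has fewer pages than on_min_pages.
  return all_pages if page_count < on_min_pages

  case operation
  when :KEEP_ONLY then all_pages & page_indexes
  when :REMOVE then all_pages - page_indexes
  else raise ArgumentError, "unknown operation: #{operation}"
  end
end

kept = remaining_pages(5, page_indexes: [0, 1], operation: :KEEP_ONLY)
removed = remaining_pages(5, page_indexes: [0, 1], operation: :REMOVE)
skipped = remaining_pages(2, page_indexes: [0], operation: :REMOVE, on_min_pages: 3)
```

Here `kept` holds pages 0 and 1, `removed` holds pages 2 to 4, and `skipped` keeps both pages because the two-page document is below the `on_min_pages` threshold.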
#load_prediction(product_class, local_response) ⇒ Mindee::Parsing::Common::ApiResponse
Load a prediction.
# File 'lib/mindee/client.rb', line 346
def load_prediction(product_class, local_response)
  raise Errors::MindeeAPIError, 'Expected LocalResponse to not be nil.' if local_response.nil?

  response_hash = local_response.as_hash || {}
  raise Errors::MindeeAPIError, 'Expected LocalResponse#as_hash to return a hash.' if response_hash.nil?

  Mindee::Parsing::Common::ApiResponse.new(product_class, response_hash, response_hash.to_json)
rescue KeyError, Errors::MindeeAPIError
  raise Errors::MindeeInputError, 'No prediction found in local response.'
end
#parse(input_source, product_class, endpoint: nil, options: {}, enqueue: true) ⇒ Mindee::Parsing::Common::ApiResponse
Enqueue a document for parsing and automatically try to retrieve it if needed.
Accepts options either as a Hash or as a ParseOptions struct.
# File 'lib/mindee/client.rb', line 141
def parse(input_source, product_class, endpoint: nil, options: {}, enqueue: true)
  opts = ()
  process_pdf_if_required(input_source, opts) if input_source.is_a?(Input::Source::LocalInputSource)
  endpoint ||= initialize_endpoint(product_class)

  if enqueue && product_class.has_async
    enqueue_and_parse(input_source, product_class, endpoint, opts)
  else
    parse_sync(input_source, product_class, endpoint, opts)
  end
end
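Accepting options "either as a Hash or as a struct" is a common Ruby pattern that can be sketched without the library. In the example below, `demo_options`, `defaults` and `normalize` are hypothetical names; the field names echo attributes read in the source above, but the default values are illustrative assumptions and not ParseOptions' real defaults.

```ruby
# Hypothetical stand-in for an options struct (not Mindee's ParseOptions).
demo_options = Struct.new(:all_words, :full_text, :max_retries, keyword_init: true)
# Illustrative defaults only; the library's real defaults are not shown in this page.
defaults = { all_words: false, full_text: false, max_retries: 10 }

# Pass a ready-made struct through unchanged; build one from a Hash of overrides otherwise.
normalize = lambda do |options|
  options.is_a?(demo_options) ? options : demo_options.new(**defaults.merge(options))
end

from_hash = normalize.call({ full_text: true })
preset = demo_options.new(**defaults.merge(max_retries: 3))
from_struct = normalize.call(preset)
```

Normalizing once at the entry point lets the rest of the code read attributes such as `opts.full_text` without caring which form the caller used.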
#parse_queued(job_id, product_class, endpoint: nil) ⇒ Mindee::Parsing::Common::ApiResponse
Parses a queued document.
The endpoint doesn't need to be set in the case of OTS (off-the-shelf) APIs.
# File 'lib/mindee/client.rb', line 235
def parse_queued(job_id, product_class, endpoint: nil)
  endpoint = initialize_endpoint(product_class) if endpoint.nil?
  logger.debug("Fetching queued document as '#{endpoint.url_root}'")
  prediction, raw_http = endpoint.parse_async(job_id)
  Mindee::Parsing::Common::ApiResponse.new(product_class, prediction, raw_http)
end
#source_from_b64string(base64_string, filename, repair_pdf: false) ⇒ Mindee::Input::Source::Base64InputSource
Load a document from a base64 encoded string.
# File 'lib/mindee/client.rb', line 379
def source_from_b64string(base64_string, filename, repair_pdf: false)
  Input::Source::Base64InputSource.new(base64_string, filename, repair_pdf: repair_pdf)
end
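A base64 string suitable for source_from_b64string can be produced with core Ruby alone. The sketch below uses hypothetical sample bytes in place of a real file, and the client call appears only as a comment because it needs the gem and a configured client.

```ruby
# Raw bytes standing in for a real file's contents (hypothetical sample data).
raw_bytes = '%PDF-1.4 sample'.b

# 'm0' is strict base64 (no line breaks), the same output as Base64.strict_encode64.
base64_string = [raw_bytes].pack('m0')

# Decoding gives back the original bytes.
decoded = base64_string.unpack1('m0')

# With a configured client, the string would be passed together with a filename:
#   source = client.source_from_b64string(base64_string, 'sample.pdf')
```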
#source_from_bytes(input_bytes, filename, repair_pdf: false) ⇒ Mindee::Input::Source::BytesInputSource
Load a document from raw bytes.
# File 'lib/mindee/client.rb', line 370
def source_from_bytes(input_bytes, filename, repair_pdf: false)
  Input::Source::BytesInputSource.new(input_bytes, filename, repair_pdf: repair_pdf)
end
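Raw bytes for source_from_bytes are typically obtained by reading a file in binary mode. This sketch writes hypothetical sample content to a temporary file and reads it back with `File.binread`; the client call is shown only as a comment.

```ruby
require 'tempfile'

bytes = nil
Tempfile.create(['sample', '.pdf']) do |file|
  file.binmode
  file.write('%PDF-1.4 sample') # hypothetical sample content
  file.flush
  # File.binread returns the content as a binary (ASCII-8BIT) string.
  bytes = File.binread(file.path)
end

# With a configured client, the bytes would be passed together with a filename:
#   source = client.source_from_bytes(bytes, 'sample.pdf')
```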
#source_from_file(input_file, filename, repair_pdf: false) ⇒ Mindee::Input::Source::FileInputSource
Load a document from a normal Ruby File.
# File 'lib/mindee/client.rb', line 388
def source_from_file(input_file, filename, repair_pdf: false)
  Input::Source::FileInputSource.new(input_file, filename, repair_pdf: repair_pdf)
end
#source_from_path(input_path, repair_pdf: false) ⇒ Mindee::Input::Source::PathInputSource
Load a document from an absolute path, as a string.
# File 'lib/mindee/client.rb', line 361
def source_from_path(input_path, repair_pdf: false)
  Input::Source::PathInputSource.new(input_path, repair_pdf: repair_pdf)
end
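Since source_from_path is documented as taking an absolute path string, a relative path can be resolved first with `File.expand_path`. The directory and file names below are hypothetical, and the result shown assumes a POSIX system.

```ruby
# Resolve a relative path against a base directory to get an absolute path string.
relative = 'invoices/sample.pdf' # hypothetical file location
absolute = File.expand_path(relative, '/data')
# => "/data/invoices/sample.pdf"

# With a configured client, the absolute path would then be passed as-is:
#   source = client.source_from_path(absolute)
```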
#source_from_url(url) ⇒ Mindee::Input::Source::URLInputSource
Load a document from a secure remote source (HTTPS).
# File 'lib/mindee/client.rb', line 395
def source_from_url(url)
  Input::Source::URLInputSource.new(url)
end
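Because source_from_url expects a secure (HTTPS) source, a caller may want to check a URL before handing it over. The helper below, `https_url?`, is a hypothetical pre-check built on Ruby's standard `uri` library; it is not the library's own validation, which this page does not show.

```ruby
require 'uri'

# Hypothetical pre-check: true only for well-formed HTTPS URLs.
def https_url?(url)
  URI.parse(url).is_a?(URI::HTTPS)
rescue URI::InvalidURIError
  false
end

secure = https_url?('https://example.com/invoice.pdf')
insecure = https_url?('http://example.com/invoice.pdf')
malformed = https_url?('not a url')

# With a configured client, a URL that passes the check would then be loaded:
#   source = client.source_from_url('https://example.com/invoice.pdf')
```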