File Operations
Crop
- extract_multiple_crops(input_source, crops)
Extracts individual receipts from multi-receipt documents.
- Parameters:
input_source (LocalInputSource) – Local input source to extract sub-receipts from.
crops (list[CropItem]) – List of crops.
- Return type:
- Returns:
Individual extracted receipts as an array of ExtractedImage.
- extract_single_crop(input_source, crop)
Extracts a single crop from the document.
- Parameters:
input_source (LocalInputSource) – Local input source to extract sub-receipts from.
crop (FieldLocation) – Crop to extract.
- Return type:
ExtractedImage
- Returns:
The extracted crop, as an ExtractedImage.
Crop Files
- class CropFiles(iterable=(), /)
Crop files.
- append(object, /)
Append object to the end of the list.
- clear()
Remove all items from list.
- copy()
Return a shallow copy of the list.
- count(value, /)
Return number of occurrences of value.
- extend(iterable, /)
Extend list by appending elements from the iterable.
- index(value, start=0, stop=9223372036854775807, /)
Return first index of value.
Raises ValueError if the value is not present.
- insert(index, object, /)
Insert object before index.
- pop(index=-1, /)
Remove and return item at index (default last).
Raises IndexError if list is empty or index is out of range.
- remove(value, /)
Remove first occurrence of value.
Raises ValueError if the value is not present.
- reverse()
Reverse IN PLACE.
- save_all_to_disk(path, prefix='crop')
Save all extracted crops to disk.
- Parameters:
path (Path | str) – Path to save the extracted crops to.
prefix (str, default: 'crop') – Prefix to add to the filename.
- sort(*, key=None, reverse=False)
Sort the list in ascending order and return None.
The sort is in-place (i.e. the list itself is modified) and stable (i.e. the order of two equal elements is maintained).
If a key function is given, apply it once to each list item and sort them, ascending or descending, according to their function values.
The reverse flag can be set to sort in descending order.
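Because CropFiles is documented as a list subclass, the inherited list methods and save_all_to_disk can be mixed freely. The sketch below illustrates that idea with a stand-in class, not the library's own implementation. DemoCropFiles, the raw-bytes items, the "<prefix>_<NNN>.jpg" naming scheme, and the returned list of paths are all assumptions made for this example.

```python
from pathlib import Path
import tempfile


class DemoCropFiles(list):
    """Hypothetical stand-in for CropFiles: a list subclass holding raw image bytes."""

    def save_all_to_disk(self, path, prefix="crop"):
        # Accept either a str or a Path, as the documented signature does.
        out_dir = Path(path)
        out_dir.mkdir(parents=True, exist_ok=True)
        written = []
        for number, data in enumerate(self, start=1):
            # Assumed naming scheme: "<prefix>_<NNN>.jpg".
            target = out_dir / f"{prefix}_{number:03d}.jpg"
            target.write_bytes(data)
            written.append(target)
        # Returning the paths is a convenience of this sketch only.
        return written


crops = DemoCropFiles([b"first", b"second"])
crops.append(b"third")  # inherited from list
crops.reverse()         # inherited from list, reverses in place

with tempfile.TemporaryDirectory() as tmp:
    saved_names = [p.name for p in crops.save_all_to_disk(tmp, prefix="receipt")]
    first_bytes = (Path(tmp) / saved_names[0]).read_bytes()
```

The items are saved in their current list order, so calling reverse() or sort() beforehand changes which file receives which number in this sketch.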
Split
- extract_multiple_splits(input_source, splits)
Extracts splits as complete PDFs from the document.
- Parameters:
input_source (LocalInputSource) – Input source to split.
splits (list[list[int]]) – List of sub-lists of pages to keep.
- Return type:
- Returns:
A list of extracted invoices.
- extract_single_split(input_source, split)
Extracts a single split as a complete PDF from the document.
- Parameters:
input_source (LocalInputSource) – Input source to split.
split (list[int]) – List of pages to keep.
- Return type:
ExtractedPDF
- Returns:
The extracted PDF.
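The splits argument is a list of page lists (list[list[int]]), and a single split is one such page list. As a rough illustration of how these values might be built, the helper below groups consecutive pages into fixed-size chunks. chunk_pages is a hypothetical helper and not part of the library. Zero-based page indexes are also an assumption, since this page does not state the numbering convention.

```python
def chunk_pages(page_count, pages_per_split):
    """Group page indexes 0..page_count-1 into consecutive sub-lists."""
    if page_count < 0 or pages_per_split < 1:
        raise ValueError("page_count must be >= 0 and pages_per_split >= 1")
    pages = list(range(page_count))  # assumed zero-based page indexes
    return [
        pages[start:start + pages_per_split]
        for start in range(0, page_count, pages_per_split)
    ]


# A 5-page document cut into 2-page pieces; the last piece holds the remainder.
splits = chunk_pages(5, 2)
print(splits)  # [[0, 1], [2, 3], [4]]

# One element of `splits` has the shape expected for a single split.
single_split = splits[0]
```

A list shaped like splits matches the parameter type of extract_multiple_splits, while single_split matches that of extract_single_split.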
Split Files
- class SplitFiles(iterable=(), /)
Split files.
- append(object, /)
Append object to the end of the list.
- clear()
Remove all items from list.
- copy()
Return a shallow copy of the list.
- count(value, /)
Return number of occurrences of value.
- extend(iterable, /)
Extend list by appending elements from the iterable.
- index(value, start=0, stop=9223372036854775807, /)
Return first index of value.
Raises ValueError if the value is not present.
- insert(index, object, /)
Insert object before index.
- pop(index=-1, /)
Remove and return item at index (default last).
Raises IndexError if list is empty or index is out of range.
- remove(value, /)
Remove first occurrence of value.
Raises ValueError if the value is not present.
- reverse()
Reverse IN PLACE.
- save_all_to_disk(path, prefix='split')
Save all extracted splits to disk.
- Parameters:
path (str | Path) – Path to save the extracted splits to.
prefix (str, default: 'split') – Prefix to add to the filename.
- sort(*, key=None, reverse=False)
Sort the list in ascending order and return None.
The sort is in-place (i.e. the list itself is modified) and stable (i.e. the order of two equal elements is maintained).
If a key function is given, apply it once to each list item and sort them, ascending or descending, according to their function values.
The reverse flag can be set to sort in descending order.
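The inherited methods behave exactly as they do on a built-in list. The sketch below demonstrates the stable in-place sort and the documented exceptions on a stand-in subclass. DemoSplitFiles and its string items are placeholders chosen for this example. Real SplitFiles items would be extracted PDFs.

```python
class DemoSplitFiles(list):
    """Hypothetical stand-in: only the inherited list behaviour is shown."""


files = DemoSplitFiles(["b-2", "a-1", "b-1", "a-2"])

# Sort on the first letter only. Items with equal keys keep their
# original relative order because the sort is stable.
result = files.sort(key=lambda name: name[0])
ascending = list(files)   # ['a-1', 'a-2', 'b-2', 'b-1']

# reverse=True sorts in descending key order, still in place and stable.
files.sort(key=lambda name: name[0], reverse=True)
descending = list(files)  # ['b-2', 'b-1', 'a-1', 'a-2']

# index() and remove() raise ValueError when the value is absent.
try:
    files.index("missing")
    index_error = None
except ValueError as exc:
    index_error = type(exc).__name__

# pop() raises IndexError on an empty list.
try:
    DemoSplitFiles().pop()
    pop_error = None
except IndexError as exc:
    pop_error = type(exc).__name__
```

Note that sort() returns None, so result is None here and the sorted order must be read from the list itself.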