Documentation

PdfExtractor

PDF extraction class.

Table of Contents

Properties

$fileName  : string
$pdfBytes  : string

Methods

__construct()  : mixed
extractInvoices()  : array<string|int, ExtractedPdf>
Extracts invoices as complete PDFs from the document.
extractSubDocuments()  : array<string|int, ExtractedPdf>
Extracts sub-documents from the source document using a list of page indexes.
getFileName()  : string
getPageCount()  : int
Wrapper for pdf GetPageCount().

Properties

$pdfBytes

private string $pdfBytes

Byte representation of the file.

Methods

extractInvoices()

Extracts invoices as complete PDFs from the document.

public extractInvoices(array<string|int, mixed>|InvoiceSplitterV1InvoicePageGroups $pageIndexes[, bool $strict = false ]) : array<string|int, ExtractedPdf>
Parameters
$pageIndexes : array<string|int, mixed>|InvoiceSplitterV1InvoicePageGroups

List of sub-lists of pages to keep.

$strict : bool = false

Whether to trust confidence scores or not.

Return values
array<string|int, ExtractedPdf>

A list of extracted invoices.

extractSubDocuments()

Extracts sub-documents from the source document using a list of page indexes.

public extractSubDocuments(array<string|int, mixed>|InvoiceSplitterV1InvoicePageGroups $pageIndexes) : array<string|int, ExtractedPdf>
Parameters
$pageIndexes : array<string|int, mixed>|InvoiceSplitterV1InvoicePageGroups

List of sub-lists of pages to keep.

Tags
throws
MindeePDFException

Throws if FPDF/FPDI wasn't able to handle the PDF during the extraction.

throws
InvalidArgumentException

Throws if invalid indexes are provided.

Return values
array<string|int, ExtractedPdf>

A list of extracted documents.
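
The "list of sub-lists of pages" shape and the InvalidArgumentException condition can be illustrated with a minimal sketch. The helper below is hypothetical and not part of the library: the name assertValidPageGroups, the zero-based indexing, and the exact validation rules are assumptions made for illustration only. It shows one plausible way a caller might check page groups against a known page count before passing them to extractSubDocuments().

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): checks that every group is a
 * non-empty list and that every page index is an integer in [0, $pageCount).
 * Zero-based indexing is an assumption of this sketch.
 *
 * @param array<int|string, mixed> $pageIndexes List of sub-lists of page indexes.
 * @param int                      $pageCount   Number of pages in the source document.
 *
 * @throws InvalidArgumentException If a group or a page index is invalid.
 */
function assertValidPageGroups(array $pageIndexes, int $pageCount): void
{
    if ($pageIndexes === []) {
        throw new InvalidArgumentException('No page indexes provided.');
    }
    foreach ($pageIndexes as $groupNumber => $group) {
        if (!is_array($group) || $group === []) {
            throw new InvalidArgumentException(
                sprintf('Group %s is empty or not a list.', var_export($groupNumber, true))
            );
        }
        foreach ($group as $pageIndex) {
            if (!is_int($pageIndex) || $pageIndex < 0 || $pageIndex >= $pageCount) {
                throw new InvalidArgumentException(
                    sprintf(
                        'Page index %s in group %s is not an integer in [0, %d).',
                        var_export($pageIndex, true),
                        var_export($groupNumber, true),
                        $pageCount
                    )
                );
            }
        }
    }
}

// Example: a 5-page document split into two groups; passes without throwing.
assertValidPageGroups([[0, 1], [2, 3, 4]], 5);
```

In real use, the page count would presumably come from getPageCount(), and the validated groups would then be passed to extractSubDocuments(); whether the library performs equivalent checks internally is not stated on this page.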

getFileName()

public getFileName() : string
Return values
string

The name of the file.

getPageCount()

Wrapper for pdf GetPageCount().

public getPageCount() : int
Tags
throws
MindeePDFException

Throws if FPDI is unable to process the file.

Return values
int

The number of pages in the file.


        