PdfExtractor

PDF extraction class.

Table of Contents

Properties

$fileName  : string
$pdfBytes  : string

Methods

__construct()  : mixed
extractInvoices()  : array<string|int, mixed>
Extracts invoices as complete PDFs from the document.
extractSubDocuments()  : array<string|int, mixed>
Extracts sub-documents from the source document using a list of page indexes.
getFileName()  : string
getPageCount()  : int
Wrapper for PDF GetPageCount().

Properties

$pdfBytes

private string $pdfBytes

Bytes representation of a file.

Methods

extractInvoices()

Extracts invoices as complete PDFs from the document.

public extractInvoices(array<string|int, mixed> $pageIndexes[, bool $strict = false ]) : array<string|int, mixed>
Parameters
$pageIndexes : array<string|int, mixed>

List of sub-lists of pages to keep.

$strict : bool = false

Whether to trust only confidence scores of 1.0.

Return values
array<string|int, mixed>

A list of extracted invoices.

extractSubDocuments()

Extracts sub-documents from the source document using a list of page indexes.

public extractSubDocuments(array<string|int, mixed> $pageIndexes) : array<string|int, mixed>
Parameters
$pageIndexes : array<string|int, mixed>

List of sub-lists of page indexes to keep.

Tags
throws
MindeePDFException

Throws if FPDF/FPDI was unable to handle the PDF during the extraction.

throws
InvalidArgumentException

Throws if invalid indexes are provided.

Return values
array<string|int, mixed>

List of extracted documents.
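
The shape of $pageIndexes described above (a list of sub-lists) can be illustrated with a minimal sketch. The helper assertValidPageIndexes() is hypothetical and not part of this class; the 0-based indexing and the specific checks are assumptions made for illustration, not the library's actual validation logic.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (NOT part of PdfExtractor): checks that every index in
 * a list of sub-lists is an integer inside [0, $pageCount).
 * Assumption: page indexes are 0-based.
 *
 * @param array<int, mixed> $pageIndexes List of sub-lists of page indexes.
 * @param int               $pageCount   Number of pages in the source file.
 *
 * @throws InvalidArgumentException If a group or an index is invalid.
 */
function assertValidPageIndexes(array $pageIndexes, int $pageCount): void
{
    if ($pageIndexes === []) {
        throw new InvalidArgumentException('No page indexes provided.');
    }
    foreach ($pageIndexes as $groupNumber => $group) {
        // Each entry must itself be a non-empty list of indexes.
        if (!is_array($group) || $group === []) {
            throw new InvalidArgumentException(
                "Group $groupNumber must be a non-empty list."
            );
        }
        foreach ($group as $index) {
            if (!is_int($index) || $index < 0 || $index >= $pageCount) {
                throw new InvalidArgumentException(
                    "Invalid page index in group $groupNumber."
                );
            }
        }
    }
}

// Three sub-documents taken from a 5-page source: pages 0-1, page 2, pages 3-4.
$pageIndexes = [[0, 1], [2], [3, 4]];
assertValidPageIndexes($pageIndexes, 5);
```

Under these assumptions, each sub-list would correspond to one extracted document, and a check of this kind could be run against the value returned by getPageCount() before calling extractSubDocuments().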

getFileName()

public getFileName() : string
Return values
string

Name of the file.

getPageCount()

Wrapper for PDF GetPageCount().

public getPageCount() : int
Tags
throws
MindeePDFException

Throws if FPDI is unable to process the file.

Return values
int

The number of pages in the file.


        