Package com.mindee.pdf
Class PDFUtils
java.lang.Object
    com.mindee.pdf.PDFUtils
public final class PDFUtils extends Object
Utilities for working with PDFs.
-
-
Method Summary
static void addImageToPage(org.apache.pdfbox.pdmodel.PDPageContentStream contentStream, org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject pdImage, org.apache.pdfbox.pdmodel.common.PDRectangle pageSize)
static byte[] documentToBytes(org.apache.pdfbox.pdmodel.PDDocument document)
static void extractAndAddText(org.apache.pdfbox.pdmodel.PDDocument inputDoc, org.apache.pdfbox.pdmodel.PDPageContentStream contentStream, int pageIndex, boolean disableSourceText)
static int getNumberOfPages(LocalInputSource inputSource)
    Get the number of pages in the PDF.
static boolean isPdfEmpty(File file)
static byte[] mergePdfPages(File file, List<Integer> pageNumbers)
    Merge specified PDF pages together.
static byte[] mergePdfPages(org.apache.pdfbox.pdmodel.PDDocument document, List<Integer> pageNumbers)
static byte[] mergePdfPages(org.apache.pdfbox.pdmodel.PDDocument document, List<Integer> pageNumbers, boolean closeOriginal)
static PdfPageImage pdfPageToImage(LocalInputSource source, int pageNumber)
    Render a single page of a PDF as an image.
static PdfPageImage pdfPageToImage(String filePath, int pageNumber)
    Render a single page of a PDF as an image.
static List<PdfPageImage> pdfToImages(LocalInputSource source)
    Render all pages of a PDF as images.
static List<PdfPageImage> pdfToImages(String filePath)
    Render all pages of a PDF as images.
-
-
-
Method Detail
-
getNumberOfPages
public static int getNumberOfPages(LocalInputSource inputSource) throws IOException
Get the number of pages in the PDF.
- Parameters:
inputSource - The PDF file.
- Throws:
IOException
-
mergePdfPages
public static byte[] mergePdfPages(File file, List<Integer> pageNumbers) throws IOException
Merge specified PDF pages together.
- Parameters:
file - The PDF file.
pageNumbers - List of page numbers to merge together.
- Throws:
IOException
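The pageNumbers argument is a plain List<Integer>. A minimal sketch of building one from a contiguous range follows. The PageSelection class, its range helper, and the file name in the comment are hypothetical names introduced for illustration and are not part of this package. This page does not state whether mergePdfPages counts pages from 0 or from 1, so check that before relying on the values.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

/** Hypothetical helper for illustration; not part of com.mindee.pdf. */
public class PageSelection {

    /** Returns the page numbers first..last (both inclusive) as a list. */
    public static List<Integer> range(int first, int last) {
        if (first > last) {
            throw new IllegalArgumentException("first must not exceed last");
        }
        return IntStream.rangeClosed(first, last)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> pages = range(2, 4);
        System.out.println(pages);
        // With the library on the classpath, the list could then be passed on.
        // The call is left as a comment so this sketch stays self-contained:
        // byte[] merged = PDFUtils.mergePdfPages(new File("document.pdf"), pages);
    }
}
```

Non-contiguous selections work the same way, since any List<Integer> matches the documented signature.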
-
mergePdfPages
public static byte[] mergePdfPages(org.apache.pdfbox.pdmodel.PDDocument document, List<Integer> pageNumbers) throws IOException
- Throws:
IOException
-
mergePdfPages
public static byte[] mergePdfPages(org.apache.pdfbox.pdmodel.PDDocument document, List<Integer> pageNumbers, boolean closeOriginal) throws IOException
- Throws:
IOException
-
isPdfEmpty
public static boolean isPdfEmpty(File file) throws IOException
- Throws:
IOException
-
pdfToImages
public static List<PdfPageImage> pdfToImages(String filePath) throws IOException
Render all pages of a PDF as images. Converting PDFs with hundreds of pages may result in a heap space error; for such documents, consider rendering one page at a time with pdfPageToImage.
- Parameters:
filePath - The path to the PDF file.
- Returns:
List of all pages as images.
- Throws:
IOException
-
pdfToImages
public static List<PdfPageImage> pdfToImages(LocalInputSource source) throws IOException
Render all pages of a PDF as images. Converting PDFs with hundreds of pages may result in a heap space error; for such documents, consider rendering one page at a time with pdfPageToImage.
- Parameters:
source - The PDF file.
- Returns:
List of all pages as images.
- Throws:
IOException
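A rough size estimate shows why holding every rendered page at once can exhaust the heap. The sketch below is back-of-the-envelope arithmetic only. The RenderMemoryEstimate class is hypothetical, and the 150 DPI resolution, 4 bytes per pixel, and roughly A4 page size (595 x 842 points) are assumptions chosen for illustration, not values taken from this library.

```java
/** Hypothetical estimator for illustration; not part of com.mindee.pdf. */
public class RenderMemoryEstimate {

    /** PDF page dimensions are expressed in points, 72 per inch by default. */
    private static final double POINTS_PER_INCH = 72.0;

    /** Estimates the uncompressed size in bytes of one page rendered as a bitmap. */
    public static long bytesPerPage(double widthPt, double heightPt, int dpi, int bytesPerPixel) {
        long widthPx = Math.round(widthPt * dpi / POINTS_PER_INCH);
        long heightPx = Math.round(heightPt * dpi / POINTS_PER_INCH);
        return widthPx * heightPx * bytesPerPixel;
    }

    public static void main(String[] args) {
        // Assumed values: A4-sized page, 150 DPI, 4 bytes per pixel, 300 pages.
        long perPage = bytesPerPage(595, 842, 150, 4);
        long total = perPage * 300;
        System.out.println("per page: " + perPage + " bytes");
        System.out.println("300 pages: " + total + " bytes");
    }
}
```

Under these assumptions one page takes about 8.7 million bytes uncompressed, so 300 pages would need roughly 2.6 billion bytes. The real footprint depends on how the library renders and stores each PdfPageImage, which this page does not describe.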
-
pdfPageToImage
public static PdfPageImage pdfPageToImage(String filePath, int pageNumber) throws IOException
Render a single page of a PDF as an image. The main use case is processing PDFs with hundreds of pages. If you only need to render some pages of the PDF, use mergePdfPages and then pdfToImages.
- Parameters:
filePath - The path to the PDF file.
pageNumber - The page number to render; the first page is 1.
- Returns:
The page as an image.
- Throws:
IOException
-
pdfPageToImage
public static PdfPageImage pdfPageToImage(LocalInputSource source, int pageNumber) throws IOException
Render a single page of a PDF as an image. The main use case is processing PDFs with hundreds of pages. If you only need to render some pages of the PDF, use mergePdfPages and then pdfToImages.
- Parameters:
source - The PDF file.
pageNumber - The page number to render; the first page is 1.
- Returns:
The page as an image.
- Throws:
IOException
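For large documents, the page-at-a-time approach described above can be sketched as a loop over 1-based page numbers. The PageByPage class and its PageRenderer interface are hypothetical stand-ins introduced so the example runs without the library. A fake renderer takes the place of the real call.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical driver for illustration; not part of com.mindee.pdf. */
public class PageByPage {

    /** Stand-in for a call such as PDFUtils.pdfPageToImage(filePath, pageNumber). */
    @FunctionalInterface
    public interface PageRenderer<T> {
        T render(int pageNumber) throws IOException;
    }

    /**
     * Renders pages 1..pageCount one at a time and passes each result to the
     * handler before rendering the next one. Returns the number of pages handled.
     */
    public static <T> int forEachPage(int pageCount, PageRenderer<T> renderer, Consumer<T> handler)
            throws IOException {
        if (pageCount < 0) {
            throw new IllegalArgumentException("pageCount must not be negative");
        }
        for (int pageNumber = 1; pageNumber <= pageCount; pageNumber++) {
            handler.accept(renderer.render(pageNumber));
        }
        return pageCount;
    }

    public static void main(String[] args) throws IOException {
        // A fake renderer keeps the sketch runnable without the library.
        List<String> seen = new ArrayList<>();
        PageRenderer<String> fakeRenderer = pageNumber -> "page-" + pageNumber;
        Consumer<String> collect = seen::add;
        int handled = forEachPage(3, fakeRenderer, collect);
        System.out.println(handled + " pages: " + seen);
    }
}
```

In real use the renderer could be a lambda such as pageNumber -> PDFUtils.pdfPageToImage(filePath, pageNumber), with the page count taken from getNumberOfPages. Both methods are documented in this class, but wiring them together this way is an assumption of the sketch. Memory stays bounded only if the handler does not keep a reference to every image it receives.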
-
documentToBytes
public static byte[] documentToBytes(org.apache.pdfbox.pdmodel.PDDocument document) throws IOException
- Throws:
IOException
-
extractAndAddText
public static void extractAndAddText(org.apache.pdfbox.pdmodel.PDDocument inputDoc, org.apache.pdfbox.pdmodel.PDPageContentStream contentStream, int pageIndex, boolean disableSourceText) throws IOException
- Throws:
IOException
-
addImageToPage
public static void addImageToPage(org.apache.pdfbox.pdmodel.PDPageContentStream contentStream, org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject pdImage, org.apache.pdfbox.pdmodel.common.PDRectangle pageSize) throws IOException
- Throws:
IOException
-
-