Package com.mindee.pdf
Class PDFUtils
- java.lang.Object
  - com.mindee.pdf.PDFUtils
public final class PDFUtils extends Object
Utilities for working with PDFs.
-
-
Method Summary
Modifier and Type | Method | Description
static void | addImageToPage(org.apache.pdfbox.pdmodel.PDPageContentStream contentStream, org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject pdImage, org.apache.pdfbox.pdmodel.common.PDRectangle pageSize) |
static byte[] | documentToBytes(org.apache.pdfbox.pdmodel.PDDocument document) |
static void | extractAndAddText(org.apache.pdfbox.pdmodel.PDDocument inputDoc, org.apache.pdfbox.pdmodel.PDPageContentStream contentStream, int pageIndex, boolean disableSourceText) |
static int | getNumberOfPages(byte[] pdfBytes) | Get the number of pages in the PDF.
static int | getNumberOfPages(LocalInputSource inputSource) | Get the number of pages in the PDF.
static boolean | isPdfEmpty(File file) |
static byte[] | mergePdfPages(File file, List<Integer> pageNumbers) | Merge specified PDF pages together.
static byte[] | mergePdfPages(org.apache.pdfbox.pdmodel.PDDocument document, List<Integer> pageNumbers) |
static byte[] | mergePdfPages(org.apache.pdfbox.pdmodel.PDDocument document, List<Integer> pageNumbers, boolean closeOriginal) |
static PdfPageImage | pdfPageToImage(LocalInputSource source, int pageNumber) | Render a single page of a PDF as an image.
static PdfPageImage | pdfPageToImage(String filePath, int pageNumber) | Render a single page of a PDF as an image.
static List<PdfPageImage> | pdfToImages(LocalInputSource source) | Render all pages of a PDF as images.
static List<PdfPageImage> | pdfToImages(String filePath) | Render all pages of a PDF as images.
-
-
-
Method Detail
-
getNumberOfPages
public static int getNumberOfPages(LocalInputSource inputSource) throws IOException
Get the number of pages in the PDF.
- Parameters:
inputSource - The PDF file.
- Throws:
IOException
-
getNumberOfPages
public static int getNumberOfPages(byte[] pdfBytes) throws IOException
Get the number of pages in the PDF.
- Parameters:
pdfBytes - The PDF file as a byte array.
- Throws:
IOException
-
mergePdfPages
public static byte[] mergePdfPages(File file, List<Integer> pageNumbers) throws IOException
Merge specified PDF pages together.
- Parameters:
file - The PDF file.
pageNumbers - List of page numbers to merge together.
- Throws:
IOException
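A minimal sketch of building the `pageNumbers` argument for a contiguous range of pages. The names `PageRangeSketch` and `pageRange` are hypothetical and not part of the library. This page does not state whether `mergePdfPages` expects 0-based or 1-based page numbers, so the bounds are left to the caller.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class PageRangeSketch {

    /** Returns the integers from first to last, both inclusive, as a list. */
    public static List<Integer> pageRange(int first, int last) {
        if (first > last) {
            throw new IllegalArgumentException("first must not exceed last");
        }
        return IntStream.rangeClosed(first, last)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> pages = pageRange(2, 5);
        System.out.println(pages);
        // With the Mindee library on the classpath, the list could then be passed on:
        // byte[] merged = PDFUtils.mergePdfPages(new File("document.pdf"), pages);
    }
}
```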
-
mergePdfPages
public static byte[] mergePdfPages(org.apache.pdfbox.pdmodel.PDDocument document, List<Integer> pageNumbers) throws IOException
- Throws:
IOException
-
mergePdfPages
public static byte[] mergePdfPages(org.apache.pdfbox.pdmodel.PDDocument document, List<Integer> pageNumbers, boolean closeOriginal) throws IOException
- Throws:
IOException
-
isPdfEmpty
public static boolean isPdfEmpty(File file) throws IOException
- Throws:
IOException
-
pdfToImages
public static List<PdfPageImage> pdfToImages(String filePath) throws IOException
Render all pages of a PDF as images. Converting PDFs with hundreds of pages may result in a heap space error.
- Parameters:
filePath - The path to the PDF file.
- Returns:
- List of all pages as images.
- Throws:
IOException
-
pdfToImages
public static List<PdfPageImage> pdfToImages(LocalInputSource source) throws IOException
Render all pages of a PDF as images. Converting PDFs with hundreds of pages may result in a heap space error.
- Parameters:
source - The PDF file.
- Returns:
- List of all pages as images.
- Throws:
IOException
-
pdfPageToImage
public static PdfPageImage pdfPageToImage(String filePath, int pageNumber) throws IOException
Render a single page of a PDF as an image. The main use case is processing PDFs with hundreds of pages. If you only need to render some pages of the PDF, use mergePdfPages and then pdfToImages.
- Parameters:
filePath - The path to the PDF file.
pageNumber - The page number to render; the first page is 1.
- Returns:
- The page as an image.
- Throws:
IOException
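The "mergePdfPages and then pdfToImages" suggestion combines a method that returns `byte[]` with one that accepts a file path. One possible bridge, sketched here with only the standard library, is to write the merged bytes to a temporary file. `TempPdfSketch` and `writeToTempPdf` are hypothetical names, and the library may offer a more direct route that this page does not show.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class TempPdfSketch {

    /** Writes the given bytes to a new temporary .pdf file and returns its path. */
    public static Path writeToTempPdf(byte[] pdfBytes) throws IOException {
        Path tempFile = Files.createTempFile("merged-", ".pdf");
        Files.write(tempFile, pdfBytes);
        // Ask the JVM to remove the file on normal exit.
        tempFile.toFile().deleteOnExit();
        return tempFile;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder bytes only; a real caller would pass the output of mergePdfPages.
        byte[] placeholder = "%PDF-placeholder".getBytes(StandardCharsets.US_ASCII);
        Path path = writeToTempPdf(placeholder);
        System.out.println("Wrote " + Files.size(path) + " bytes to " + path);
        // With the Mindee library on the classpath, the flow might look like:
        // byte[] merged = PDFUtils.mergePdfPages(new File("document.pdf"), pages);
        // List<PdfPageImage> images = PDFUtils.pdfToImages(writeToTempPdf(merged).toString());
    }
}
```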
-
pdfPageToImage
public static PdfPageImage pdfPageToImage(LocalInputSource source, int pageNumber) throws IOException
Render a single page of a PDF as an image. The main use case is processing PDFs with hundreds of pages. If you only need to render some pages of the PDF, use mergePdfPages and then pdfToImages.
- Parameters:
source - The PDF file.
pageNumber - The page number to render; the first page is 1.
- Returns:
- The page as an image.
- Throws:
IOException
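For the large-document use case described above, a page-by-page loop keeps only one rendered page referenced at a time instead of collecting all of them in a list. The sketch below is library-independent: `PageRenderer` and `PageHandler` are hypothetical interfaces standing in for a call such as `PDFUtils.pdfPageToImage(filePath, pageNumber)` and for whatever the caller does with each image. It assumes 1-based page numbers, as documented for `pdfPageToImage`.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class PageByPageSketch {

    /** Hypothetical stand-in for a single-page render call. */
    @FunctionalInterface
    public interface PageRenderer<T> {
        T render(int pageNumber) throws IOException;
    }

    /** Hypothetical callback that consumes one rendered page, e.g. saves it to disk. */
    @FunctionalInterface
    public interface PageHandler<T> {
        void handle(int pageNumber, T page) throws IOException;
    }

    /**
     * Renders pages 1..pageCount one at a time and hands each to the handler.
     * No rendered page is retained here after its handler call returns.
     */
    public static <T> int processPages(
            int pageCount, PageRenderer<T> renderer, PageHandler<T> handler) throws IOException {
        if (pageCount < 0) {
            throw new IllegalArgumentException("pageCount must not be negative");
        }
        int processed = 0;
        for (int pageNumber = 1; pageNumber <= pageCount; pageNumber++) {
            T page = renderer.render(pageNumber);
            handler.handle(pageNumber, page);
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) throws IOException {
        // Strings stand in for rendered images so the sketch runs without the library.
        List<String> log = new ArrayList<>();
        PageRenderer<String> renderer = pageNumber -> "image-of-page-" + pageNumber;
        PageHandler<String> handler = (pageNumber, page) -> log.add(page);
        int count = processPages(3, renderer, handler);
        System.out.println(count + " pages handled: " + log);
    }
}
```

In real use, the page count would presumably come from `getNumberOfPages`, and the renderer would be something like `pageNumber -> PDFUtils.pdfPageToImage(filePath, pageNumber)`.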
-
documentToBytes
public static byte[] documentToBytes(org.apache.pdfbox.pdmodel.PDDocument document) throws IOException
- Throws:
IOException
-
extractAndAddText
public static void extractAndAddText(org.apache.pdfbox.pdmodel.PDDocument inputDoc, org.apache.pdfbox.pdmodel.PDPageContentStream contentStream, int pageIndex, boolean disableSourceText) throws IOException
- Throws:
IOException
-
addImageToPage
public static void addImageToPage(org.apache.pdfbox.pdmodel.PDPageContentStream contentStream, org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject pdImage, org.apache.pdfbox.pdmodel.common.PDRectangle pageSize) throws IOException
- Throws:
IOException