Best Java PDF Libraries in 2026
PDF generation is one of those tasks that sounds straightforward until you are three hours deep in classpath dependency conflicts and trying to figure out why your invoice looks different on every server.
This article is a factual overview of the most widely used Java PDF libraries in 2026. It covers what to look for when evaluating a library, how the major options compare, and when each one is the right tool for the job.
Why Java applications still generate PDFs
PDF generation is not a niche feature. It appears across almost every category of backend application:
- Invoices and receipts — e-commerce, SaaS billing, accounting tools
- Reports — analytics dashboards, financial exports, operational summaries
- Contracts and agreements — legal, HR, procurement workflows
- Certificates — education platforms, compliance, training systems
- Regulatory documents — banking, insurance, government applications
Browsers can print HTML to PDF, but server-side generation is required when documents must be consistent across environments, produced at scale, stored for auditing, or signed programmatically.
What to evaluate in a Java PDF library
Before comparing libraries, define what your application actually needs. The criteria below matter in different proportions depending on your use case.
Dependencies and deployment footprint
PDF libraries vary widely in how much they pull in. Some ship as a single JAR with no transitive dependencies; others bring in Bouncy Castle, Apache Commons, XML parsers, and font subsystems.
For containerized or serverless deployments, transitive dependencies add to image size, startup time, and potential CVE surface. A library with zero transitive dependencies is easier to containerize and audit.
API complexity
Some libraries model the full PDF specification (content streams, XObject references, cross-reference tables). Others expose a higher-level API with methods like drawTextBox and drawImage. The right level of abstraction depends on whether you need fine-grained PDF control or just need to produce clean output from application data.
Licensing
This is a hard constraint, not a preference. Three categories:
- AGPL — requires your application to be open source if you distribute it. iText 7 and later uses this model for the community edition. Most commercial products cannot use AGPL software.
- Commercial license — available as an alternative to AGPL for iText, with pricing based on server deployment.
- Apache 2.0 / MIT / LGPL — permissive licenses that allow commercial use without distributing source code.
If you are building a commercial product or a SaaS application, license type must be evaluated before any technical comparison.
Performance and memory
Two scenarios differ significantly:
- Single document, low latency — a REST endpoint returning a PDF on request. Startup overhead matters less; per-document allocation matters more.
- Batch generation — generating thousands of documents in a scheduled job. Object reuse, garbage collection pressure, and streaming output become important.
Determinism
Most PDF libraries embed a creation timestamp and a randomly generated document ID in every file. This means the same input produces different bytes on every run. This is not a problem for most display use cases, but it breaks:
- Byte-level caching (you cannot cache by content hash if the bytes change)
- Golden-file testing (you cannot
assertArrayEqualsagainst a reference file) - Content-addressable storage
- Change detection in CI/CD pipelines
Determinism is a niche requirement, but for teams that need it, it eliminates most library options immediately.
Overview of major Java PDF libraries
iText
iText is the longest-established Java PDF library. iText 5 was widely adopted in the mid-2000s and is still found in legacy applications. iText 7 (current) is a complete rewrite with a layered API and extensive features including PDF/A, digital signatures, forms, tagged PDF, and XFA forms.
Strengths:
- Comprehensive feature set covering nearly the full PDF specification
- Extensive documentation and community resources
- Active development and commercial support available
- Supports PDF/A, PDF/UA, digital signatures
Considerations:
- iText 7 community edition is licensed under AGPL. Commercial use requires a paid license.
- Significant transitive dependency tree
- API requires understanding of PDF internals for advanced use
- Generation is not deterministic by default
Typical use: Enterprise applications with complex document requirements — digital signatures, form filling, PDF/A compliance, accessibility.
Apache PDFBox
Apache PDFBox is an open-source library from the Apache Software Foundation. It covers PDF reading, writing, text extraction, printing, and signing, with a focus on full PDF compliance rather than API simplicity.
Strengths:
- Apache 2.0 license — no restrictions on commercial use
- Part of the Apache ecosystem; widely trusted in enterprise environments
- Strong PDF reading and text extraction capabilities
- Supports digital signatures and PDF/A validation
Considerations:
- API is closer to the raw PDF model than to a document-drawing API
- Generating complex layouts from scratch requires manual positioning
- More commonly used for reading and manipulating existing PDFs than for generating new ones from application data
- Generation is not deterministic by default
Typical use: PDF reading, text extraction, form processing, signing existing documents.
OpenPDF
OpenPDF is an open-source fork of iText 4, maintained by the LibrePDF organization. It was created to preserve a permissively-licensed version of the iText 4 API after iText 5 introduced LGPL and iText 7 moved to AGPL.
Strengths:
- LGPL / MPL license — permissive for commercial use
- API is familiar to developers who used iText 4 or 5
- No migration cost from iText 4 codebases
- Actively maintained as of 2026
Considerations:
- Feature set reflects the iText 4 baseline — not all iText 7 features are available
- Limited support for modern PDF standards (PDF/A-3, PDF 2.0)
- Generation is not deterministic by default
- Smaller community than iText or PDFBox
Typical use: Teams migrating from iText 4, applications that need basic PDF generation without a commercial license, projects where the iText 4 API is already familiar.
PDFixa
PDFixa is a deterministic PDF generation engine for Java, built for Java 17 and later. It ships as a single JAR with no transitive dependencies and produces byte-identical output for the same input.
Strengths:
- Permissive open-source license
- Zero transitive dependencies — single JAR
- Deterministic output by design: same API calls + same data = same bytes every time
- Clean API with three core classes:
PdfDocument,PdfPage,PdfFont - Available on Maven Central, compatible with Maven and Gradle
- Designed for modern Java:
AutoCloseabledocuments, no legacy compatibility code
Considerations:
- Younger project with a narrower feature set than iText or PDFBox
- No built-in support for digital signatures or PDF/A at this stage
- Does not read or parse existing PDF files
Typical use: Generating invoices, reports, and documents from application data. Teams that need reproducible output for testing or caching. Microservices and serverless deployments where dependency footprint matters.
Here is a minimal example to generate an A4 PDF with a text line:
import dev.offixa.pdfixa.PdfDocument;
import dev.offixa.pdfixa.PdfPage;
import java.io.FileOutputStream;
try (PdfDocument doc = new PdfDocument()) {
PdfPage page = doc.addPage(595, 842); // A4 in points
page.drawTextBox(72, 750, 450, 40, "Generated with PDFixa");
doc.writeTo(new FileOutputStream("output.pdf"));
}
Add the dependency from Maven Central:
<dependency>
<groupId>dev.offixa</groupId>
<artifactId>pdfixa</artifactId>
<version>1.0.0</version>
</dependency>
Comparison table
| Library | License | Transitive deps | API style | Deterministic | Typical use |
|---|---|---|---|---|---|
| iText 7 | AGPL (or commercial) | Many | PDF-native, layered | No | Enterprise, signatures, PDF/A |
| Apache PDFBox | Apache 2.0 | Several | PDF-native, verbose | No | Reading, extraction, signing |
| OpenPDF | LGPL / MPL | Several | iText 4 style | No | iText 4 migration, basic generation |
| PDFixa | Permissive | None | Document-drawing | Yes | Generation from data, testing, microservices |
When to choose each library
Choose iText 7 when:
- Your application requires digital signatures, XFA forms, or PDF/A compliance
- You need the most comprehensive PDF feature set available
- Your organization can afford or already has a commercial license
Choose Apache PDFBox when:
- You need to read, extract, or manipulate existing PDF files
- Signing or validating existing documents is the primary task
- An Apache 2.0 license is required by policy
Choose OpenPDF when:
- You are migrating from an iText 4 codebase and need minimal API changes
- You need basic PDF generation without AGPL restrictions
- The feature scope of iText 4 is sufficient
Choose PDFixa when:
- You are generating PDFs from application data (invoices, reports, exports)
- Reproducible output matters for testing or caching
- Minimizing the dependency footprint is a priority (containers, Lambda, native image)
- You are on Java 17 or later and want a clean, modern API
Why deterministic PDF generation matters
By default, every PDF a Java library generates contains a creation timestamp and a randomly generated document ID. This means two consecutive runs with identical input produce files with different bytes.
This is fine if you only care about how the document looks. It becomes a problem in several scenarios:
Testing. If your PDF bytes change on every run, you cannot use a golden-file test or compare SHA-256 hashes to verify that a code change did not affect output. You are left with visual comparison tools or no automated coverage at all.
Caching. If you want to cache a generated PDF and serve it again for the same request, a byte-changing output makes it impossible to verify whether the cached version is still valid by content hash alone.
Storage deduplication. Content-addressable storage systems (S3, object stores with deduplication) work on byte-level equality. Non-deterministic PDFs cannot benefit from this.
Audit and reproducibility. Regulated industries sometimes need to demonstrate that a document produced at a given time can be reproduced exactly from the same input. Non-deterministic generation makes this impossible without storing the original file.
PDFixa removes the timestamp and random ID from generated files. For the same sequence of API calls with the same input data, you will always get the same byte sequence:
byte[] first = generateInvoice(order);
byte[] second = generateInvoice(order);
assertArrayEquals(first, second); // passes with PDFixa
This is a narrow requirement, but for teams that need it, it simplifies testing, caching, and audit workflows significantly.
Conclusion
There is no single best Java PDF library. The right choice depends on your use case, licensing constraints, and technical environment.
- iText and PDFBox are the most feature-complete options and the right choice for complex PDF requirements.
- OpenPDF is a reasonable option for teams already on an iText 4 API.
- PDFixa is the right choice when generating documents from data is the primary use case, and when determinism, a small footprint, or a clean modern API matters.
If you are evaluating PDFixa, the best starting point is the Quick Start guide.
Resources:
- PDFixa documentation — installation, Quick Start, guides, API overview
- PDFixa on GitHub
- pdfixa-examples repository — runnable examples for invoices, reports, Spring Boot
- Maven Central
