Kotlin API Reference
PDF Oxide ships idiomatic Kotlin/JVM bindings (Android-ready) as a thin facade over the fyi.oxide:pdf-oxide Java binding, which owns the single JNI native bridge (the pdf_oxide_jni crate). The Kotlin module adds no native code. It re-exports the Java types (PdfDocument, Pdf, PdfPage, DocumentEditor, PdfSigner, PdfValidator, AutoExtractor, and the geometry / text / table / search value types) and layers Kotlin sugar on top: Optional<T> to T? extension functions, and use { } on the AutoCloseable handles.
// build.gradle.kts
dependencies {
    implementation("fyi.oxide:pdf-oxide-kotlin:0.3.69")
}
The JNI native library (libpdf_oxide_jni) is not bundled. Either load it with System.loadLibrary("pdf_oxide_jni") (ship the .so/.dylib on your java.library.path, or in jniLibs/<abi>/ on Android), or point the Java NativeLoader at it with -Dfyi.oxide.pdf.lib.path=<path>.
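A minimal sketch of the System.loadLibrary route, using only the JDK. The helper name tryLoadNative is hypothetical and not part of PDF Oxide; it turns a missing library into a boolean so that startup code can report a clear message.

```kotlin
// Hypothetical helper (not part of PDF Oxide): tries to load a JNI library by
// its base name and reports success instead of letting the error propagate.
fun tryLoadNative(libName: String): Boolean =
    try {
        System.loadLibrary(libName)
        true
    } catch (e: UnsatisfiedLinkError) {
        // Thrown when the library cannot be found on java.library.path
        // (or in jniLibs/<abi>/ on Android).
        System.err.println("Could not load '$libName': ${e.message}")
        false
    }

fun main() {
    val loaded = tryLoadNative("pdf_oxide_jni")
    println(if (loaded) "native bridge loaded" else "native bridge missing; check java.library.path")
}
```

Calling a helper like this once at startup, before the first PdfDocument.open, keeps the failure close to its cause.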
For the Java API, see the Java API Reference. For the Rust API, see the Rust API Reference. For type details, see Types & Enums.
import fyi.oxide.pdf.Pdf
import fyi.oxide.pdf.PdfDocument
import fyi.oxide.pdf.producerOrNull
Pdf.fromMarkdown("# Hello\n\nbody\n").use { pdf ->
    PdfDocument.open(pdf.save()).use { doc ->
        println(doc.pageCount())
        println(doc.extractText(0))
        println(doc.toMarkdown())
        println(doc.page(0).words().map { it.text() })
        println(doc.producerOrNull() ?: "(no producer)") // Optional -> nullable
    }
}
All handles (PdfDocument, Pdf, DocumentEditor) implement AutoCloseable, so the Kotlin use { } block closes native memory deterministically. Errors raise PdfException (and its subclasses); see Exceptions.
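The lifecycle guarantee can be illustrated without any PDF Oxide dependency. The sketch below uses a hypothetical stand-in class, FakeHandle, in place of a real native-backed handle, and shows that use { } closes the handle even when the block throws.

```kotlin
// Hypothetical stand-in for a native-backed handle; not part of PDF Oxide.
class FakeHandle : AutoCloseable {
    var isOpen = true
        private set
    var closeCalls = 0
        private set

    override fun close() {
        closeCalls++
        isOpen = false // closing again leaves the handle closed
    }
}

// Runs a failing block inside use { } and returns the handle for inspection.
fun closeOnFailure(): FakeHandle {
    val handle = FakeHandle()
    try {
        handle.use { throw IllegalStateException("simulated failure") }
    } catch (e: IllegalStateException) {
        // use { } has already called close() before the exception reaches here.
    }
    return handle
}

fun main() {
    val handle = closeOnFailure()
    println("open after failure: ${handle.isOpen}") // prints "open after failure: false"
}
```

The same shape applies to the real handles: whatever happens inside the block, close() runs exactly once on the way out.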
PdfDocument
The primary read-only entry point to a PDF — open, extract, convert, render, search, and inspect form fields. Instances own native memory and must be closed; use use { }.
import fyi.oxide.pdf.PdfDocument
Factory Methods
PdfDocument.open(path: Path): PdfDocument
Open a PDF from a filesystem path.
PdfDocument.open(path: String): PdfDocument
Open a PDF from a path string.
PdfDocument.open(bytes: ByteArray): PdfDocument
Open a PDF from in-memory bytes (e.g. downloaded from S3 or received over HTTP).
PdfDocument.open(path: Path, password: String): PdfDocument
Open an encrypted PDF from a path with the user or owner password.
PdfDocument.open(path: String, password: String): PdfDocument
Open an encrypted PDF from a path string with a password.
PdfDocument.open(bytes: ByteArray, password: String): PdfDocument
Open an encrypted PDF from bytes with a password.
PdfDocument.open(stream: InputStream): PdfDocument
Open a PDF by reading all bytes from an InputStream.
Static One-Shots
PdfDocument.extractText(path: String): String
PdfDocument.extractText(path: Path): String
Open, extract all text, and close in a single call — for the simple case where you do not need a live handle.
Authentication
doc.authenticate(password: String): Boolean
doc.authenticate(password: ByteArray): Boolean
Authenticate an encrypted document after opening. Returns true if the password matched.
Document Info
doc.pageCount(): Int
Number of pages in the document.
doc.producer(): Optional<String>
doc.creator(): Optional<String>
Document /Producer and /Creator metadata. Use the Kotlin producerOrNull() / creatorOrNull() extensions for null-based access.
val doc.isOpen: Boolean
Whether the native handle is still open (Kotlin property over the Java isOpen() getter).
Text Extraction
doc.extractText(pageIndex: Int): String
Extract plain text from a single zero-indexed page.
doc.extractTextAuto(pageIndex: Int): String
Extract text with automatic strategy selection (falls back to OCR for scanned pages when the OCR feature is available).
doc.extractStructured(page: Int): String
Extract a structured (JSON) representation of the page’s text and layout.
Conversion
doc.toMarkdown(): String
doc.toMarkdown(pageIndex: Int): String
Convert the whole document or a single page to Markdown.
doc.toHtml(): String
doc.toHtml(pageIndex: Int): String
Convert the whole document or a single page to HTML.
Search
doc.search(query: String): List<SearchMatch>
Search the document for a literal string. Returns per-page matches with bounding boxes.
doc.search(query: String, caseInsensitive: Boolean, regex: Boolean, maxResults: Int): List<SearchMatch>
Search with case sensitivity, regex, and a result cap (maxResults = 0 means no cap).
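A short sketch of the four-argument overload, assuming the pdf-oxide-kotlin dependency is on the classpath. The helper name findIgnoringCase and the sample Markdown are illustrative only. Arguments are passed positionally because Kotlin does not accept named arguments when calling Java methods.

```kotlin
import fyi.oxide.pdf.Pdf
import fyi.oxide.pdf.PdfDocument

// Illustrative helper: returns "pageIndex:text" for every case-insensitive match.
fun findIgnoringCase(pdfBytes: ByteArray, query: String): List<String> =
    PdfDocument.open(pdfBytes).use { doc ->
        // Positional arguments: caseInsensitive = true, regex = false, maxResults = 0 (no cap).
        doc.search(query, true, false, 0).map { m -> "${m.pageIndex()}:${m.text()}" }
    }

fun main() {
    // Sample input built in memory so the sketch needs no file on disk.
    val bytes = Pdf.fromMarkdown("# Invoice\n\nInvoice total: 42 EUR\n").use { it.save() }
    findIgnoringCase(bytes, "invoice").forEach(::println)
}
```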
Forms
doc.formFields(): List<FormField>
Get all AcroForm fields with their type, value, widget bounds, and page index. See FormField.
Rendering
doc.render(pageIndex: Int): ByteArray
doc.render(pageIndex: Int, dpi: Int): ByteArray
Render a page to PNG image bytes at the default DPI or a specified DPI.
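As a rough illustration, the sketch below renders the first page and writes it to disk, assuming the pdf-oxide-kotlin dependency is available. The 150 DPI value, the output filename, and the looksLikePng helper are choices made for this example; the helper only checks the standard 8-byte PNG signature.

```kotlin
import fyi.oxide.pdf.Pdf
import fyi.oxide.pdf.PdfDocument
import java.nio.file.Files
import java.nio.file.Paths

// The fixed 8-byte signature that starts every PNG file.
val PNG_SIGNATURE = byteArrayOf(0x89.toByte(), 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)

// Illustrative helper: true when the bytes begin with the PNG signature.
fun looksLikePng(bytes: ByteArray): Boolean =
    bytes.size >= PNG_SIGNATURE.size &&
        bytes.copyOfRange(0, PNG_SIGNATURE.size).contentEquals(PNG_SIGNATURE)

fun main() {
    val pdfBytes = Pdf.fromMarkdown("# Render me\n").use { it.save() }
    PdfDocument.open(pdfBytes).use { doc ->
        val png = doc.render(0, 150) // 150 DPI is an arbitrary choice for this sketch
        check(looksLikePng(png)) { "render() did not return PNG data" }
        Files.write(Paths.get("page-0.png"), png)
    }
}
```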
Page Access
doc.page(index: Int): PdfPage
Get a lazy PdfPage handle for the given zero-based index.
doc.pages(): List<PdfPage>
Get all pages as a list.
doc.pagesStream(): Stream<PdfPage>
Get all pages as a Java Stream for fluent processing.
Lifecycle
doc.close()
Free native memory. Idempotent — a second call is a no-op. Prefer use { }.
PdfPage
A lazy page handle returned by PdfDocument.page(), pages(), or pagesStream(). All accessors dispatch to the parent document on access.
PdfDocument.open(bytes).use { doc ->
    val page = doc.page(0)
    val words = page.words()
    val tables = page.tables()
}
Geometry
page.parent(): PdfDocument
page.index(): Int
page.mediaBox(): BBox
page.cropBox(): BBox
page.width(): Double
page.height(): Double
page.rotation(): Int
Parent document, zero-based index, MediaBox / CropBox rectangles, dimensions in PDF points, and page rotation in degrees.
Content Extraction
page.text(): String
Extract all text on the page.
page.text(region: BBox): String
Extract text within a bounding-box region.
page.words(): List<TextWord>
page.lines(): List<TextLine>
page.chars(): List<TextChar>
Structured text at word, line, and character granularity.
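A small sketch of line-level access, assuming the pdf-oxide-kotlin dependency is on the classpath; describeLines is an illustrative name. Because page handles dispatch to the parent document, the sketch converts everything to plain strings inside the use { } block rather than returning page objects that would outlive the document.

```kotlin
import fyi.oxide.pdf.Pdf
import fyi.oxide.pdf.PdfDocument

// Illustrative helper: summarizes each line as "<word count> words: <line text>".
fun describeLines(pdfBytes: ByteArray, pageIndex: Int): List<String> =
    PdfDocument.open(pdfBytes).use { doc ->
        doc.page(pageIndex).lines().map { line ->
            "${line.words().size} words: ${line.text()}"
        }
    }

fun main() {
    val bytes = Pdf.fromMarkdown("# Title\n\nFirst paragraph of body text.\n").use { it.save() }
    describeLines(bytes, 0).forEach(::println)
}
```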
page.images(): List<ExtractedImage>
page.tables(): List<Table>
page.annotations(): List<Annotation>
Extracted images, detected tables, and page annotations.
Pdf
Create PDFs from source formats, split by bookmarks, and serialize. Implements AutoCloseable.
import fyi.oxide.pdf.Pdf
Factory Methods
Pdf.fromMarkdown(markdown: String): Pdf
Create a PDF from Markdown content.
Pdf.fromHtml(html: String): Pdf
Create a PDF from HTML content.
Pdf.fromImages(images: List<ByteArray>): Pdf
Create a multi-page PDF from a list of image byte arrays, one page per image.
Splitting
pdf.planSplitByBookmarks(opts: SplitByBookmarksOptions): List<BookmarkSegment>
Plan a split by outline bookmarks without producing output — returns the segments (title, page range, filename) that would be created.
pdf.splitByBookmarks(opts: SplitByBookmarksOptions): List<ByteArray>
Split into multiple PDFs by bookmark level. Returns one byte array per segment.
Pdf.planSplitByBookmarksCount(sourcePdf: ByteArray, level: Int): Int
Static helper: count how many segments a bookmark split at the given level would produce.
Pdf.splitByBookmarksFromBytes(sourcePdf: ByteArray, level: Int): Array<ByteArray>
Static helper: split source PDF bytes by bookmark level directly.
Saving
pdf.save(): ByteArray
Serialize the PDF to bytes.
pdf.saveTo(out: Path)
Write the PDF to a file.
val pdf.isOpen: Boolean
pdf.close()
Lifecycle (Kotlin isOpen property and close()). Prefer use { }.
DocumentEditor
Mutating editor for redaction, form filling, metadata scrubbing, and incremental saves. Implements AutoCloseable. Setter methods return this for fluent chaining.
import fyi.oxide.pdf.DocumentEditor
Factory Methods
DocumentEditor.open(path: Path): DocumentEditor
DocumentEditor.open(path: String): DocumentEditor
DocumentEditor.open(bytes: ByteArray): DocumentEditor
Open a document for editing from a path or in-memory bytes.
Form Filling
editor.setFormField(name: String, value: String): DocumentEditor
Set a text/choice field value by fully qualified name.
editor.setFormField(name: String, checked: Boolean): DocumentEditor
Set a checkbox/radio field state by name.
Redaction
editor.addRedaction(pageIndex: Int, region: BBox): DocumentEditor
Queue a redaction over a rectangular region on a page.
editor.redactionCount(pageIndex: Int): Int
editor.redactionCount(): Int
Number of queued redactions on a page, or across the whole document.
editor.applyRedactionsDestructive(): RedactResult
Permanently apply all queued redactions, removing underlying content. Returns a RedactResult with the count applied and oracle-verification status.
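The queue-then-apply flow might look like the sketch below, assuming the pdf-oxide-kotlin dependency is available. The region coordinates, the sample document, and the helper name queueTwoRedactions are illustrative.

```kotlin
import fyi.oxide.pdf.DocumentEditor
import fyi.oxide.pdf.Pdf
import fyi.oxide.pdf.geometry.BBox

// Illustrative helper: queues two non-overlapping regions on page 0
// and returns how many redactions are now queued on that page.
fun queueTwoRedactions(editor: DocumentEditor): Int {
    editor.addRedaction(0, BBox(72.0, 700.0, 272.0, 720.0))
        .addRedaction(0, BBox(72.0, 650.0, 272.0, 670.0))
    return editor.redactionCount(0)
}

fun main() {
    val bytes = Pdf.fromMarkdown("# Confidential\n\nAccount 1234\n").use { it.save() }
    DocumentEditor.open(bytes).use { editor ->
        println("queued: ${queueTwoRedactions(editor)}")
        // Nothing is removed until this call; afterwards the content is gone for good.
        val result = editor.applyRedactionsDestructive()
        println("applied=${result.regionsApplied()} verified=${result.oracleVerified()}")
        val redacted: ByteArray = editor.save()
        println("redacted PDF is ${redacted.size} bytes")
    }
}
```

Checking redactionCount() before applying gives a cheap sanity check that every intended region was queued.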
Metadata
editor.scrubMetadata(): DocumentEditor
Strip document metadata (Info dictionary, XMP) for privacy.
Saving
editor.save(): ByteArray
editor.saveTo(out: Path)
Serialize the edited document with a full rewrite.
editor.saveIncremental(): ByteArray
editor.saveIncrementalTo(out: Path)
Serialize using an incremental update (appends changes, preserving the original bytes).
val editor.isOpen: Boolean
editor.close()
Lifecycle. Prefer use { }.
AutoExtractor
Adaptive extraction pipeline that classifies pages (text-layer vs. scanned), applies OCR where needed, and emits text / Markdown / HTML with confidence scores.
import fyi.oxide.pdf.AutoExtractor
Factory Methods
AutoExtractor.of(doc: PdfDocument): AutoExtractor
AutoExtractor.of(doc: PdfDocument, config: AutoExtractConfig): AutoExtractor
Create an extractor over a document, optionally with a custom AutoExtractConfig.
AutoExtractor.fast(doc: PdfDocument): AutoExtractor
AutoExtractor.balanced(doc: PdfDocument): AutoExtractor
AutoExtractor.highFidelity(doc: PdfDocument): AutoExtractor
Preset configurations trading speed for fidelity.
Extraction
extractor.extractText(): String
extractor.extractTextForPage(pageIndex: Int): String
Plain-text extraction for the whole document or a single page.
extractor.extractDocument(): AutoResult
extractor.extractPage(pageIndex: Int): AutoResult
Full adaptive extraction returning an AutoResult (text, optional Markdown/HTML, reason, confidence, OCR flag, regions).
extractor.extractAutoDocument(): AutoResult
extractor.extractAutoPage(pageIndex: Int): AutoResult
Auto-mode variants of the document- and page-level extraction.
extractor.extractDocumentJson(): String
extractor.extractPageJson(pageIndex: Int): String
Extraction serialized as a JSON string.
Classification
extractor.classifyDocument(): ClassifyResult
extractor.classifyPage(pageIndex: Int): ClassifyResult
Classify the document or a page, returning a ClassifyResult (per-page class plus lists of pages needing OCR, containing charts, or encrypted).
extractor.classifyPageKind(pageIndex: Int): PageClass
extractor.classifyDocumentKinds(): List<PageClass>
Get the PageClass (TEXT_LAYER / SCANNED / MIXED) for a page or all pages.
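A hedged sketch of per-page classification, assuming the pdf-oxide-kotlin dependency is on the classpath. The helper name pageKinds is illustrative, and the fast preset is an arbitrary choice. The enum constants are read through their name property so the sketch does not need to import PageClass.

```kotlin
import fyi.oxide.pdf.AutoExtractor
import fyi.oxide.pdf.Pdf
import fyi.oxide.pdf.PdfDocument

// Illustrative helper: returns the PageClass name of every page, in page order.
fun pageKinds(pdfBytes: ByteArray): List<String> =
    PdfDocument.open(pdfBytes).use { doc ->
        AutoExtractor.fast(doc).classifyDocumentKinds().map { it.name }
    }

fun main() {
    val bytes = Pdf.fromMarkdown("# Sample\n\nText layer only.\n").use { it.save() }
    pageKinds(bytes).forEachIndexed { index, kind -> println("page $index: $kind") }
}
```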
Accessors
extractor.document(): PdfDocument
extractor.config(): AutoExtractConfig
The wrapped document and the active configuration.
MarkdownConverter
Stateless, thread-safe converter from PdfDocument to Markdown or HTML.
import fyi.oxide.pdf.MarkdownConverter
MarkdownConverter.toMarkdown(doc: PdfDocument): String
MarkdownConverter.toMarkdown(doc: PdfDocument, pageIndex: Int): String
MarkdownConverter.toHtml(doc: PdfDocument): String
MarkdownConverter.toHtml(doc: PdfDocument, pageIndex: Int): String
Convert the whole document or a single page to Markdown / HTML.
PdfSigner
Digitally sign and verify PDFs with PKCS#12 keystores (PAdES B-B / B-T / B-LT levels).
import fyi.oxide.pdf.PdfSigner
PdfSigner.fromPkcs12(keystore: Path, password: String): PdfSigner
PdfSigner.fromPkcs12(keystoreBytes: ByteArray, password: String): PdfSigner
Load a signer from a PKCS#12 keystore on disk or in memory.
signer.sign(pdf: ByteArray, opts: SignOptions): ByteArray
Sign PDF bytes with the given SignOptions (level, reason, location, contact, TSA URL). Returns the signed PDF.
signer.verify(pdf: ByteArray): Boolean
Verify all signatures in a PDF. Returns true if every signature is cryptographically valid.
PdfSigner.classifyLevel(pdf: ByteArray): SignatureLevel
Static helper: detect the PAdES conformance level of an existing signed PDF.
PdfValidator
Stateless, thread-safe validation against PDF/A, PDF/X, and PDF/UA conformance levels.
import fyi.oxide.pdf.PdfValidator
PdfValidator.isPdfA(doc: PdfDocument, level: PdfALevel): Boolean
PdfValidator.isPdfUa(doc: PdfDocument, level: PdfUaLevel): Boolean
Quick boolean conformance checks.
PdfValidator.validatePdfA(doc: PdfDocument, level: PdfALevel): ValidationResult
PdfValidator.validatePdfX(doc: PdfDocument, level: PdfXLevel): ValidationResult
PdfValidator.validatePdfUa(doc: PdfDocument, level: PdfUaLevel): ValidationResult
Full validation returning a ValidationResult with the list of violations.
PdfPolicy
Global security-policy controls governing which cryptographic algorithms are permitted.
import fyi.oxide.pdf.PdfPolicy
PdfPolicy.current(): PolicyMode
PdfPolicy.set(mode: PolicyMode)
PdfPolicy.compat(): PolicyMode
PdfPolicy.strict(): PolicyMode
PdfPolicy.fipsStrict(): PolicyMode
Read or set the active PolicyMode, and obtain the built-in compat / strict / FIPS-strict modes.
Kotlin Extensions
The Kotlin facade’s only added surface: Optional<T> to T? converters and the generic orNull() helper. Import from fyi.oxide.pdf.
fun <T : Any> Optional<T>.orNull(): T?
Generic: empty Optional becomes null.
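To show what such a converter does, here is a self-contained sketch that needs only the JDK. The function name orNullSketch is hypothetical and chosen to avoid clashing with the shipped orNull(); the actual implementation in the facade may differ.

```kotlin
import java.util.Optional

// Hypothetical re-implementation for illustration only:
// a present value is returned as-is, an empty Optional becomes null.
fun <T : Any> Optional<T>.orNullSketch(): T? = orElse(null)

fun main() {
    val present: Optional<String> = Optional.of("PDF Oxide")
    val absent: Optional<String> = Optional.empty()
    println(present.orNullSketch() ?: "(none)") // prints "PDF Oxide"
    println(absent.orNullSketch() ?: "(none)")  // prints "(none)"
}
```

Once the value is a T?, the usual Kotlin operators (?., ?:, let) apply directly, which is the point of the extensions listed in this section.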
fun PdfDocument.producerOrNull(): String?
fun PdfDocument.creatorOrNull(): String?
Document /Producer and /Creator, or null if absent.
fun FormField.valueOrNull(): String?
fun FormField.bboxOrNull(): BBox?
Form-field value and widget bounding box, or null.
fun Annotation.contentsOrNull(): String?
fun Annotation.uriOrNull(): String?
Annotation /Contents and link target URI, or null.
fun AutoResult.markdownOrNull(): String?
fun AutoResult.htmlOrNull(): String?
Markdown / HTML rendering of an auto-extraction, or null if not produced.
fun ValidationViolation.pageIndexOrNull(): Int?
Page index a violation applies to, or null for document-level rules.
Geometry Types
BBox
Axis-aligned bounding box in PDF points.
BBox(x0: Double, y0: Double, x1: Double, y1: Double)
| Accessor | Type | Description |
|---|---|---|
| x0(), y0(), x1(), y1() | Double | Corner coordinates |
| width() | Double | x1 - x0 |
| height() | Double | y1 - y0 |
Color
8-bit RGBA color with named constants Color.BLACK, Color.WHITE, Color.TRANSPARENT.
Color(r: Int, g: Int, b: Int, a: Int)
Color(r: Int, g: Int, b: Int) // a = 255
Accessors: r(): Int, g(): Int, b(): Int, a(): Int.
Point
Point(x: Double, y: Double)
Accessors: x(): Double, y(): Double.
Rect
Position-plus-size rectangle.
Rect(x: Double, y: Double, width: Double, height: Double)
Accessors: x(), y(), width(), height() (all Double), and toBBox(): BBox.
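The relationship between the two rectangle forms can be sketched with stand-in types. RectSketch and BBoxSketch below are hypothetical classes written for this example, not the library's Rect and BBox, and the conversion assumes that (x, y) is the corner with the smallest coordinates.

```kotlin
// Hypothetical stand-ins for illustration; the real types are Rect and BBox.
data class BBoxSketch(val x0: Double, val y0: Double, val x1: Double, val y1: Double) {
    val width: Double get() = x1 - x0
    val height: Double get() = y1 - y0
}

data class RectSketch(val x: Double, val y: Double, val width: Double, val height: Double) {
    // Assumption: (x, y) is the minimum corner, so the opposite corner is (x + width, y + height).
    fun toBBox(): BBoxSketch = BBoxSketch(x, y, x + width, y + height)
}

fun main() {
    val box = RectSketch(72.0, 700.0, 200.0, 20.0).toBBox()
    println(box)                            // BBoxSketch(x0=72.0, y0=700.0, x1=272.0, y1=720.0)
    println("${box.width} x ${box.height}") // 200.0 x 20.0
}
```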
Text Types
TextChar
A single extracted character.
TextChar(codepoint: Int, bbox: BBox, confidence: Float)
Accessors: codepoint(): Int, bbox(): BBox, confidence(): Float, asString(): String.
TextWord
TextWord(text: String, bbox: BBox, confidence: Float)
Accessors: text(): String, bbox(): BBox, confidence(): Float.
TextLine
TextLine(text: String, bbox: BBox, words: List<TextWord>)
Accessors: text(): String, bbox(): BBox, words(): List<TextWord>.
TextSpan
A run of identically-styled text.
TextSpan(text: String, bbox: BBox, style: TextStyle)
Accessors: text(): String, bbox(): BBox, style(): TextStyle.
TextStyle
TextStyle(font: String?, size: Double, color: Color, bold: Boolean, italic: Boolean)
Accessors: font(): String?, size(): Double, color(): Color, bold(): Boolean, italic(): Boolean.
Table Types
Table
Table(bbox: BBox, rows: Int, cols: Int, cells: List<TableCell>)
Accessors: bbox(): BBox, rows(): Int, cols(): Int, cells(): List<TableCell>.
TableCell
TableCell(text: String, bbox: BBox, row: Int, col: Int, rowSpan: Int, colSpan: Int)
Accessors: text(): String, bbox(): BBox, row(): Int, col(): Int, rowSpan(): Int, colSpan(): Int.
Search Types
SearchMatch
SearchMatch(pageIndex: Int, bbox: BBox, text: String)
Accessors: pageIndex(): Int, bbox(): BBox, text(): String.
SearchResult
SearchResult(query: String, matches: List<SearchMatch>)
Accessors: query(): String, matches(): List<SearchMatch>, count(): Int, isEmpty(): Boolean.
SearchOptions
Immutable options built via a fluent builder. SearchOptions.DEFAULT is the default instance. Note: not currently wired to PdfDocument.search() — use the caseInsensitive/regex/maxResults overload above instead.
SearchOptions.builder()
    .withCaseSensitive(true)
    .withWholeWord(true)
    .withRegex(false)
    .withMaxResults(50)
    .build()
Accessors: caseSensitive(): Boolean, wholeWord(): Boolean, regex(): Boolean, maxResults(): Optional<Int>. Builder methods: withCaseSensitive(Boolean), withWholeWord(Boolean), withRegex(Boolean), withMaxResults(Int) / withMaxResults(Int?), build().
Form Types
FormField
FormField(name: String, type: FormFieldType, value: String?, bbox: BBox?, pageIndex: Int)
Accessors: name(): String, type(): FormFieldType, value(): Optional<String>, bbox(): Optional<BBox>, pageIndex(): Int. Use valueOrNull() / bboxOrNull() for null-based access.
Annotation Types
Annotation
Annotation(type: AnnotationType, pageIndex: Int, bbox: BBox, contents: String?, uri: String?)
Accessors: type(): AnnotationType, pageIndex(): Int, bbox(): BBox, contents(): Optional<String>, uri(): Optional<String>. Use contentsOrNull() / uriOrNull() for null-based access.
Image Types
ExtractedImage
ExtractedImage(bytes: ByteArray, format: ImageFormat, bbox: BBox, width: Int, height: Int)
Accessors: bytes(): ByteArray, format(): ImageFormat, bbox(): BBox, width(): Int, height(): Int.
Auto-Extraction Types
AutoResult
Result of an adaptive extraction.
result.text(): String
result.markdown(): Optional<String>
result.html(): Optional<String>
result.reason(): ExtractReason
result.confidence(): Double
result.ocrUsed(): Boolean
result.regions(): List<RegionResult>
result.pagesNeedingOcr(): List<Int>
Use markdownOrNull() / htmlOrNull() for null-based access to the rendered output.
RegionResult
Per-region extraction detail within an AutoResult.
region.pageIndex(): Int
region.bbox(): BBox
region.text(): String
region.reason(): ExtractReason
region.confidence(): Double
region.ocrUsed(): Boolean
region.table(): Optional<Table>
ClassifyResult
result.pages(): List<PageClass>
result.pagesNeedingOcr(): List<Int>
result.pagesWithChart(): List<Int>
result.pagesEncrypted(): List<Int>
AutoExtractConfig
Immutable configuration built via a fluent builder; AutoExtractConfig.DEFAULT is the default. Convert an existing config back to a builder with toBuilder().
AutoExtractConfig.builder()
    .withMode(ExtractMode.AUTO)
    .withForceOcrPages(listOf(2, 5))
    .withMinOcrConfidence(0.6)
    .withOcrLanguages("eng", "deu")
    .withPasswords("secret")
    .withTopMarginFraction(0.05)
    .withBottomMarginFraction(0.05)
    .withAllowSingleColumnTables(true)
    .withOcrInlineImages(false)
    .withCancelToken("token-id")
    .build()
Accessors return Optional<...> for each field: mode(), forceOcrPages(), minOcrConfidence(), ocrLanguages(), passwords(), topMarginFraction(), bottomMarginFraction(), allowSingleColumnTables(), ocrInlineImages(), cancelToken(). Builder setters accept both boxed-nullable and primitive overloads (e.g. withMinOcrConfidence(Double?) and withTopMarginFraction(double)), plus the withOcrLanguages(vararg String) / withPasswords(vararg String) varargs forms.
Compliance Types
ValidationResult
ValidationResult(valid: Boolean, violations: List<ValidationViolation>)
Accessors: valid(): Boolean, violations(): List<ValidationViolation>.
ValidationViolation
ValidationViolation(ruleId: String, description: String, pageIndex: Int?)
Accessors: ruleId(): String, description(): String, pageIndex(): Optional<Int>. Use pageIndexOrNull() for null-based access.
Metadata Types
DocumentInfo
DocumentInfo(/* title, author, subject, keywords, creator, producer, creationDate, modificationDate */)
Accessors all return Optional<String>: title(), author(), subject(), keywords(), creator(), producer(), creationDate(), modificationDate().
XmpMetadata
Raw XMP packet. XmpMetadata.EMPTY is the empty instance.
XmpMetadata(xml: String)
Accessors: xml(): String, isEmpty(): Boolean.
Security & Redaction Types
SecurityPolicy
Immutable policy built via a fluent builder.
SecurityPolicy.builder()
    .withMode(PolicyMode.STRICT)
    .allow("algorithm-id")
    .deny("algorithm-id")
    .build()
Accessors: mode(): PolicyMode, additionalAllow(): List<String>, additionalDeny(): List<String>. Builder methods: withMode(PolicyMode), allow(String), deny(String), build().
RedactResult
RedactResult(regionsApplied: Int, oracleVerified: Boolean)
Accessors: regionsApplied(): Int, oracleVerified(): Boolean.
Signature Types
SignOptions
Immutable signing options built via a fluent builder.
SignOptions.builder()
    .withLevel(SignatureLevel.B_T)
    .withReason("Approved")
    .withLocation("HQ")
    .withContactInfo("ops@example.com")
    .withTsaUrl("https://freetsa.org/tsr")
    .build()
Accessors: level(): SignatureLevel, reason(): Optional<String>, location(): Optional<String>, contactInfo(): Optional<String>, tsaUrl(): Optional<String>. Builder methods: withLevel, withReason, withLocation, withContactInfo, withTsaUrl, build().
Split Types
BookmarkSegment
BookmarkSegment(title: String, firstPage: Int, lastPage: Int, filename: String)
Accessors: title(): String, firstPage(): Int, lastPage(): Int, filename(): String.
SplitByBookmarksOptions
Immutable options built via a fluent builder.
SplitByBookmarksOptions.builder()
    .withLevel(1)
    .withFilenamePrefix("chapter-")
    .build()
Accessors: level(): Int, filenamePrefix(): Optional<String>. Builder methods: withLevel(Int), withFilenamePrefix(String?), build().
Enums
| Enum | Values |
|---|---|
| FormFieldType | TEXT, CHECKBOX, RADIO, CHOICE |
| AnnotationType | HIGHLIGHT, TEXT, LINK, STAMP, UNDERLINE, STRIKEOUT, SQUIGGLY, FREE_TEXT, LINE, SQUARE, CIRCLE, FILE_ATTACHMENT |
| ImageFormat | JPEG, PNG, CCITT, RAW |
| ExtractMode | TEXT_ONLY, AUTO |
| ExtractReason | OK, SCANNED_NO_TEXT_LAYER, GLYPH_MAPPING_MISSING, ENCRYPTED_NO_EXTRACT_PERMISSION, IMAGE_TABLE_NO_STRUCTURE, CHART_NOT_TRANSCRIBED, OCR_REQUESTED_BUT_UNAVAILABLE, OCR_LOW_CONFIDENCE, EMPTY |
| PageClass | TEXT_LAYER, SCANNED, MIXED |
| PixelFormat | RGBA_8888, RGB_888, GRAY_8, PNG |
| PolicyMode | COMPAT, STRICT |
| SignatureLevel | B_B, B_T, B_LT |
| PdfALevel | A_1B, A_1A, A_2B, A_2A, A_2U, A_3B, A_3A, A_3U, A_4, A_4E, A_4F |
| PdfXLevel | X_1A_2001, X_1A_2003, X_3_2002, X_3_2003, X_4, X_4P, X_5G, X_5N, X_5PG, X_6, X_6P, X_6N |
| PdfUaLevel | UA_1, UA_2 (each exposes code(): Int) |
| PdfErrorKind | PARSE, ENCRYPTED, PERMISSION, IO, OCR_UNAVAILABLE, SIGNATURE, INVALID_STATE, UNSUPPORTED, OTHER |
Exceptions
All failures raise PdfException (an unchecked exception) or one of its kind-specific subclasses. The kind() accessor returns a PdfErrorKind.
import fyi.oxide.pdf.exception.PdfException
try {
    PdfDocument.open(bytes).use { doc ->
        println(doc.extractText(0))
    }
} catch (e: PdfException) {
    println("PDF error [${e.kind()}]: ${e.message}")
}
PdfException(message: String)
PdfException(kind: PdfErrorKind, message: String)
PdfException(kind: PdfErrorKind, message: String, cause: Throwable)
e.kind(): PdfErrorKind
| Exception | Cause |
|---|---|
| PdfParseException | Malformed or corrupt PDF |
| PdfEncryptedException | Encrypted document opened without a valid password |
| PdfPermissionException | Operation blocked by document permissions |
| PdfIoException | Underlying I/O failure |
| PdfOcrUnavailableException | OCR requested but the ocr feature is not built in |
| PdfSignatureException | Signing or signature-verification failure |
| PdfInvalidStateException | Operation invalid for the current handle state |
| PdfUnsupportedException | Unsupported feature or format |
Complete Example
import fyi.oxide.pdf.AutoExtractor
import fyi.oxide.pdf.DocumentEditor
import fyi.oxide.pdf.Pdf
import fyi.oxide.pdf.PdfDocument
import fyi.oxide.pdf.geometry.BBox
import fyi.oxide.pdf.producerOrNull
// --- Creation ---
val bytes = Pdf.fromMarkdown("# Report\n\nGenerated by PDF Oxide.").use { it.save() }
// --- Extraction ---
PdfDocument.open(bytes).use { doc ->
    println("Pages: ${doc.pageCount()}")
    println("Producer: ${doc.producerOrNull() ?: "(none)"}")
    val page = doc.page(0)
    println("Words: ${page.words().map { it.text() }}")
    println("Tables: ${page.tables().size}")
    // Search with options. Arguments are positional because Kotlin does not allow
    // named arguments on Java methods: caseInsensitive, regex, maxResults (0 = no cap).
    val matches = doc.search("Report", true, false, 0)
    matches.forEach { m -> println("p${m.pageIndex()} '${m.text()}' @ ${m.bbox()}") }
    // Adaptive extraction
    val result = AutoExtractor.balanced(doc).extractDocument()
    println("confidence=${result.confidence()} ocr=${result.ocrUsed()}")
}
// --- Editing: redact + fill forms ---
DocumentEditor.open(bytes).use { editor ->
    editor.setFormField("name", "Jane Doe")
        .addRedaction(0, BBox(72.0, 700.0, 272.0, 720.0))
        .scrubMetadata()
    val redaction = editor.applyRedactionsDestructive()
    println("Redacted ${redaction.regionsApplied()} regions")
    val out: ByteArray = editor.save()
}
Other Language Bindings
PDF Oxide also ships bindings for Rust, Python, Node.js, WASM, C#, Go, Java, PHP, Ruby, C++, Swift, Dart, R, Julia, Zig, Scala, Clojure, Objective-C, and Elixir.
Next Steps
- Types & Enums — all shared types and enums
- Page API Reference — consistent per-page iteration across bindings
- Getting Started with Kotlin — tutorial