JavaScript API Reference
PDF Oxide provides WebAssembly bindings for JavaScript and TypeScript. The npm package pdf-oxide-wasm works in Node.js, browsers, bundlers, Deno, and Cloudflare Workers.
npm install pdf-oxide-wasm
Multi-target packaging (v0.3.38)
pdf-oxide-wasm now ships three builds side by side with package.json conditional exports. Pick the subpath that matches your runtime — the auto-routed top-level import also resolves correctly through the exports field for most environments.
| Subpath | Target |
|---|---|
| pdf-oxide-wasm/nodejs | Node.js (CommonJS + ESM) |
| pdf-oxide-wasm/bundler | Vite, webpack, Rollup, esbuild, Bun |
| pdf-oxide-wasm/web | Browsers, Deno, Cloudflare Workers |
// Node.js
import { WasmPdfDocument } from "pdf-oxide-wasm/nodejs";
// Vite / webpack / Rollup
import init, { WasmPdfDocument } from "pdf-oxide-wasm/bundler";
await init();
// Browsers / Deno / Workers
import init, { WasmPdfDocument } from "pdf-oxide-wasm/web";
await init();
This fixes the ReferenceError: Can't find variable: __dirname thrown under browser bundlers prior to v0.3.38.
For the Rust API, see the Rust API Reference. For the Python API, see the Python API Reference. For type details, see Types & Enums.
Some methods are gated behind Rust build features (rendering, signatures, barcodes, ocr-tract). The default pdf-oxide-wasm package enables the common set; OCR ships in the separate wasm-ocr build. See Feature Availability.
Module Functions
Free functions exported at the top level of the package.
import {
setLogLevel, disableLogging,
generateBarcodeSvg, generateQrSvg,
planSplitByBookmarks, splitByBookmarks,
setCryptoPolicy, cryptoPolicy, cryptoInventory, cryptoCbom,
modelManifest, prefetchAvailable,
signPdfBytes, signPdfBytesPades, hasDocumentTimestamp,
} from "pdf-oxide-wasm";
Logging
setLogLevel(level) // Set log verbosity: "off" | "error" | "warn" | "info" | "debug" | "trace"
disableLogging() // Silence all log output
Barcodes
generateBarcodeSvg(barcodeType, data) -> string // 1D barcode as SVG; type 0–7 (Code128, Code39, Ean13, Ean8, UpcA, Itf, Code93, Codabar)
generateQrSvg(data, errorCorrection, size) -> string // QR code as SVG; errorCorrection 0=Low 1=Medium 2=Quartile 3=High
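The numeric barcode types read more clearly behind named constants. The following is a minimal sketch, assuming the codes 0–7 follow the order in which the symbologies are listed above; `BARCODE_TYPE` and `barcodeTypeOf` are hypothetical helpers, not part of the package.

```javascript
// Hypothetical lookup table. The numeric values assume the listed order
// (Code128 = 0 ... Codabar = 7), which the reference implies but does not state.
const BARCODE_TYPE = Object.freeze({
  Code128: 0, Code39: 1, Ean13: 2, Ean8: 3,
  UpcA: 4, Itf: 5, Code93: 6, Codabar: 7,
});

// Resolve a symbology name to its numeric code, failing loudly on typos.
function barcodeTypeOf(name) {
  if (!Object.prototype.hasOwnProperty.call(BARCODE_TYPE, name)) {
    throw new RangeError(`Unknown barcode type: ${name}`);
  }
  return BARCODE_TYPE[name];
}

// Intended use (requires the package):
// const svg = generateBarcodeSvg(barcodeTypeOf("Ean13"), "4006381333931");
```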
Split by Bookmarks
planSplitByBookmarks(srcBytes, titlePrefix, ignoreCase, level, includeFrontMatter) -> Array // Plan a split without producing PDFs; returns segment descriptors
splitByBookmarks(srcBytes, titlePrefix, ignoreCase, level, includeFrontMatter) -> Array // Split at bookmark boundaries; returns [segment, bytes] pairs (level 0=all depths, 1=top-level)
Crypto Governance
setCryptoPolicy(spec) // Install the process-wide crypto policy ("compat" | "strict" | "fips-strict"[;…]); fail-closed
cryptoPolicy() -> string // The active crypto policy as its canonical grammar string
cryptoInventory() -> string[] // Algorithm tokens exercised so far this process
cryptoCbom() -> string // CycloneDX 1.6 Cryptographic Bill of Materials (JSON string)
OCR Model Provisioning
modelManifest() -> string // JSON manifest of OCR detector/recognizer cache filenames and source URLs (host-side fetch)
prefetchAvailable() -> boolean // Whether this build can download OCR models to a local cache (always false in WASM)
Signing (free functions)
signPdfBytes(pdfData, cert, reason?, location?) -> Uint8Array // Sign raw PDF bytes with a WasmCertificate; returns the signed PDF
signPdfBytesPades(pdfData, cert, level, timestampToken?, revocation?, reason?, location?) -> Uint8Array // Sign at a PAdES baseline level (BB/BT/BLt); pass a pre-fetched RFC 3161 token for BT/BLt
hasDocumentTimestamp(pdfData) -> boolean // Whether the PDF carries a document-scoped /DocTimeStamp (PAdES-B-LTA)
WasmPdfDocument
The primary class for opening, extracting, editing, and saving PDFs.
import { WasmPdfDocument } from "pdf-oxide-wasm";
Constructor
new WasmPdfDocument(data, password?)
Load a PDF document from raw bytes.
| Parameter | Type | Description |
|---|---|---|
| data | Uint8Array | The PDF file contents |
| password | string \| undefined | Optional password for encrypted PDFs |
Throws: Error if the PDF is invalid or cannot be parsed.
import { readFileSync } from "node:fs";
const bytes = new Uint8Array(readFileSync("document.pdf"));
const doc = new WasmPdfDocument(bytes);
Static Constructors
WasmPdfDocument.openFromDocxBytes(data) -> WasmPdfDocument // Convert DOCX bytes to a PDF document
WasmPdfDocument.openFromPptxBytes(data) -> WasmPdfDocument // Convert PPTX bytes to a PDF document
WasmPdfDocument.openFromXlsxBytes(data) -> WasmPdfDocument // Convert XLSX bytes to a PDF document
Core Read-Only
pageCount() -> number
Get the number of pages in the document.
version() -> Uint8Array
Get the PDF version as [major, minor].
const [major, minor] = doc.version();
console.log(`PDF ${major}.${minor}`);
authenticate(password) -> boolean
Decrypt an encrypted PDF. Returns true if authentication succeeded.
| Parameter | Type | Description |
|---|---|---|
| password | string | The password string |
hasStructureTree() -> boolean
Check if the document is a Tagged PDF with a structure tree.
Signature inspection
signatureCount() -> number // Number of digital signatures in the document
signatures() -> WasmSignature[] // Parsed signatures (signer, reason, time, verify())
dss() -> Dss | null // Document Security Store (certs/CRLs/OCSP), or null
Text Extraction
extractText(pageIndex, region?) -> string
Extract plain text from a single page. Pass an optional [x, y, w, h] region to limit extraction.
| Parameter | Type | Description |
|---|---|---|
| pageIndex | number | Zero-based page number |
| region | number[] \| undefined | Optional [x, y, width, height] clip |
const text = doc.extractText(0);
extractAllText() -> string
Extract plain text from all pages, separated by form feed characters.
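Because pages are joined with form feed characters, the combined string can be split back into per-page text. A minimal sketch; `splitPages` is a hypothetical helper, and whether a trailing form feed follows the last page is not stated here, so the final entry may be empty.

```javascript
// Split the combined text on form feed (U+000C), the documented page separator.
function splitPages(allText) {
  return allText.split("\f");
}

// Intended use (requires the package):
// const pages = splitPages(doc.extractAllText());
```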
extractStructured(pageIndex) -> string
Extract a structured JSON representation of the page (blocks, lines, styling).
extractChars(pageIndex, region?) -> Array
Extract individual characters with precise positioning and font metadata.
| Parameter | Type | Description |
|---|---|---|
| pageIndex | number | Zero-based page number |
| region | number[] \| undefined | Optional [x, y, width, height] clip |
Returns: Array of objects with fields:
| Field | Type | Description |
|---|---|---|
| char | string | The character |
| bbox | {x, y, width, height} | Bounding box |
| fontName | string | Font name |
| fontSize | number | Font size in points |
| fontWeight | string | Weight (Normal, Bold, etc.) |
| isItalic | boolean | Italic flag |
| color | {r, g, b} | RGB color (0.0–1.0) |
const chars = doc.extractChars(0);
for (const c of chars) {
console.log(`'${c.char}' at (${c.bbox.x}, ${c.bbox.y})`);
}
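Per-character boxes can be combined into one enclosing box, for example to outline a word or a selection. A minimal sketch, assuming `bbox.x` / `bbox.y` mark the corner with the smallest coordinates and that `width` / `height` are non-negative; `unionBbox` is a hypothetical helper.

```javascript
// Compute the smallest box that encloses every character's bbox.
// Returns null for an empty input.
function unionBbox(chars) {
  if (chars.length === 0) return null;
  let minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
  for (const { bbox } of chars) {
    minX = Math.min(minX, bbox.x);
    minY = Math.min(minY, bbox.y);
    maxX = Math.max(maxX, bbox.x + bbox.width);
    maxY = Math.max(maxY, bbox.y + bbox.height);
  }
  return { x: minX, y: minY, width: maxX - minX, height: maxY - minY };
}

// Intended use (requires the package):
// const box = unionBbox(doc.extractChars(0));
```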
extractPageText(pageIndex, readingOrder?) -> object
Get spans, characters, and page dimensions from a single extraction pass. More efficient than calling extractSpans() + extractChars() separately. Pass "column_aware" for multi-column PDFs.
| Parameter | Type | Description |
|---|---|---|
| pageIndex | number | Zero-based page number |
| readingOrder | string \| undefined | "column_aware" or "top_to_bottom" (default) |
Returns: An object with fields:
| Field | Type | Description |
|---|---|---|
| spans | Array | Array of span objects |
| chars | Array | Array of character objects |
| pageWidth | number | Page width in PDF points |
| pageHeight | number | Page height in PDF points |
| text | string | Full text content |
const result = doc.extractPageText(0);
console.log(`Page: ${result.pageWidth}x${result.pageHeight} pt`);
for (const span of result.spans) {
console.log(`'${span.text}' font=${span.fontName} size=${span.fontSize}`);
}
extractSpans(pageIndex, region?, readingOrder?) -> Array
Extract styled text spans with font metadata. Pass "column_aware" as readingOrder for multi-column PDFs.
| Parameter | Type | Description |
|---|---|---|
| pageIndex | number | Zero-based page number |
| region | number[] \| undefined | Optional [x, y, width, height] clip |
| readingOrder | string \| undefined | "column_aware" or "top_to_bottom" (default) |
Returns: Array of objects with fields:
| Field | Type | Description |
|---|---|---|
| text | string | The text content |
| bbox | {x, y, width, height} | Bounding box |
| fontName | string | Font name |
| fontSize | number | Font size in points |
| fontWeight | string | Weight (Normal, Bold, etc.) |
| isItalic | boolean | Italic flag |
| isMonospace | boolean | Whether the font is fixed-width |
| charWidths | number[] | Per-glyph advance widths |
| color | {r, g, b} | RGB color (0.0–1.0) |
const spans = doc.extractSpans(0);
for (const span of spans) {
console.log(`"${span.text}" size=${span.fontSize}`);
}
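Span font metadata supports simple heuristics, such as treating the largest text on a page as a title candidate. A minimal sketch of that idea; `largestSpans` is a hypothetical helper and the heuristic is an illustration, not how the library detects headings.

```javascript
// Return every span set in the largest font size found in the input.
function largestSpans(spans) {
  if (spans.length === 0) return [];
  const maxSize = Math.max(...spans.map((s) => s.fontSize));
  return spans.filter((s) => s.fontSize === maxSize);
}

// Intended use (requires the package):
// const titleCandidates = largestSpans(doc.extractSpans(0));
```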
Words, Lines, Tables
extractWords(pageIndex, region?) -> Array // Word-level boxes with text + font metadata
extractTextLines(pageIndex, region?) -> Array // Line-level boxes, each with its words
extractTables(pageIndex, region?) -> Array // Detected tables with rows/cells (text + bboxes)
Header / Footer Artifacts
Detect and remove or erase running headers, footers, and page-furniture artifacts.
removeHeaders(threshold) -> number // Remove detected headers across the document; returns count removed
removeFooters(threshold) -> number // Remove detected footers; returns count removed
removeArtifacts(threshold) -> number // Remove detected page artifacts; returns count removed
eraseHeader(pageIndex) // Queue an erase of the header region on a page
editHeader(pageIndex) // Mark the header region for editing on a page
eraseFooter(pageIndex) // Queue an erase of the footer region on a page
editFooter(pageIndex) // Mark the footer region for editing on a page
eraseArtifacts(pageIndex) // Queue an erase of detected artifacts on a page
Region Extraction
within(pageIndex, region) -> WasmPdfPageRegion
Scope subsequent extraction to a rectangular region of a page. region is [x, y, width, height]. See WasmPdfPageRegion.
const region = doc.within(0, [50, 600, 400, 150]);
const text = region.extractText();
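Regions use [x, y, width, height], while the page box methods in this reference use corner coordinates [llx, lly, urx, ury]. A minimal conversion sketch, assuming a region's x and y refer to the lower-left corner in the same coordinate space as the page boxes (not confirmed here); `boxToRegion` is a hypothetical helper.

```javascript
// Convert corner coordinates [llx, lly, urx, ury] to [x, y, width, height].
function boxToRegion([llx, lly, urx, ury]) {
  return [llx, lly, urx - llx, ury - lly];
}

// Intended use (requires the package):
// const whole = doc.within(0, boxToRegion(doc.pageMediaBox(0)));
```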
Format Conversion
toMarkdown(pageIndex, detectHeadings?, includeImages?, includeFormFields?) -> string
Convert a single page to Markdown.
| Parameter | Type | Default | Description |
|---|---|---|---|
| pageIndex | number | – | Zero-based page number |
| detectHeadings | boolean | true | Detect headings from font size |
| includeImages | boolean | true | Include images |
| includeFormFields | boolean | true | Include form field values |
toMarkdownAll(detectHeadings?, includeImages?, includeFormFields?) -> string
Convert all pages to Markdown.
toHtml(pageIndex, preserveLayout?, detectHeadings?, includeFormFields?) -> string
Convert a single page to HTML.
| Parameter | Type | Default | Description |
|---|---|---|---|
| pageIndex | number | – | Zero-based page number |
| preserveLayout | boolean | false | Preserve visual layout |
| detectHeadings | boolean | true | Detect headings |
| includeFormFields | boolean | true | Include form field values |
toHtmlAll(preserveLayout?, detectHeadings?, includeFormFields?) -> string
Convert all pages to HTML.
toPlainText(pageIndex) -> string
Convert a single page to plain text.
toPlainTextAll() -> string
Convert all pages to plain text.
Office round-trip
toDocxBytes() -> Uint8Array // Export the document as a DOCX file
toPptxBytes() -> Uint8Array // Export the document as a PPTX file
toXlsxBytes() -> Uint8Array // Export the document as an XLSX file
Search
search(pattern, caseInsensitive?, literal?, wholeWord?, maxResults?) -> Array
Search for text across all pages.
| Parameter | Type | Default | Description |
|---|---|---|---|
| pattern | string | – | Search pattern (string or regex) |
| caseInsensitive | boolean | false | Case-insensitive search |
| literal | boolean | false | Treat pattern as literal string |
| wholeWord | boolean | false | Match whole words only |
| maxResults | number | 0 | Maximum results (0 = unlimited) |
Returns: Array of objects with fields:
| Field | Type | Description |
|---|---|---|
| page | number | Page number |
| text | string | Matched text |
| bbox | object | Bounding box |
| startIndex | number | Start index in page text |
| endIndex | number | End index in page text |
searchPage(pageIndex, pattern, caseInsensitive?, literal?, wholeWord?, maxResults?) -> Array
Search for text within a single page.
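Results from search() arrive as one flat array, so grouping by the page field is a common follow-up step. A minimal sketch using only the documented result fields; `groupByPage` is a hypothetical helper.

```javascript
// Group search hits by their page field, preserving the original hit order.
function groupByPage(results) {
  const byPage = new Map();
  for (const hit of results) {
    if (!byPage.has(hit.page)) byPage.set(hit.page, []);
    byPage.get(hit.page).push(hit);
  }
  return byPage;
}

// Intended use (requires the package):
// const hits = groupByPage(doc.search("invoice", true));
```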
Image Info
extractImages(pageIndex) -> Array
Get image metadata for a page.
| Field | Type | Description |
|---|---|---|
| width | number | Image width in pixels |
| height | number | Image height in pixels |
| colorSpace | string | Color space (e.g. DeviceRGB) |
| bitsPerComponent | number | Bits per color channel |
| bbox | object | Position on page |
extractImageBytes(pageIndex) -> Array
Extract raw image bytes from a page. Returns an array of objects:
| Field | Type | Description |
|---|---|---|
| width | number | Image width in pixels |
| height | number | Image height in pixels |
| data | Uint8Array | Raw image bytes |
| format | string | Image format |
pageImages(pageIndex) -> Array
Get image names and bounds for positioning operations.
| Field | Type | Description |
|---|---|---|
| name | string | XObject name |
| bounds | number[] | [x, y, width, height] |
| matrix | number[] | Transform matrix [a, b, c, d, e, f] |
Vector Content
extractPaths(pageIndex, region?) -> Array // Vector paths (lines, curves, shapes) on a page
extractRects(pageIndex, region?) -> Array // Axis-aligned rectangles detected from path segments
extractLines(pageIndex, region?) -> Array // Straight line segments detected from path data
Document Structure
getOutline() -> Array | null
Get document bookmarks / table of contents. Returns null if no outline exists.
getAnnotations(pageIndex) -> Array
Get annotation metadata (type, rect, contents, etc.) for a page.
pageLabels() -> Array
Get page label ranges. Returns an array of objects:
| Field | Type | Description |
|---|---|---|
| startPage | number | First page in this range |
| style | string | Numbering style |
| prefix | string | Label prefix |
| startValue | number | Starting number |
xmpMetadata() -> object | null
Get XMP metadata. Returns null if not present. Object fields include:
| Field | Type | Description |
|---|---|---|
| dcTitle | string \| null | Document title |
| dcCreator | string[] \| null | Creator list |
| dcDescription | string \| null | Description |
| xmpCreatorTool | string \| null | Creator tool |
| xmpCreateDate | string \| null | Creation date |
| xmpModifyDate | string \| null | Modification date |
| pdfProducer | string \| null | PDF producer |
Form Fields
getFormFields() -> Array
Get all form fields with name, type, value, and flags.
| Field | Type | Description |
|---|---|---|
| name | string | Field name |
| fieldType | string | Field type (text, checkbox, etc.) |
| value | string | Current value |
| flags | number | Field flags |
const fields = doc.getFormFields();
for (const f of fields) {
console.log(`${f.name} (${f.fieldType}) = ${f.value}`);
}
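For serializing or comparing form state, the field array can be flattened into a plain name-to-value object. A minimal sketch using the documented name and value fields; `fieldsToObject` is a hypothetical helper, and if two fields share a name the later one wins.

```javascript
// Build a { fieldName: value } object from the getFormFields() result shape.
function fieldsToObject(fields) {
  return Object.fromEntries(fields.map((f) => [f.name, f.value]));
}

// Intended use (requires the package):
// const values = fieldsToObject(doc.getFormFields());
```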
hasXfa() -> boolean
Check if the document contains XFA forms.
getFormFieldValue(name) -> any
Get a form field value by name. Returns a string, boolean, or null depending on the field type.
setFormFieldValue(name, value) -> void
Set a form field value by name.
| Parameter | Type | Description |
|---|---|---|
| name | string | Field name |
| value | string \| boolean | New field value |
exportFormData(format?) -> Uint8Array
Export form data as FDF (default) or XFDF.
| Parameter | Type | Default | Description |
|---|---|---|---|
| format | string | "fdf" | Export format: "fdf" or "xfdf" |
Form flattening
flattenForms() // Flatten all form fields into page content
flattenFormsOnPage(pageIndex) // Flatten forms on a specific page
flattenWarnings() -> string[] // Warnings produced by the last flatten operation
Editing
Metadata
| Method | Parameters | Description |
|---|---|---|
| setTitle(title) | string | Set document title |
| setAuthor(author) | string | Set document author |
| setSubject(subject) | string | Set document subject |
| setKeywords(keywords) | string | Set document keywords |
Page Rotation
| Method | Parameters | Description |
|---|---|---|
| pageRotation(pageIndex) | number | Get current rotation (0, 90, 180, 270) |
| setPageRotation(pageIndex, degrees) | number, number | Set absolute rotation |
| rotatePage(pageIndex, degrees) | number, number | Add to current rotation |
| rotateAllPages(degrees) | number | Rotate all pages |
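The difference between absolute and relative rotation comes down to modular arithmetic on the current value. A minimal sketch of that arithmetic; `nextRotation` is a hypothetical helper, and how the library itself treats negative values or angles that are not multiples of 90 is not documented here.

```javascript
// Compute the rotation that results from adding delta to current,
// normalized into the range 0..359 (negative deltas wrap around).
function nextRotation(current, delta) {
  return (((current + delta) % 360) + 360) % 360;
}

// Intended use (requires the package): predict the result of a relative rotation.
// const expected = nextRotation(doc.pageRotation(0), 90);
```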
Page Dimensions
| Method | Parameters | Description |
|---|---|---|
| pageMediaBox(pageIndex) | number | Get MediaBox [llx, lly, urx, ury] |
| setPageMediaBox(pageIndex, llx, lly, urx, ury) | number, ... | Set MediaBox |
| pageCropBox(pageIndex) | number | Get CropBox (may be null) |
| setPageCropBox(pageIndex, llx, lly, urx, ury) | number, ... | Set CropBox |
| cropMargins(left, right, top, bottom) | number, ... | Crop all page margins |
Page Operations
deletePage(index) // Delete a page by index
movePage(fromIndex, toIndex) // Move a page to a new position
extractPages(pages) -> Uint8Array // Build a new PDF from the given page indices
Erase / Whiteout
| Method | Parameters | Description |
|---|---|---|
| eraseRegion(pageIndex, llx, lly, urx, ury) | number, ... | Erase a region |
| eraseRegions(pageIndex, rects) | number, Float32Array | Erase multiple regions |
| clearEraseRegions(pageIndex) | number | Clear pending erases |
Annotations & Redaction
| Method | Parameters | Description |
|---|---|---|
| flattenPageAnnotations(pageIndex) | number | Flatten annotations on page |
| flattenAllAnnotations() | – | Flatten all annotations |
| applyPageRedactions(pageIndex) | number | Apply redactions on page |
| applyAllRedactions() | – | Apply all redactions |
| addRedaction(page, x0, y0, x1, y1, fill?) | number, ... | Queue a redaction box (optional [r,g,b] fill) |
| redactionCount(page) | number | Count redactions queued for a page |
| applyRedactionsDestructive(scrubMetadata?) | boolean | Destructively remove content; returns a redaction report |
| sanitizeDocument(scrubMetadata?, removeJavascript?, removeEmbeddedFiles?) | boolean, ... | Strip metadata, scripts, embedded files; returns a report |
Merge & Embed
mergeFrom(data) -> number
Merge pages from another PDF. Returns the number of pages merged.
| Parameter | Type | Description |
|---|---|---|
| data | Uint8Array | The source PDF file bytes |
embedFile(name, data) -> void
Attach a file to the PDF.
| Parameter | Type | Description |
|---|---|---|
| name | string | Filename for the attachment |
| data | Uint8Array | File contents |
Image Manipulation
| Method | Parameters | Description |
|---|---|---|
| repositionImage(pageIndex, name, x, y) | number, string, number, number | Move image |
| resizeImage(pageIndex, name, w, h) | number, string, number, number | Resize image |
| setImageBounds(pageIndex, name, x, y, w, h) | number, string, ... | Set image bounds |
Classification & Auto-Extraction
classifyDocument() -> string // Classify the whole document (e.g. born-digital vs scanned)
classifyPage(pageIndex) -> string // Classify a single page
extractTextAuto(pageIndex) -> string // Auto-pick native vs OCR extraction for a page
extractPageAuto(pageIndex, optionsJson?) -> string // Auto-extraction returning a structured JSON page
Validation
validatePdfA(level) -> object // Validate against a PDF/A conformance level (e.g. "2b")
convertToPdfA(level) -> object // Convert toward a PDF/A level; returns a report
validatePdfUa(level?) -> object // Validate against PDF/UA accessibility
validatePdfX(level?) -> object // Validate against a PDF/X print level
Rendering
Requires the rendering feature.
| Method | Parameters | Returns | Description |
|---|---|---|---|
| renderPage(pageIndex, dpi?) | number, number | Uint8Array | Render a page to PNG bytes (default 150 dpi) |
| flattenToImages(dpi?) | number | Uint8Array | Flatten all pages to an image-based PDF |
OCR
Requires the wasm-ocr build. See WasmOcrEngine.
extractTextOcr(pageIndex, engine) -> string
Run the in-WASM OCR pipeline on a page using a host-built WasmOcrEngine. Returns recognized text in reading order.
const text = doc.extractTextOcr(0, engine);
Save
save() -> Uint8Array
Save the edited PDF as bytes. saveToBytes() is available as an alias.
saveWithOptions(compress?, garbageCollect?, linearize?) -> Uint8Array
Save with explicit serialization options.
| Parameter | Type | Default | Description |
|---|---|---|---|
| compress | boolean | true | Compress object streams |
| garbageCollect | boolean | true | Drop unreferenced objects |
| linearize | boolean | false | Produce a linearized (“fast web view”) PDF |
saveEncryptedToBytes(password, ownerPassword?, allowPrint?, allowCopy?, allowModify?, allowAnnotate?) -> Uint8Array
Save with AES-256 encryption.
| Parameter | Type | Default | Description |
|---|---|---|---|
| password | string | – | User password |
| ownerPassword | string | user password | Owner password |
| allowPrint | boolean | true | Allow printing |
| allowCopy | boolean | true | Allow copying |
| allowModify | boolean | true | Allow modification |
| allowAnnotate | boolean | true | Allow annotations |
free()
Release WASM memory. Always call this when done with the document.
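Wrapping document use in try/finally ensures free() runs even when extraction throws. A minimal sketch; `withDocument` is a hypothetical helper that works with any object exposing free(), and it suits synchronous callbacks only, since free() would run before a returned promise settles.

```javascript
// Open a document, run a synchronous callback against it, and always free it.
function withDocument(open, use) {
  const doc = open();
  try {
    return use(doc);
  } finally {
    doc.free(); // runs on both normal return and thrown errors
  }
}

// Intended use (requires the package):
// const pages = withDocument(() => new WasmPdfDocument(bytes), (doc) => doc.pageCount());
```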
WasmPdfPageRegion
A region handle returned by WasmPdfDocument.within(pageIndex, region). Extraction methods are scoped to the rectangle.
extractText() -> string // Plain text within the region
extractChars() -> Array // Characters within the region
extractWords() -> Array // Words within the region
extractTextLines() -> Array // Text lines within the region
extractTables() -> Array // Tables within the region
extractImages() -> Array // Images within the region
extractPaths() -> Array // Vector paths within the region
extractRects() -> Array // Rectangles within the region
extractLines() -> Array // Line segments within the region
extractTextOcr(engine?) -> string // OCR text within the region (wasm-ocr build)
WasmPdf
Factory class for creating new PDFs.
import { WasmPdf } from "pdf-oxide-wasm";
Static Methods
WasmPdf.fromMarkdown(content, title?, author?) -> WasmPdf // Create a PDF from Markdown text
WasmPdf.fromHtml(content, title?, author?) -> WasmPdf // Create a PDF from HTML
WasmPdf.fromText(content, title?, author?) -> WasmPdf // Create a PDF from plain text
WasmPdf.fromBytes(data) -> WasmPdf // Open an existing PDF from bytes for modification
WasmPdf.fromImageBytes(data) -> WasmPdf // Single-page PDF from one image (JPEG/PNG)
WasmPdf.fromMultipleImageBytes(imagesArray) -> WasmPdf // Multi-page PDF, one page per image
WasmPdf.merge(pdfs) -> WasmPdf // Merge an array of PDF byte buffers into one
WasmPdf.fromHtmlCss(html, css, fontBytes) -> WasmPdf // HTML + CSS with a single embedded font
WasmPdf.fromHtmlCssWithFonts(html, css, fonts) -> WasmPdf // HTML + CSS with multiple [name, bytes] fonts
| Parameter | Type | Description |
|---|---|---|
| content | string | Source content (Markdown / HTML / text) |
| title | string \| undefined | Document title |
| author | string \| undefined | Document author |
| data | Uint8Array | PDF or image file bytes |
| imagesArray | Uint8Array[] | Array of image file bytes |
| pdfs | Uint8Array[] | Array of PDF file bytes to merge |
Instance Methods
toBytes() -> Uint8Array
Get the PDF as bytes.
size -> number
PDF size in bytes (readonly getter).
import { writeFileSync } from "node:fs";
const pdf = WasmPdf.fromMarkdown("# Hello World\n\nThis is a PDF.");
console.log(`PDF size: ${pdf.size} bytes`);
writeFileSync("output.pdf", pdf.toBytes());
WasmDocumentBuilder
Fluent, low-level page-layout builder for composing PDFs page by page. Pair with WasmFluentPageBuilder.
import { WasmDocumentBuilder } from "pdf-oxide-wasm";
const builder = new WasmDocumentBuilder();
Document setup
new WasmDocumentBuilder() // Create an empty builder
title(title) // Set document title
author(author) // Set document author
subject(subject) // Set document subject
keywords(keywords) // Set document keywords
creator(creator) // Set the creator tool name
onOpen(script) // Set a document-level open JavaScript action
taggedPdfUa1() // Enable Tagged PDF / PDF/UA-1 output
language(lang) // Set the document language (e.g. "en-US")
roleMap(custom, standard) // Map a custom structure tag to a standard role
registerEmbeddedFont(name, font) // Register a WasmEmbeddedFont under a name
Page creation & output
a4Page() -> WasmFluentPageBuilder // Start a new A4 page
letterPage() -> WasmFluentPageBuilder // Start a new US Letter page
page(width, height) -> WasmFluentPageBuilder // Start a custom-size page (points)
commitPage(page) // Commit a completed page builder
build() -> Uint8Array // Finish and return the PDF bytes
toBytesEncrypted(userPassword, ownerPassword?) -> Uint8Array // Finish with AES-256 encryption
WasmFluentPageBuilder
Per-page builder returned by a4Page() / letterPage() / page(). Queue operations, then commit with done(builder) (or builder.commitPage(page)).
Text & flow
font(name, size) // Set the current font and size
at(x, y) // Move the cursor to an absolute position
text(text) // Draw text at the cursor
heading(level, text) // Draw a heading (level 1–6)
paragraph(text) // Draw a wrapped paragraph
space(points) // Advance the cursor vertically
horizontalRule() // Draw a horizontal rule
newline() // Advance to the next line
columns(columnCount, gapPt, text) // Lay text out in N balanced columns
footnote(refMark, noteText) // Add a footnote marker + bottom-of-page note
Inline runs
inline(text) // Append an inline text run
inlineBold(text) // Append a bold inline run
inlineItalic(text) // Append an italic inline run
inlineColor(r, g, b, text) // Append a colored inline run (RGB 0.0–1.0)
Link & form actions
linkUrl(url) // Wrap the last element in a URL link
linkPage(page) // Link to another page index
linkNamed(destination) // Link to a named destination
linkJavascript(script) // Attach a JavaScript link action
onOpen(script) // Page open action
onClose(script) // Page close action
fieldKeystroke(script) // Keystroke JavaScript for the last field
fieldFormat(script) // Format JavaScript for the last field
fieldValidate(script) // Validate JavaScript for the last field
fieldCalculate(script) // Calculate JavaScript for the last field
Markup annotations
highlight(r, g, b) // Highlight the last text run (RGB 0.0–1.0)
underline(r, g, b) // Underline the last text run
strikeout(r, g, b) // Strike out the last text run
squiggly(r, g, b) // Squiggly-underline the last text run
stickyNote(text) // Add a sticky note at the cursor
stickyNoteAt(x, y, text) // Add a sticky note at an absolute position
stamp(name) // Add a rubber-stamp annotation (e.g. "Approved")
freeText(x, y, w, h, text) // Add a free-text annotation box
watermark(text) // Add a text watermark
watermarkConfidential() // Add a "CONFIDENTIAL" watermark
watermarkDraft() // Add a "DRAFT" watermark
AcroForm widgets
textField(name, x, y, w, h, defaultValue?) // Add a text field
checkbox(name, x, y, w, h, checked) // Add a checkbox
comboBox(name, x, y, w, h, options, selected?) // Add a dropdown combo box
radioGroup(name, values, xs, ys, ws, hs, selected?) // Add a radio-button group (parallel arrays)
pushButton(name, x, y, w, h, caption) // Add a clickable push button
signatureField(name, x, y, w, h) // Add an unsigned signature placeholder
Barcodes & images
barcode1d(barcodeType, data, x, y, w, h) // Draw a 1D barcode (type 0–7)
barcodeQr(data, x, y, size) // Draw a QR code
imageWithAlt(bytes, x, y, w, h, altText) // Embed an image with accessibility alt text
imageArtifact(bytes, x, y, w, h) // Embed a decorative image as an /Artifact
Graphics primitives
rect(x, y, w, h) // Stroked 1pt rectangle outline
filledRect(x, y, w, h, r, g, b) // Filled rectangle (RGB 0.0–1.0)
line(x1, y1, x2, y2) // 1pt black line
strokeRect(x, y, w, h, width, r, g, b) // Stroked rectangle, explicit width + color
strokeRectDashed(x, y, w, h, width, r, g, b, dash, phase) // Dashed rectangle border
strokeLine(x1, y1, x2, y2, width, r, g, b) // Line with explicit width + color
strokeLineDashed(x1, y1, x2, y2, width, r, g, b, dash, phase) // Dashed line
textInRect(x, y, w, h, text, align) // Lay text inside a rectangle (align 0/1/2)
Layout helpers & terminal
measure(text) -> number // Rendered width of text in the current font (points)
remainingSpace() -> number // Vertical space left on the page (points)
newPageSameSize() // Start a new page with the same dimensions
table(spec) // Draw a buffered table from a spec object
streamingTable(spec) -> WasmStreamingTable // Open a streaming table for large datasets
done(builder) // Commit this page's queued ops to the document builder
A table(spec) spec object uses { columns: [{ header, width, align }], rows: [[...]], hasHeader }. A streamingTable(spec) spec adds { repeatHeader, mode, sampleRows, minColWidthPt, maxColWidthPt, maxRowspan, batchSize }.
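A spec in the shape above can be assembled from an array of records. A minimal sketch; `buildTableSpec` is a hypothetical helper, and the uniform width of 100 (assumed to be points), the align value 0, and the conversion of every cell to a string are illustrative assumptions rather than documented requirements.

```javascript
// Build a { columns, rows, hasHeader } spec from header names and record objects.
// Missing values become empty strings; every cell is converted to a string.
function buildTableSpec(headers, records, width = 100, align = 0) {
  return {
    columns: headers.map((header) => ({ header, width, align })),
    rows: records.map((rec) => headers.map((h) => String(rec[h] ?? ""))),
    hasHeader: true,
  };
}

// Intended use (requires the package), where page is a WasmFluentPageBuilder:
// page.table(buildTableSpec(["sku", "qty"], [{ sku: "A-1", qty: 3 }]));
```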
WasmStreamingTable
Row-streaming table handle returned by WasmFluentPageBuilder.streamingTable(spec). Push rows incrementally, then finish().
columnCount() -> number // Number of columns
pendingRowCount() -> number // Rows in the current un-flushed batch
batchCount() -> number // Number of completed batches
pushRow(cells) // Push one row (array of cell strings)
pushRowSpan(cells) // Push a row whose cells may carry rowspans
flush() // Flush the current batch
finish() // Finalize the table and replay it into the page
WasmEmbeddedFont
A font registered for embedding via WasmDocumentBuilder.registerEmbeddedFont.
WasmEmbeddedFont.fromBytes(data, name?) -> WasmEmbeddedFont // Load a TTF/OTF font from bytes
font.name -> string // The font's resolved name (getter)
Page Templates
Reusable header/footer furniture applied across pages.
WasmArtifactStyle
new WasmArtifactStyle() // Default style
font(name, size) -> this // Set font family and size
bold() -> this // Make the text bold
color(r, g, b) -> this // Set the text color (RGB 0.0–1.0)
WasmArtifact
new WasmArtifact() // Empty artifact
WasmArtifact.left(text) -> WasmArtifact // Left-aligned artifact text
WasmArtifact.center(text) -> WasmArtifact // Center-aligned artifact text
WasmArtifact.right(text) -> WasmArtifact // Right-aligned artifact text
withStyle(style) -> this // Apply a WasmArtifactStyle
withOffset(offset) -> this // Set the vertical offset from the edge
WasmHeader / WasmFooter
new WasmHeader() // Empty header (WasmFooter is identical)
WasmHeader.left(text) -> WasmHeader // Left-aligned header text
WasmHeader.center(text) -> WasmHeader // Center-aligned header text
WasmHeader.right(text) -> WasmHeader // Right-aligned header text
WasmPageTemplate
new WasmPageTemplate() // Empty template
header(header) -> this // Set the page header artifact
footer(footer) -> this // Set the page footer artifact
skipFirstPage() -> this // Omit header/footer on the first page
Digital Signatures
Requires the signatures feature.
WasmCertificate
WasmCertificate.load(data) -> WasmCertificate // Load a DER certificate + key bundle
WasmCertificate.loadPem(certPem, keyPem) -> WasmCertificate // Load from PEM cert + key strings
WasmCertificate.loadPkcs12(data, password) -> WasmCertificate // Load from a PKCS#12 (.p12/.pfx) blob
cert.subject -> string // Subject distinguished name (getter)
cert.issuer -> string // Issuer distinguished name (getter)
cert.serial -> string // Serial number (getter)
cert.validity -> bigint[] // [notBefore, notAfter] as unix seconds (getter)
cert.isValid -> boolean // Whether the certificate is currently valid (getter)
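Since validity is reported as bigint unix seconds, converting to Date objects makes the window easier to display or compare. A minimal sketch; `validityToDates` and `isWithinValidity` are hypothetical helpers, and the check covers only the date window, not revocation or chain trust.

```javascript
// Convert [notBefore, notAfter] bigint unix seconds to Date objects.
function validityToDates([notBefore, notAfter]) {
  const toDate = (seconds) => new Date(Number(seconds) * 1000);
  return { notBefore: toDate(notBefore), notAfter: toDate(notAfter) };
}

// Check whether a point in time falls inside the validity window (inclusive).
function isWithinValidity(validity, now = new Date()) {
  const { notBefore, notAfter } = validityToDates(validity);
  return now >= notBefore && now <= notAfter;
}

// Intended use (requires the package):
// const { notAfter } = validityToDates(cert.validity);
```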
WasmSignature
Returned by WasmPdfDocument.signatures().
sig.signerName -> string | null // Signer common name (getter)
sig.reason -> string | null // Signing reason (getter)
sig.location -> string | null // Signing location (getter)
sig.contactInfo -> string | null // Signer contact info (getter)
sig.signingTime -> bigint | null // Signing time as unix seconds (getter)
sig.coversWholeDocument -> boolean // Whether the signature covers the entire file (getter)
sig.padesLevel -> PadesLevel // PAdES baseline level of the signature (getter)
sig.verify() -> boolean // Verify the signature cryptographically
sig.verifyDetached(pdfData) -> boolean // Verify including a messageDigest check against the bytes
WasmTimestamp
WasmTimestamp.parse(data) -> WasmTimestamp // Parse a DER TimeStampToken / TSTInfo
ts.time -> bigint // Timestamp time as unix seconds (getter)
ts.serial -> string // Serial number (getter)
ts.policyOid -> string // TSA policy OID (getter)
ts.tsaName -> string // TSA name (getter)
ts.hashAlgorithm -> number // Imprint hash algorithm id (getter)
ts.messageImprint -> Uint8Array // The message imprint digest (getter)
ts.verify() -> boolean // Verify the timestamp token
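For logging or comparison, the timestamp getters can be flattened into strings. A minimal sketch, assuming only the getters listed above; `toHex` and `describeTimestamp` are hypothetical helpers:

```javascript
// Render bytes as lowercase hex, two digits per byte.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

// Hypothetical helper: `ts` is an object with the WasmTimestamp getters.
function describeTimestamp(ts) {
  return {
    // bigint unix seconds -> ISO 8601 string
    time: new Date(Number(ts.time) * 1000).toISOString(),
    tsa: ts.tsaName,
    policyOid: ts.policyOid,
    hashAlgorithm: ts.hashAlgorithm,
    imprintHex: toHex(ts.messageImprint),
  };
}
```

The hex form of `messageImprint` can then be compared against a digest computed independently over the timestamped data.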
WasmRevocationMaterial
Offline PAdES-B-LT validation material for signPdfBytesPades.
new WasmRevocationMaterial() // Empty material set
addCert(der) // Add a DER X.509 certificate
addCrl(der) // Add a DER CRL
addOcsp(der) // Add a DER OCSP response
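When the DER blobs come from several sources, a small loop keeps the population step in one place. This is a sketch only: `fillRevocationMaterial` and the shape of its `sources` argument are hypothetical, and only the three documented `add*` methods are called.

```javascript
// Hypothetical helper: `material` is a WasmRevocationMaterial (or any object
// with addCert/addCrl/addOcsp); `sources` holds arrays of DER Uint8Arrays.
function fillRevocationMaterial(material, sources = {}) {
  const { certs = [], crls = [], ocsps = [] } = sources;
  for (const der of certs) material.addCert(der);
  for (const der of crls) material.addCrl(der);
  for (const der of ocsps) material.addOcsp(der);
  return material;
}
```

A typical call would be `fillRevocationMaterial(new WasmRevocationMaterial(), { certs, crls, ocsps })`, with the arrays loaded by the host application.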
Dss
A parsed Document Security Store returned by WasmPdfDocument.dss().
dss.certCount -> number // Number of DER certificates (getter)
getCert(i) -> Uint8Array | undefined // i-th DER certificate
dss.crlCount -> number // Number of DER CRLs (getter)
getCrl(i) -> Uint8Array | undefined // i-th DER CRL
dss.ocspCount -> number // Number of DER OCSP responses (getter)
getOcsp(i) -> Uint8Array | undefined // i-th DER OCSP response
dss.vri -> string[] // Per-signature VRI keys (uppercase-hex SHA-1 of /Contents) (getter)
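The count getters and indexed accessors pair up naturally into arrays. A minimal sketch, assuming only the members listed above; `collectIndexed` and `readDss` are hypothetical helpers:

```javascript
// Collect items 0..count-1 from an indexed getter, skipping undefined results.
function collectIndexed(count, getter) {
  const out = [];
  for (let i = 0; i < count; i++) {
    const item = getter(i);
    if (item !== undefined) out.push(item);
  }
  return out;
}

// Hypothetical helper: `dss` is the object returned by WasmPdfDocument.dss().
function readDss(dss) {
  return {
    certs: collectIndexed(dss.certCount, (i) => dss.getCert(i)),
    crls: collectIndexed(dss.crlCount, (i) => dss.getCrl(i)),
    ocsps: collectIndexed(dss.ocspCount, (i) => dss.getOcsp(i)),
    vriKeys: [...dss.vri], // copy so callers cannot mutate the source array
  };
}
```

The arrow functions keep `dss` as the receiver of each call, which matters for methods on WASM-backed objects.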
OCR
OCR runs entirely in-WASM via the pure-Rust tract backend in the separate wasm-ocr build. Models are delivered host-side — fetch the detector/recognizer ONNX files and dictionary (see modelManifest()), then hand the bytes to the constructor.
WasmOcrEngine
new WasmOcrEngine(detModel, recModel, dict, config?) // Build from host-supplied model bytes
engine.ocrImage(imageBytes) -> string // OCR a raw image (PNG/JPEG/TIFF); returns JSON {text, confidence, spans}
| Parameter | Type | Description |
|---|---|---|
| `detModel` | `Uint8Array` | DBNet detector ONNX bytes |
| `recModel` | `Uint8Array` | SVTR recognizer ONNX bytes |
| `dict` | `string` | Recognizer character dictionary, one char per line |
| `config` | `WasmOcrConfig \| undefined` | Reserved (tuned defaults are used) |
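Since `ocrImage` returns a JSON string rather than an object, callers need to parse it before use. A minimal sketch: `parseOcrResult` is a hypothetical helper, the `minConfidence` threshold is an illustrative choice, and the code assumes `confidence` is a number. The structure of each entry in `spans` is not specified here, so entries are passed through untouched.

```javascript
// Hypothetical helper: `json` is the string returned by engine.ocrImage().
function parseOcrResult(json, minConfidence = 0.5) {
  const result = JSON.parse(json); // throws SyntaxError on malformed input
  // Assumption: `confidence` is numeric; fall back to 0 if it is not.
  const confidence = typeof result.confidence === "number" ? result.confidence : 0;
  return {
    text: typeof result.text === "string" ? result.text : "",
    confidence,
    spans: Array.isArray(result.spans) ? result.spans : [],
    // Illustrative cut-off only; tune it for the documents at hand.
    accepted: confidence >= minConfidence,
  };
}
```

Usage would look like `parseOcrResult(engine.ocrImage(imageBytes))`, with `imageBytes` holding the PNG, JPEG, or TIFF data.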
WasmOcrConfig
new WasmOcrConfig() // OCR configuration object (reserved for future tuning)
Enums
Align
Text/cell alignment discriminant used by textInRect and table column specs.
Align.Left // 0
Align.Center // 1
Align.Right // 2
PadesLevel
PAdES baseline level, used by signPdfBytesPades and WasmSignature.padesLevel.
PadesLevel.BB // 0 — signed attrs incl. ESS signing-certificate-v2
PadesLevel.BT // 1 — B-B + RFC 3161 signature-time-stamp
PadesLevel.BLt // 2 — B-T + Document Security Store (DSS/VRI)
PadesLevel.BLta // 3 — B-LT + document-scoped /DocTimeStamp
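Each level builds on the previous one, so the numeric discriminants can be compared directly. A minimal sketch using the values listed above; `padesLabel` and `meetsPadesLevel` are hypothetical helpers, not package exports:

```javascript
// Display labels indexed by the PadesLevel discriminant (0..3).
const PADES_LABELS = ["B-B", "B-T", "B-LT", "B-LTA"];

// Map a PadesLevel value to its conventional label.
function padesLabel(level) {
  const label = PADES_LABELS[level];
  if (label === undefined) {
    throw new RangeError(`Unknown PadesLevel: ${level}`);
  }
  return label;
}

// Levels are cumulative (B-T = B-B + timestamp, and so on), so a signature
// satisfies a requirement when its level is at least the required one.
function meetsPadesLevel(actual, required) {
  return actual >= required;
}
```

For example, `meetsPadesLevel(sig.padesLevel, PadesLevel.BLt)` would check that a signature carries at least a Document Security Store.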
Feature Availability
Some features are gated behind Rust build features. The default pdf-oxide-wasm package enables the common set; OCR ships in the separate wasm-ocr build.
| Feature | WASM | Notes |
|---|---|---|
| Text extraction | Yes | Full support |
| Structured extraction | Yes | Chars, spans, words, lines, tables |
| PDF creation | Yes | Markdown, HTML, text, images, DocumentBuilder |
| PDF editing | Yes | Metadata, rotation, dimensions, erase, pages |
| Form fields | Yes | Read, write, export, flatten, build |
| Search | Yes | Full regex support |
| Encryption | Yes | AES-256 read and write |
| Annotations | Yes | Read, flatten, redact, sanitize |
| Merge / split PDFs | Yes | Merge pages and split by bookmarks |
| Embedded files | Yes | Attach files to PDFs |
| Page labels / XMP | Yes | Read page labels and XMP metadata |
| Office round-trip | Yes | DOCX/PPTX/XLSX import and export |
| Validation | Yes | PDF/A, PDF/UA, PDF/X |
| Barcodes | Yes (`barcodes`) | 1D + QR as SVG or page images |
| Rendering | Yes (`rendering`) | Page → PNG, flatten to images |
| Digital signatures | Yes (`signatures`) | Sign, PAdES B-LT, verify, timestamps |
| OCR | `wasm-ocr` build | In-WASM tract OCR; models fetched host-side |
Error Handling
All methods that can fail throw JavaScript Error objects:
try {
const doc = new WasmPdfDocument(new Uint8Array([0, 1, 2]));
} catch (e) {
console.error(`Failed to open: ${e.message}`);
}
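When many calls need the same handling, the `try`/`catch` can be wrapped once. A minimal sketch: `attempt` is a hypothetical helper that turns a throwing call into a result object, so the caller can branch without nesting.

```javascript
// Hypothetical helper: run `fn` and capture either its value or its error.
function attempt(fn) {
  try {
    return { ok: true, value: fn() };
  } catch (e) {
    // The reference states failures are Error objects; String(e) is a
    // defensive fallback for anything else.
    const message = e instanceof Error ? e.message : String(e);
    return { ok: false, error: message };
  }
}
```

For instance, `attempt(() => new WasmPdfDocument(bytes))` yields `{ ok: false, error }` for unreadable input instead of throwing.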
TypeScript
Full type definitions are included in the package:
import { WasmPdfDocument, WasmPdf } from "pdf-oxide-wasm";
declare const bytes: Uint8Array; // PDF file contents, loaded elsewhere
const doc: WasmPdfDocument = new WasmPdfDocument(bytes);
const text: string = doc.extractText(0);
const pdf: WasmPdf = WasmPdf.fromMarkdown("# Hello");
Other Language Bindings
PDF Oxide ships native bindings for every major ecosystem: Rust, Python, Node.js, C#, Go, Java, PHP, Ruby, C++, Swift, Kotlin, Dart, R, Julia, Zig, Scala, Clojure, Objective-C, and Elixir.
Next Steps
- Types & Enums — all shared types and enums
- Page API Reference — consistent per-page iteration across bindings
- Getting Started with WASM — tutorial