JavaScript API Reference

PDF Oxide provides WebAssembly bindings for JavaScript and TypeScript. The npm package pdf-oxide-wasm works in Node.js, browsers, bundlers, Deno, and Cloudflare Workers.

npm install pdf-oxide-wasm

Multi-target packaging (v0.3.38)

pdf-oxide-wasm now ships three builds side by side, selected through package.json conditional exports. Pick the subpath that matches your runtime. The top-level import is also routed through the exports field and resolves to the right build in most environments.

Subpath                  Target
pdf-oxide-wasm/nodejs    Node.js (CommonJS + ESM)
pdf-oxide-wasm/bundler   Vite, webpack, Rollup, esbuild, Bun
pdf-oxide-wasm/web       Browsers, Deno, Cloudflare Workers

// Node.js
import { WasmPdfDocument } from "pdf-oxide-wasm/nodejs";

// Vite / webpack / Rollup
import init, { WasmPdfDocument } from "pdf-oxide-wasm/bundler";
await init();

// Browsers / Deno / Workers
import init, { WasmPdfDocument } from "pdf-oxide-wasm/web";
await init();

This fixes the ReferenceError: Can't find variable: __dirname thrown under browser bundlers prior to v0.3.38.

For the Rust API, see the Rust API Reference. For the Python API, see the Python API Reference. For type details, see Types & Enums.

Some methods are gated behind Rust build features (rendering, signatures, barcodes, ocr-tract). The default pdf-oxide-wasm package enables the common set; OCR ships in the separate wasm-ocr build. See Feature Availability.


Module Functions

Free functions exported at the top level of the package.

import {
  setLogLevel, disableLogging,
  generateBarcodeSvg, generateQrSvg,
  planSplitByBookmarks, splitByBookmarks,
  setCryptoPolicy, cryptoPolicy, cryptoInventory, cryptoCbom,
  modelManifest, prefetchAvailable,
  signPdfBytes, signPdfBytesPades, hasDocumentTimestamp,
} from "pdf-oxide-wasm";

Logging

setLogLevel(level)   // Set log verbosity: "off" | "error" | "warn" | "info" | "debug" | "trace"
disableLogging()     // Silence all log output

Barcodes

generateBarcodeSvg(barcodeType, data) -> string  // 1D barcode as SVG; type 0–7 (Code128, Code39, Ean13, Ean8, UpcA, Itf, Code93, Codabar)
generateQrSvg(data, errorCorrection, size) -> string  // QR code as SVG; errorCorrection 0=Low 1=Medium 2=Quartile 3=High
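
Numeric type codes are easy to mix up at call sites, so a pair of named constants can help. The sketch below is not part of the package. The QR levels are taken from the comment above, while the 1D barcode codes assume the numbers 0 to 7 follow the order in which the types are listed, which should be checked against Types & Enums.

```javascript
// Assumption: 1D barcode codes follow the order in which the types are listed above.
const BarcodeType = Object.freeze({
  Code128: 0,
  Code39: 1,
  Ean13: 2,
  Ean8: 3,
  UpcA: 4,
  Itf: 5,
  Code93: 6,
  Codabar: 7,
});

// QR error-correction levels, as documented for generateQrSvg.
const QrErrorCorrection = Object.freeze({ Low: 0, Medium: 1, Quartile: 2, High: 3 });

// Hypothetical usage (the size value is illustrative):
//   const svg = generateBarcodeSvg(BarcodeType.Code128, "ABC-123");
//   const qr = generateQrSvg("https://example.com", QrErrorCorrection.Medium, 256);
```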

Split by Bookmarks

planSplitByBookmarks(srcBytes, titlePrefix, ignoreCase, level, includeFrontMatter) -> Array  // Plan a split without producing PDFs; returns segment descriptors
splitByBookmarks(srcBytes, titlePrefix, ignoreCase, level, includeFrontMatter) -> Array       // Split at bookmark boundaries; returns [segment, bytes] pairs (level 0=all depths, 1=top-level)

Crypto Governance

setCryptoPolicy(spec)   // Install the process-wide crypto policy ("compat" | "strict" | "fips-strict"[;…]); fail-closed
cryptoPolicy() -> string  // The active crypto policy as its canonical grammar string
cryptoInventory() -> string[]  // Algorithm tokens exercised so far this process
cryptoCbom() -> string  // CycloneDX 1.6 Cryptographic Bill of Materials (JSON string)

OCR Model Provisioning

modelManifest() -> string   // JSON manifest of OCR detector/recognizer cache filenames and source URLs (host-side fetch)
prefetchAvailable() -> boolean  // Whether this build can download OCR models to a local cache (always false in WASM)

Signing (free functions)

signPdfBytes(pdfData, cert, reason?, location?) -> Uint8Array  // Sign raw PDF bytes with a WasmCertificate; returns the signed PDF
signPdfBytesPades(pdfData, cert, level, timestampToken?, revocation?, reason?, location?) -> Uint8Array  // Sign at a PAdES baseline level (BB/BT/BLt); pass a pre-fetched RFC 3161 token for BT/BLt
hasDocumentTimestamp(pdfData) -> boolean  // Whether the PDF carries a document-scoped /DocTimeStamp (PAdES-B-LTA)

WasmPdfDocument

The primary class for opening, extracting, editing, and saving PDFs.

import { WasmPdfDocument } from "pdf-oxide-wasm";

Constructor

new WasmPdfDocument(data, password?)

Load a PDF document from raw bytes.

Parameter   Type                 Description
data        Uint8Array           The PDF file contents
password    string | undefined   Optional password for encrypted PDFs

Throws: Error if the PDF is invalid or cannot be parsed.

import { readFileSync } from "node:fs"; // Node.js only

const bytes = new Uint8Array(readFileSync("document.pdf"));
const doc = new WasmPdfDocument(bytes);

Static Constructors

WasmPdfDocument.openFromDocxBytes(data) -> WasmPdfDocument  // Convert DOCX bytes to a PDF document
WasmPdfDocument.openFromPptxBytes(data) -> WasmPdfDocument  // Convert PPTX bytes to a PDF document
WasmPdfDocument.openFromXlsxBytes(data) -> WasmPdfDocument  // Convert XLSX bytes to a PDF document

Core Read-Only

pageCount() -> number

Get the number of pages in the document.

version() -> Uint8Array

Get the PDF version as [major, minor].

const [major, minor] = doc.version();
console.log(`PDF ${major}.${minor}`);

authenticate(password) -> boolean

Decrypt an encrypted PDF. Returns true if authentication succeeded.

Parameter   Type     Description
password    string   The password string

hasStructureTree() -> boolean

Check if the document is a Tagged PDF with a structure tree.

Signature inspection

signatureCount() -> number          // Number of digital signatures in the document
signatures() -> WasmSignature[]     // Parsed signatures (signer, reason, time, verify())
dss() -> Dss | null                 // Document Security Store (certs/CRLs/OCSP), or null

Text Extraction

extractText(pageIndex, region?) -> string

Extract plain text from a single page. Pass an optional [x, y, w, h] region to limit extraction.

Parameter   Type                   Description
pageIndex   number                 Zero-based page number
region      number[] | undefined   Optional [x, y, width, height] clip

const text = doc.extractText(0);

extractAllText() -> string

Extract plain text from all pages, separated by form feed characters.
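
Because pages are separated by form feed characters, the combined string can be turned back into per-page strings with a split. This is a minimal sketch with a hypothetical helper name. It assumes exactly one form feed (U+000C) between pages and no trailing separator.

```javascript
// Split the output of extractAllText() into one string per page.
// Assumption: a single form feed between pages, none after the last page.
function splitPages(allText) {
  return allText.split("\f");
}

const samplePages = splitPages("Page one\fPage two\fPage three");
console.log(samplePages.length); // 3
```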

extractStructured(pageIndex) -> string

Extract a structured JSON representation of the page (blocks, lines, styling).

extractChars(pageIndex, region?) -> Array

Extract individual characters with precise positioning and font metadata.

Parameter   Type                   Description
pageIndex   number                 Zero-based page number
region      number[] | undefined   Optional [x, y, width, height] clip

Returns: Array of objects with fields:

Field        Type                    Description
char         string                  The character
bbox         {x, y, width, height}   Bounding box
fontName     string                  Font name
fontSize     number                  Font size in points
fontWeight   string                  Weight (Normal, Bold, etc.)
isItalic     boolean                 Italic flag
color        {r, g, b}               RGB color (0.0–1.0)

const chars = doc.extractChars(0);
for (const c of chars) {
  console.log(`'${c.char}' at (${c.bbox.x}, ${c.bbox.y})`);
}
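
The color field uses components in the 0.0–1.0 range, which usually needs converting before use in CSS or canvas code. The helper below is a hypothetical convenience, not part of the package. It clamps each component and scales it to a byte.

```javascript
// Convert a {r, g, b} color with 0.0–1.0 components into a CSS hex string.
function colorToHex({ r, g, b }) {
  const toByte = (v) => Math.round(Math.min(1, Math.max(0, v)) * 255);
  return "#" + [r, g, b].map((v) => toByte(v).toString(16).padStart(2, "0")).join("");
}

console.log(colorToHex({ r: 1, g: 0.5, b: 0 })); // "#ff8000"
```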

extractPageText(pageIndex, readingOrder?) -> object

Get spans, characters, and page dimensions from a single extraction pass. More efficient than calling extractSpans() + extractChars() separately. Pass "column_aware" for multi-column PDFs.

Parameter      Type                 Description
pageIndex      number               Zero-based page number
readingOrder   string | undefined   "column_aware" or "top_to_bottom" (default)

Returns: An object with fields:

Field        Type     Description
spans        Array    Array of span objects
chars        Array    Array of character objects
pageWidth    number   Page width in PDF points
pageHeight   number   Page height in PDF points
text         string   Full text content

const result = doc.extractPageText(0);
console.log(`Page: ${result.pageWidth}x${result.pageHeight} pt`);
for (const span of result.spans) {
  console.log(`'${span.text}' font=${span.fontName} size=${span.fontSize}`);
}

extractSpans(pageIndex, region?, readingOrder?) -> Array

Extract styled text spans with font metadata. Pass "column_aware" as readingOrder for multi-column PDFs.

Parameter      Type                   Description
pageIndex      number                 Zero-based page number
region         number[] | undefined   Optional [x, y, width, height] clip
readingOrder   string | undefined     "column_aware" or "top_to_bottom" (default)

Returns: Array of objects with fields:

Field         Type                    Description
text          string                  The text content
bbox          {x, y, width, height}   Bounding box
fontName      string                  Font name
fontSize      number                  Font size in points
fontWeight    string                  Weight (Normal, Bold, etc.)
isItalic      boolean                 Italic flag
isMonospace   boolean                 Whether the font is fixed-width
charWidths    number[]                Per-glyph advance widths
color         {r, g, b}               RGB color (0.0–1.0)

const spans = doc.extractSpans(0);
for (const span of spans) {
  console.log(`"${span.text}" size=${span.fontSize}`);
}
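
Span font metadata is enough for simple layout heuristics. As an illustration, the sketch below treats the most common font size (weighted by text length) as body text and flags larger spans as heading candidates. Both helper names are hypothetical, and the heuristic is an assumption for demonstration rather than how toMarkdown detects headings.

```javascript
// Find the font size that covers the most characters on the page.
function dominantFontSize(spans) {
  const weight = new Map();
  for (const s of spans) {
    weight.set(s.fontSize, (weight.get(s.fontSize) ?? 0) + s.text.length);
  }
  let best = null;
  let bestWeight = -1;
  for (const [size, w] of weight) {
    if (w > bestWeight) {
      best = size;
      bestWeight = w;
    }
  }
  return best; // null when there are no spans
}

// Spans set in a larger size than the body text are heading candidates.
function headingCandidates(spans) {
  const body = dominantFontSize(spans);
  return body === null ? [] : spans.filter((s) => s.fontSize > body);
}

// Hypothetical usage: headingCandidates(doc.extractSpans(0))
```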

Words, Lines, Tables

extractWords(pageIndex, region?) -> Array       // Word-level boxes with text + font metadata
extractTextLines(pageIndex, region?) -> Array   // Line-level boxes, each with its words
extractTables(pageIndex, region?) -> Array      // Detected tables with rows/cells (text + bboxes)

Headers, Footers & Artifacts

Detect and remove or erase running headers, footers, and page-furniture artifacts.

removeHeaders(threshold) -> number     // Remove detected headers across the document; returns count removed
removeFooters(threshold) -> number     // Remove detected footers; returns count removed
removeArtifacts(threshold) -> number   // Remove detected page artifacts; returns count removed
eraseHeader(pageIndex)                 // Queue an erase of the header region on a page
editHeader(pageIndex)                  // Mark the header region for editing on a page
eraseFooter(pageIndex)                 // Queue an erase of the footer region on a page
editFooter(pageIndex)                  // Mark the footer region for editing on a page
eraseArtifacts(pageIndex)              // Queue an erase of detected artifacts on a page

Region Extraction

within(pageIndex, region) -> WasmPdfPageRegion

Scope subsequent extraction to a rectangular region of a page. region is [x, y, width, height]. See WasmPdfPageRegion.

const region = doc.within(0, [50, 600, 400, 150]);
const text = region.extractText();
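
Regions use [x, y, width, height], while the page box and erase methods in this reference use [llx, lly, urx, ury]. The two hypothetical helpers below convert between the forms. They assume that a region's (x, y) is its lower-left corner in the same coordinate space as the MediaBox, which this reference does not state explicitly.

```javascript
// [llx, lly, urx, ury] -> [x, y, width, height]
// Assumption: region (x, y) is the lower-left corner, in MediaBox coordinates.
function boxToRegion([llx, lly, urx, ury]) {
  return [llx, lly, urx - llx, ury - lly];
}

// [x, y, width, height] -> [llx, lly, urx, ury]
function regionToBox([x, y, width, height]) {
  return [x, y, x + width, y + height];
}

console.log(regionToBox([50, 600, 400, 150])); // [ 50, 600, 450, 750 ]
```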

Format Conversion

toMarkdown(pageIndex, detectHeadings?, includeImages?, includeFormFields?) -> string

Convert a single page to Markdown.

Parameter           Type      Default   Description
pageIndex           number    -         Zero-based page number
detectHeadings      boolean   true      Detect headings from font size
includeImages       boolean   true      Include images
includeFormFields   boolean   true      Include form field values

toMarkdownAll(detectHeadings?, includeImages?, includeFormFields?) -> string

Convert all pages to Markdown.

toHtml(pageIndex, preserveLayout?, detectHeadings?, includeFormFields?) -> string

Convert a single page to HTML.

Parameter           Type      Default   Description
pageIndex           number    -         Zero-based page number
preserveLayout      boolean   false     Preserve visual layout
detectHeadings      boolean   true      Detect headings
includeFormFields   boolean   true      Include form field values

toHtmlAll(preserveLayout?, detectHeadings?, includeFormFields?) -> string

Convert all pages to HTML.

toPlainText(pageIndex) -> string

Convert a single page to plain text.

toPlainTextAll() -> string

Convert all pages to plain text.

Office round-trip

toDocxBytes() -> Uint8Array   // Export the document as a DOCX file
toPptxBytes() -> Uint8Array   // Export the document as a PPTX file
toXlsxBytes() -> Uint8Array   // Export the document as an XLSX file

Search

search(pattern, caseInsensitive?, literal?, wholeWord?, maxResults?) -> Array

Search for text across all pages.

Parameter         Type      Default   Description
pattern           string    -         Search pattern (string or regex)
caseInsensitive   boolean   false     Case-insensitive search
literal           boolean   false     Treat pattern as literal string
wholeWord         boolean   false     Match whole words only
maxResults        number    0         Maximum results (0 = unlimited)

Returns: Array of objects with fields:

Field        Type     Description
page         number   Page number
text         string   Matched text
bbox         object   Bounding box
startIndex   number   Start index in page text
endIndex     number   End index in page text
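
Search results arrive as one flat array, so a common follow-up is grouping them by page. The helper below is a hypothetical sketch that relies only on the page field listed in the table above.

```javascript
// Group search results by their page field, preserving match order.
function groupMatchesByPage(results) {
  const byPage = new Map();
  for (const r of results) {
    if (!byPage.has(r.page)) byPage.set(r.page, []);
    byPage.get(r.page).push(r);
  }
  return byPage;
}

// Hypothetical usage:
//   const byPage = groupMatchesByPage(doc.search("invoice", true));
//   for (const [page, matches] of byPage) console.log(page, matches.length);
```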

searchPage(pageIndex, pattern, caseInsensitive?, literal?, wholeWord?, maxResults?) -> Array

Search for text within a single page.


Image Info

extractImages(pageIndex) -> Array

Get image metadata for a page.

Field              Type     Description
width              number   Image width in pixels
height             number   Image height in pixels
colorSpace         string   Color space (e.g. DeviceRGB)
bitsPerComponent   number   Bits per color channel
bbox               object   Position on page

extractImageBytes(pageIndex) -> Array

Extract raw image bytes from a page. Returns an array of objects:

Field    Type         Description
width    number       Image width in pixels
height   number       Image height in pixels
data     Uint8Array   Raw image bytes
format   string       Image format

pageImages(pageIndex) -> Array

Get image names and bounds for positioning operations.

Field    Type       Description
name     string     XObject name
bounds   number[]   [x, y, width, height]
matrix   number[]   Transform matrix [a, b, c, d, e, f]
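
To see where a point lands under the matrix field, apply it as an affine transform. The sketch below assumes the six values follow the standard PDF convention, where x' = a·x + c·y + e and y' = b·x + d·y + f. The helper name is hypothetical.

```javascript
// Apply a PDF-style affine matrix [a, b, c, d, e, f] to a point.
// Assumption: the matrix uses the standard PDF transformation convention.
function applyMatrix([a, b, c, d, e, f], x, y) {
  return [a * x + c * y + e, b * x + d * y + f];
}

// An image drawn in the unit square, scaled to 200x100 and moved to (50, 600):
console.log(applyMatrix([200, 0, 0, 100, 50, 600], 1, 1)); // [ 250, 700 ]
```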

Vector Content

extractPaths(pageIndex, region?) -> Array   // Vector paths (lines, curves, shapes) on a page
extractRects(pageIndex, region?) -> Array   // Axis-aligned rectangles detected from path segments
extractLines(pageIndex, region?) -> Array   // Straight line segments detected from path data

Document Structure

getOutline() -> Array | null

Get document bookmarks / table of contents. Returns null if no outline exists.

getAnnotations(pageIndex) -> Array

Get annotation metadata (type, rect, contents, etc.) for a page.

pageLabels() -> Array

Get page label ranges. Returns an array of objects:

Field        Type     Description
startPage    number   First page in this range
style        string   Numbering style
prefix       string   Label prefix
startValue   number   Starting number

xmpMetadata() -> object | null

Get XMP metadata. Returns null if not present. Object fields include:

Field            Type              Description
dcTitle          string | null     Document title
dcCreator        string[] | null   Creator list
dcDescription    string | null     Description
xmpCreatorTool   string | null     Creator tool
xmpCreateDate    string | null     Creation date
xmpModifyDate    string | null     Modification date
pdfProducer      string | null     PDF producer

Form Fields

getFormFields() -> Array

Get all form fields with name, type, value, and flags.

Field       Type     Description
name        string   Field name
fieldType   string   Field type (text, checkbox, etc.)
value       string   Current value
flags       number   Field flags

const fields = doc.getFormFields();
for (const f of fields) {
  console.log(`${f.name} (${f.fieldType}) = ${f.value}`);
}

hasXfa() -> boolean

Check if the document contains XFA forms.

getFormFieldValue(name) -> any

Get a form field value by name. Returns a string, boolean, or null depending on the field type.

setFormFieldValue(name, value) -> void

Set a form field value by name.

Parameter   Type               Description
name        string             Field name
value       string | boolean   New field value
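
Filling several fields at once can be wrapped in a small helper. The sketch below is hypothetical and uses only getFormFields() and setFormFieldValue() as described in this section. It skips names the document does not contain instead of guessing how the library reacts to unknown fields.

```javascript
// Apply a { fieldName: value } object to a document's form fields.
// Returns the names that were not found in the document.
function fillForm(doc, values) {
  const known = new Set(doc.getFormFields().map((f) => f.name));
  const skipped = [];
  for (const [name, value] of Object.entries(values)) {
    if (known.has(name)) {
      doc.setFormFieldValue(name, value);
    } else {
      skipped.push(name);
    }
  }
  return skipped;
}

// Hypothetical usage:
//   const missing = fillForm(doc, { "full_name": "Ada Lovelace", "subscribe": true });
```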

exportFormData(format?) -> Uint8Array

Export form data as FDF (default) or XFDF.

Parameter   Type     Default   Description
format      string   "fdf"     Export format: "fdf" or "xfdf"

Form flattening

flattenForms()                    // Flatten all form fields into page content
flattenFormsOnPage(pageIndex)     // Flatten forms on a specific page
flattenWarnings() -> string[]     // Warnings produced by the last flatten operation

Editing

Metadata

Method                  Parameters   Description
setTitle(title)         string       Set document title
setAuthor(author)       string       Set document author
setSubject(subject)     string       Set document subject
setKeywords(keywords)   string       Set document keywords

Page Rotation

Method                               Parameters       Description
pageRotation(pageIndex)              number           Get current rotation (0, 90, 180, 270)
setPageRotation(pageIndex, degrees)  number, number   Set absolute rotation
rotatePage(pageIndex, degrees)       number, number   Add to current rotation
rotateAllPages(degrees)              number           Rotate all pages

Page Dimensions

Method                                           Parameters    Description
pageMediaBox(pageIndex)                          number        Get MediaBox [llx, lly, urx, ury]
setPageMediaBox(pageIndex, llx, lly, urx, ury)   number, ...   Set MediaBox
pageCropBox(pageIndex)                           number        Get CropBox (may be null)
setPageCropBox(pageIndex, llx, lly, urx, ury)    number, ...   Set CropBox
cropMargins(left, right, top, bottom)            number, ...   Crop all page margins

Page Operations

deletePage(index)                 // Delete a page by index
movePage(fromIndex, toIndex)      // Move a page to a new position
extractPages(pages) -> Uint8Array // Build a new PDF from the given page indices

Erase / Whiteout

Method                                       Parameters             Description
eraseRegion(pageIndex, llx, lly, urx, ury)   number, ...            Erase a region
eraseRegions(pageIndex, rects)               number, Float32Array   Erase multiple regions
clearEraseRegions(pageIndex)                 number                 Clear pending erases
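
eraseRegions takes its rectangles as a Float32Array. The reference does not spell out the layout, so the sketch below assumes a flat sequence of four floats per rectangle in [llx, lly, urx, ury] order, mirroring the eraseRegion parameters. The helper name is hypothetical.

```javascript
// Pack [llx, lly, urx, ury] rectangles into one flat Float32Array.
// Assumption: eraseRegions expects four consecutive floats per rectangle.
function packRects(rects) {
  const packed = new Float32Array(rects.length * 4);
  rects.forEach(([llx, lly, urx, ury], i) => {
    packed.set([llx, lly, urx, ury], i * 4);
  });
  return packed;
}

// Hypothetical usage:
//   doc.eraseRegions(0, packRects([[50, 700, 300, 720], [50, 40, 300, 60]]));
```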

Annotations & Redaction

Method                                                                     Parameters     Description
flattenPageAnnotations(pageIndex)                                          number         Flatten annotations on page
flattenAllAnnotations()                                                    -              Flatten all annotations
applyPageRedactions(pageIndex)                                             number         Apply redactions on page
applyAllRedactions()                                                       -              Apply all redactions
addRedaction(page, x0, y0, x1, y1, fill?)                                  number, ...    Queue a redaction box (optional [r,g,b] fill)
redactionCount(page)                                                       number         Count redactions queued for a page
applyRedactionsDestructive(scrubMetadata?)                                 boolean        Destructively remove content; returns a redaction report
sanitizeDocument(scrubMetadata?, removeJavascript?, removeEmbeddedFiles?)  boolean, ...   Strip metadata, scripts, embedded files; returns a report

Merge & Embed

mergeFrom(data) -> number

Merge pages from another PDF. Returns the number of pages merged.

Parameter   Type         Description
data        Uint8Array   The source PDF file bytes

embedFile(name, data) -> void

Attach a file to the PDF.

Parameter   Type         Description
name        string       Filename for the attachment
data        Uint8Array   File contents

Image Manipulation

Method                                        Parameters                       Description
repositionImage(pageIndex, name, x, y)        number, string, number, number   Move image
resizeImage(pageIndex, name, w, h)            number, string, number, number   Resize image
setImageBounds(pageIndex, name, x, y, w, h)   number, string, ...              Set image bounds

Classification & Auto-Extraction

classifyDocument() -> string                 // Classify the whole document (e.g. born-digital vs scanned)
classifyPage(pageIndex) -> string            // Classify a single page
extractTextAuto(pageIndex) -> string         // Auto-pick native vs OCR extraction for a page
extractPageAuto(pageIndex, optionsJson?) -> string  // Auto-extraction returning a structured JSON page

Validation

validatePdfA(level) -> object        // Validate against a PDF/A conformance level (e.g. "2b")
convertToPdfA(level) -> object       // Convert toward a PDF/A level; returns a report
validatePdfUa(level?) -> object      // Validate against PDF/UA accessibility
validatePdfX(level?) -> object       // Validate against a PDF/X print level

Rendering

Requires the rendering feature.

Method                        Parameters       Returns      Description
renderPage(pageIndex, dpi?)   number, number   Uint8Array   Render a page to PNG bytes (default 150 dpi)
flattenToImages(dpi?)         number           Uint8Array   Flatten all pages to an image-based PDF

OCR

Requires the wasm-ocr build. See WasmOcrEngine.

extractTextOcr(pageIndex, engine) -> string

Run the in-WASM OCR pipeline on a page using a host-built WasmOcrEngine. Returns recognized text in reading order.

const text = doc.extractTextOcr(0, engine);

Save

save() -> Uint8Array

Save the edited PDF as bytes. saveToBytes() is available as an alias.

saveWithOptions(compress?, garbageCollect?, linearize?) -> Uint8Array

Save with explicit serialization options.

Parameter        Type      Default   Description
compress         boolean   true      Compress object streams
garbageCollect   boolean   true      Drop unreferenced objects
linearize        boolean   false     Produce a linearized (“fast web view”) PDF

saveEncryptedToBytes(password, ownerPassword?, allowPrint?, allowCopy?, allowModify?, allowAnnotate?) -> Uint8Array

Save with AES-256 encryption.

Parameter       Type      Default            Description
password        string    -                  User password
ownerPassword   string    same as password   Owner password (falls back to the user password when omitted)
allowPrint      boolean   true               Allow printing
allowCopy       boolean   true               Allow copying
allowModify     boolean   true               Allow modification
allowAnnotate   boolean   true               Allow annotations

free()

Release WASM memory. Always call this when done with the document.
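
Since free() has to run even when extraction throws, a try/finally wrapper keeps the cleanup in one place. This is a minimal sketch with a hypothetical helper name. It handles synchronous callbacks only, because a returned promise would settle after free() has already run.

```javascript
// Run fn(doc) and always release the document's WASM memory afterwards.
// Synchronous callbacks only: do not return a promise that still uses doc.
function withDocument(doc, fn) {
  try {
    return fn(doc);
  } finally {
    doc.free();
  }
}

// Hypothetical usage:
//   const pages = withDocument(new WasmPdfDocument(bytes), (d) => d.pageCount());
```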


WasmPdfPageRegion

A region handle returned by WasmPdfDocument.within(pageIndex, region). Extraction methods are scoped to the rectangle.

extractText() -> string       // Plain text within the region
extractChars() -> Array       // Characters within the region
extractWords() -> Array       // Words within the region
extractTextLines() -> Array   // Text lines within the region
extractTables() -> Array      // Tables within the region
extractImages() -> Array      // Images within the region
extractPaths() -> Array       // Vector paths within the region
extractRects() -> Array       // Rectangles within the region
extractLines() -> Array       // Line segments within the region
extractTextOcr(engine?) -> string  // OCR text within the region (wasm-ocr build)

WasmPdf

Factory class for creating new PDFs.

import { WasmPdf } from "pdf-oxide-wasm";

Static Methods

WasmPdf.fromMarkdown(content, title?, author?) -> WasmPdf  // Create a PDF from Markdown text
WasmPdf.fromHtml(content, title?, author?) -> WasmPdf      // Create a PDF from HTML
WasmPdf.fromText(content, title?, author?) -> WasmPdf      // Create a PDF from plain text
WasmPdf.fromBytes(data) -> WasmPdf                         // Open an existing PDF from bytes for modification
WasmPdf.fromImageBytes(data) -> WasmPdf                    // Single-page PDF from one image (JPEG/PNG)
WasmPdf.fromMultipleImageBytes(imagesArray) -> WasmPdf     // Multi-page PDF, one page per image
WasmPdf.merge(pdfs) -> WasmPdf                             // Merge an array of PDF byte buffers into one
WasmPdf.fromHtmlCss(html, css, fontBytes) -> WasmPdf       // HTML + CSS with a single embedded font
WasmPdf.fromHtmlCssWithFonts(html, css, fonts) -> WasmPdf  // HTML + CSS with multiple [name, bytes] fonts

Parameter     Type                 Description
content       string               Source content (Markdown / HTML / text)
title         string | undefined   Document title
author        string | undefined   Document author
data          Uint8Array           PDF or image file bytes
imagesArray   Uint8Array[]         Array of image file bytes
pdfs          Uint8Array[]         Array of PDF file bytes to merge

Instance Methods

toBytes() -> Uint8Array

Get the PDF as bytes.

size -> number

PDF size in bytes (readonly getter).

import { writeFileSync } from "node:fs"; // Node.js only

const pdf = WasmPdf.fromMarkdown("# Hello World\n\nThis is a PDF.");
console.log(`PDF size: ${pdf.size} bytes`);
writeFileSync("output.pdf", pdf.toBytes());

WasmDocumentBuilder

Fluent, low-level page-layout builder for composing PDFs page by page. Pair with WasmFluentPageBuilder.

import { WasmDocumentBuilder } from "pdf-oxide-wasm";
const builder = new WasmDocumentBuilder();

Document setup

new WasmDocumentBuilder()          // Create an empty builder
title(title)                       // Set document title
author(author)                     // Set document author
subject(subject)                   // Set document subject
keywords(keywords)                 // Set document keywords
creator(creator)                   // Set the creator tool name
onOpen(script)                     // Set a document-level open JavaScript action
taggedPdfUa1()                     // Enable Tagged PDF / PDF/UA-1 output
language(lang)                     // Set the document language (e.g. "en-US")
roleMap(custom, standard)          // Map a custom structure tag to a standard role
registerEmbeddedFont(name, font)   // Register a WasmEmbeddedFont under a name

Page creation & output

a4Page() -> WasmFluentPageBuilder         // Start a new A4 page
letterPage() -> WasmFluentPageBuilder     // Start a new US Letter page
page(width, height) -> WasmFluentPageBuilder  // Start a custom-size page (points)
commitPage(page)                          // Commit a completed page builder
build() -> Uint8Array                     // Finish and return the PDF bytes
toBytesEncrypted(userPassword, ownerPassword?) -> Uint8Array  // Finish with AES-256 encryption

WasmFluentPageBuilder

Per-page builder returned by a4Page() / letterPage() / page(). Queue operations, then commit with done(builder) (or builder.commitPage(page)).
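
The queue-then-commit flow might look like the sketch below. It is a hypothetical helper, not part of the package. It assumes that each page method returns the page builder so calls can be chained (as "fluent" suggests), that "Helvetica" is an accepted font name, and that the coordinates are in points. None of these are confirmed by this reference.

```javascript
// Queue a few operations on a new A4 page, then commit it to the builder.
// Assumptions: page methods are chainable; font name and coordinates are illustrative.
function addTitlePage(builder, title) {
  builder
    .a4Page()
    .font("Helvetica", 24)
    .at(72, 760)
    .text(title)
    .done(builder);
}

// Hypothetical usage:
//   const builder = new WasmDocumentBuilder();
//   addTitlePage(builder, "Quarterly Report");
//   const pdfBytes = builder.build();
```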

Text & flow

font(name, size)                 // Set the current font and size
at(x, y)                         // Move the cursor to an absolute position
text(text)                       // Draw text at the cursor
heading(level, text)             // Draw a heading (level 1–6)
paragraph(text)                  // Draw a wrapped paragraph
space(points)                    // Advance the cursor vertically
horizontalRule()                 // Draw a horizontal rule
newline()                        // Advance to the next line
columns(columnCount, gapPt, text)  // Lay text out in N balanced columns
footnote(refMark, noteText)      // Add a footnote marker + bottom-of-page note

Inline runs

inline(text)                     // Append an inline text run
inlineBold(text)                 // Append a bold inline run
inlineItalic(text)               // Append an italic inline run
inlineColor(r, g, b, text)       // Append a colored inline run (RGB 0.0–1.0)
linkUrl(url)                     // Wrap the last element in a URL link
linkPage(page)                   // Link to another page index
linkNamed(destination)           // Link to a named destination
linkJavascript(script)           // Attach a JavaScript link action
onOpen(script)                   // Page open action
onClose(script)                  // Page close action
fieldKeystroke(script)           // Keystroke JavaScript for the last field
fieldFormat(script)              // Format JavaScript for the last field
fieldValidate(script)            // Validate JavaScript for the last field
fieldCalculate(script)           // Calculate JavaScript for the last field

Markup annotations

highlight(r, g, b)               // Highlight the last text run (RGB 0.0–1.0)
underline(r, g, b)               // Underline the last text run
strikeout(r, g, b)               // Strike out the last text run
squiggly(r, g, b)                // Squiggly-underline the last text run
stickyNote(text)                 // Add a sticky note at the cursor
stickyNoteAt(x, y, text)         // Add a sticky note at an absolute position
stamp(name)                      // Add a rubber-stamp annotation (e.g. "Approved")
freeText(x, y, w, h, text)       // Add a free-text annotation box
watermark(text)                  // Add a text watermark
watermarkConfidential()          // Add a "CONFIDENTIAL" watermark
watermarkDraft()                 // Add a "DRAFT" watermark

AcroForm widgets

textField(name, x, y, w, h, defaultValue?)            // Add a text field
checkbox(name, x, y, w, h, checked)                   // Add a checkbox
comboBox(name, x, y, w, h, options, selected?)        // Add a dropdown combo box
radioGroup(name, values, xs, ys, ws, hs, selected?)   // Add a radio-button group (parallel arrays)
pushButton(name, x, y, w, h, caption)                 // Add a clickable push button
signatureField(name, x, y, w, h)                      // Add an unsigned signature placeholder
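
radioGroup takes parallel arrays, which are awkward to maintain by hand. The hypothetical helper below derives them from a list of option objects. Whether the library expects plain arrays or typed arrays for the coordinates is not stated here, so plain arrays are an assumption.

```javascript
// Turn [{ value, x, y, w, h }, ...] into the parallel arrays radioGroup expects.
// Assumption: plain JavaScript arrays are accepted for xs, ys, ws and hs.
function toRadioArrays(options) {
  return {
    values: options.map((o) => o.value),
    xs: options.map((o) => o.x),
    ys: options.map((o) => o.y),
    ws: options.map((o) => o.w),
    hs: options.map((o) => o.h),
  };
}

// Hypothetical usage:
//   const { values, xs, ys, ws, hs } = toRadioArrays(planOptions);
//   page.radioGroup("plan", values, xs, ys, ws, hs, "basic");
```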

Barcodes & images

barcode1d(barcodeType, data, x, y, w, h)   // Draw a 1D barcode (type 0–7)
barcodeQr(data, x, y, size)                // Draw a QR code
imageWithAlt(bytes, x, y, w, h, altText)   // Embed an image with accessibility alt text
imageArtifact(bytes, x, y, w, h)           // Embed a decorative image as an /Artifact

Graphics primitives

rect(x, y, w, h)                                  // Stroked 1pt rectangle outline
filledRect(x, y, w, h, r, g, b)                   // Filled rectangle (RGB 0.0–1.0)
line(x1, y1, x2, y2)                              // 1pt black line
strokeRect(x, y, w, h, width, r, g, b)            // Stroked rectangle, explicit width + color
strokeRectDashed(x, y, w, h, width, r, g, b, dash, phase)  // Dashed rectangle border
strokeLine(x1, y1, x2, y2, width, r, g, b)        // Line with explicit width + color
strokeLineDashed(x1, y1, x2, y2, width, r, g, b, dash, phase)  // Dashed line
textInRect(x, y, w, h, text, align)               // Lay text inside a rectangle (align 0/1/2)

Layout helpers & terminal

measure(text) -> number                  // Rendered width of text in the current font (points)
remainingSpace() -> number               // Vertical space left on the page (points)
newPageSameSize()                        // Start a new page with the same dimensions
table(spec)                              // Draw a buffered table from a spec object
streamingTable(spec) -> WasmStreamingTable  // Open a streaming table for large datasets
done(builder)                            // Commit this page's queued ops to the document builder

A table(spec) spec object uses { columns: [{ header, width, align }], rows: [[...]], hasHeader }. A streamingTable(spec) spec adds { repeatHeader, mode, sampleRows, minColWidthPt, maxColWidthPt, maxRowspan, batchSize }.
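
A table(spec) object can be derived from an array of records. The sketch below is a hypothetical helper built on the spec shape described above. The extra key property, the string conversion of cells, and the align values passed through unchanged are assumptions, since the reference does not define the align encoding for tables.

```javascript
// Build a table(spec) object from records and column definitions.
// columns: [{ key, header, width, align }], where key is a helper-only property.
function tableSpecFromRecords(records, columns) {
  return {
    columns: columns.map(({ header, width, align }) => ({ header, width, align })),
    // Assumption: cells are passed as strings; missing values become "".
    rows: records.map((rec) => columns.map((col) => String(rec[col.key] ?? ""))),
    hasHeader: true,
  };
}

// Hypothetical usage (align values are illustrative):
//   page.table(tableSpecFromRecords(
//     [{ sku: "A-1", qty: 3 }],
//     [{ key: "sku", header: "SKU", width: 120, align: 0 },
//      { key: "qty", header: "Qty", width: 60, align: 2 }],
//   ));
```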


WasmStreamingTable

Row-streaming table handle returned by WasmFluentPageBuilder.streamingTable(spec). Push rows incrementally, then finish().

columnCount() -> number       // Number of columns
pendingRowCount() -> number   // Rows in the current un-flushed batch
batchCount() -> number        // Number of completed batches
pushRow(cells)                // Push one row (array of cell strings)
pushRowSpan(cells)            // Push a row whose cells may carry rowspans
flush()                       // Flush the current batch
finish()                      // Finalize the table and replay it into the page
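
Pushing rows from an iterable and then finalizing could be wrapped as follows. This is a sketch with a hypothetical helper name that uses only pushRow() and finish() from the list above and converts every cell to a string, as pushRow is described as taking cell strings.

```javascript
// Feed rows from any iterable (array, generator) into a streaming table.
// Returns the number of rows pushed before the table was finalized.
function streamRows(table, rows) {
  let pushed = 0;
  for (const row of rows) {
    table.pushRow(row.map(String));
    pushed += 1;
  }
  table.finish();
  return pushed;
}

// Hypothetical usage:
//   const table = page.streamingTable(spec);
//   streamRows(table, databaseCursor);
```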

WasmEmbeddedFont

A font registered for embedding via WasmDocumentBuilder.registerEmbeddedFont.

WasmEmbeddedFont.fromBytes(data, name?) -> WasmEmbeddedFont  // Load a TTF/OTF font from bytes
font.name -> string                                          // The font's resolved name (getter)

Page Templates

Reusable header/footer furniture applied across pages.

WasmArtifactStyle

new WasmArtifactStyle()        // Default style
font(name, size) -> this       // Set font family and size
bold() -> this                 // Make the text bold
color(r, g, b) -> this         // Set the text color (RGB 0.0–1.0)

WasmArtifact

new WasmArtifact()                       // Empty artifact
WasmArtifact.left(text) -> WasmArtifact   // Left-aligned artifact text
WasmArtifact.center(text) -> WasmArtifact // Center-aligned artifact text
WasmArtifact.right(text) -> WasmArtifact  // Right-aligned artifact text
withStyle(style) -> this                  // Apply a WasmArtifactStyle
withOffset(offset) -> this                // Set the vertical offset from the edge

WasmHeader / WasmFooter

new WasmHeader()                  // Empty header (WasmFooter is identical)
WasmHeader.left(text) -> WasmHeader     // Left-aligned header text
WasmHeader.center(text) -> WasmHeader   // Center-aligned header text
WasmHeader.right(text) -> WasmHeader    // Right-aligned header text

WasmPageTemplate

new WasmPageTemplate()         // Empty template
header(header) -> this         // Set the page header artifact
footer(footer) -> this         // Set the page footer artifact
skipFirstPage() -> this        // Omit header/footer on the first page

Digital Signatures

Requires the signatures feature.

WasmCertificate

WasmCertificate.load(data) -> WasmCertificate                  // Load a DER certificate + key bundle
WasmCertificate.loadPem(certPem, keyPem) -> WasmCertificate    // Load from PEM cert + key strings
WasmCertificate.loadPkcs12(data, password) -> WasmCertificate  // Load from a PKCS#12 (.p12/.pfx) blob
cert.subject -> string         // Subject distinguished name (getter)
cert.issuer -> string          // Issuer distinguished name (getter)
cert.serial -> string          // Serial number (getter)
cert.validity -> bigint[]      // [notBefore, notAfter] as unix seconds (getter)
cert.isValid -> boolean        // Whether the certificate is currently valid (getter)
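
Because validity is reported as bigint unix seconds, it needs converting before it can be compared with Date values. The helpers below are a hypothetical sketch that assumes the [notBefore, notAfter] order documented above.

```javascript
// Convert [notBefore, notAfter] (bigint unix seconds) into Date objects.
function validityWindow([notBefore, notAfter]) {
  return {
    notBefore: new Date(Number(notBefore) * 1000),
    notAfter: new Date(Number(notAfter) * 1000),
  };
}

// Check whether a given moment falls inside the validity window (inclusive).
function isWithinValidity(validity, now = new Date()) {
  const { notBefore, notAfter } = validityWindow(validity);
  return now >= notBefore && now <= notAfter;
}

console.log(validityWindow([0n, 86400n]).notAfter.toISOString()); // "1970-01-02T00:00:00.000Z"
```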

WasmSignature

Returned by WasmPdfDocument.signatures().

sig.signerName -> string | null          // Signer common name (getter)
sig.reason -> string | null              // Signing reason (getter)
sig.location -> string | null            // Signing location (getter)
sig.contactInfo -> string | null         // Signer contact info (getter)
sig.signingTime -> bigint | null         // Signing time as unix seconds (getter)
sig.coversWholeDocument -> boolean       // Whether the signature covers the entire file (getter)
sig.padesLevel -> PadesLevel             // PAdES baseline level of the signature (getter)
sig.verify() -> boolean                  // Verify the signature cryptographically
sig.verifyDetached(pdfData) -> boolean   // Verify including a messageDigest check against the bytes
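
The getters above can be gathered into plain records for logging or display. The following is a minimal sketch, assuming only the documented getters and verify(); summarizeSignatures is a hypothetical helper, and treating a thrown error as a failed verification is an assumption rather than documented behaviour.

```javascript
// Hypothetical helper: summarise objects that expose the documented
// WasmSignature getters and verify().
function summarizeSignatures(signatures) {
  return signatures.map((sig) => {
    let valid = false;
    let error = null;
    try {
      valid = sig.verify();
    } catch (e) {
      // Assumption: a thrown error counts as "not verified".
      error = e instanceof Error ? e.message : String(e);
    }
    return {
      signer: sig.signerName ?? "(unknown)",
      // signingTime is bigint unix seconds or null.
      signedAt: sig.signingTime == null
        ? null
        : new Date(Number(sig.signingTime) * 1000).toISOString(),
      coversWholeDocument: sig.coversWholeDocument,
      valid,
      error,
    };
  });
}
```

In practice the input would be the array returned by WasmPdfDocument.signatures().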

WasmTimestamp

WasmTimestamp.parse(data) -> WasmTimestamp  // Parse a DER TimeStampToken / TSTInfo
ts.time -> bigint              // Timestamp time as unix seconds (getter)
ts.serial -> string            // Serial number (getter)
ts.policyOid -> string         // TSA policy OID (getter)
ts.tsaName -> string           // TSA name (getter)
ts.hashAlgorithm -> number     // Imprint hash algorithm id (getter)
ts.messageImprint -> Uint8Array  // The message imprint digest (getter)
ts.verify() -> boolean         // Verify the timestamp token
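
Since messageImprint is a raw Uint8Array, comparing it with a digest computed elsewhere takes a byte-wise check, and logging it takes hex encoding. The sketch below shows both. toHex, bytesEqual, and imprintMatches are hypothetical helpers; how the expected digest is computed depends on ts.hashAlgorithm and is outside this example.

```javascript
// Hypothetical helper: lowercase hex encoding of a byte array.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

// Hypothetical helper: byte-wise equality of two array-likes.
function bytesEqual(a, b) {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Compare a timestamp's documented messageImprint getter with a digest
// that the caller computed with the matching hash algorithm.
function imprintMatches(ts, expectedDigest) {
  return bytesEqual(ts.messageImprint, expectedDigest);
}
```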

WasmRevocationMaterial

Offline PAdES-B-LT validation material for signPdfBytesPades.

new WasmRevocationMaterial()   // Empty material set
addCert(der)                   // Add a DER X.509 certificate
addCrl(der)                    // Add a DER CRL
addOcsp(der)                   // Add a DER OCSP response
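
When the certificates, CRLs, and OCSP responses arrive as separate lists, a small loop can feed them into the material set. This is a sketch that relies only on the three documented add methods; fillRevocationMaterial and the shape of its parts argument are hypothetical.

```javascript
// Hypothetical helper: add every DER blob in `parts` to `material`.
// `material` only needs addCert / addCrl / addOcsp as listed above.
function fillRevocationMaterial(material, { certs = [], crls = [], ocsps = [] } = {}) {
  for (const der of certs) material.addCert(der);
  for (const der of crls) material.addCrl(der);
  for (const der of ocsps) material.addOcsp(der);
  // Return the number of items added, which is handy for logging.
  return certs.length + crls.length + ocsps.length;
}
```

A typical call would pass a fresh new WasmRevocationMaterial() as the first argument before handing it to signPdfBytesPades.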

Dss

A parsed Document Security Store returned by WasmPdfDocument.dss().

dss.certCount -> number        // Number of DER certificates (getter)
getCert(i) -> Uint8Array | undefined   // i-th DER certificate
dss.crlCount -> number         // Number of DER CRLs (getter)
getCrl(i) -> Uint8Array | undefined    // i-th DER CRL
dss.ocspCount -> number        // Number of DER OCSP responses (getter)
getOcsp(i) -> Uint8Array | undefined   // i-th DER OCSP response
dss.vri -> string[]            // Per-signature VRI keys (uppercase-hex SHA-1 of /Contents) (getter)
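
The count getters and indexed accessors pair up naturally into loops. The sketch below copies a DSS into plain arrays, assuming only the members listed above; collect and dssToArrays are hypothetical helper names.

```javascript
// Hypothetical helper: read every entry from a count + indexed getter,
// skipping any index for which the getter returns undefined.
function collect(count, getAt) {
  const items = [];
  for (let i = 0; i < count; i++) {
    const item = getAt(i);
    if (item !== undefined) items.push(item);
  }
  return items;
}

// Hypothetical helper: copy a Dss-shaped object into plain arrays.
function dssToArrays(dss) {
  return {
    certs: collect(dss.certCount, (i) => dss.getCert(i)),
    crls: collect(dss.crlCount, (i) => dss.getCrl(i)),
    ocsps: collect(dss.ocspCount, (i) => dss.getOcsp(i)),
    vri: Array.from(dss.vri),
  };
}
```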

OCR

OCR runs entirely in WASM, using the pure-Rust tract backend that ships in the separate wasm-ocr build. Models are supplied by the host: fetch the detector and recognizer ONNX files and the character dictionary (see modelManifest()), then pass the model bytes and the dictionary text to the constructor.

WasmOcrEngine

new WasmOcrEngine(detModel, recModel, dict, config?)  // Build from host-supplied model bytes
engine.ocrImage(imageBytes) -> string                 // OCR a raw image (PNG/JPEG/TIFF); returns JSON {text, confidence, spans}

| Parameter | Type | Description |
| --- | --- | --- |
| detModel | Uint8Array | DBNet detector ONNX bytes |
| recModel | Uint8Array | SVTR recognizer ONNX bytes |
| dict | string | Recognizer character dictionary, one char per line |
| config | WasmOcrConfig \| undefined | Reserved (tuned defaults are used) |
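
ocrImage returns its result as a JSON string, so callers need to parse it before use. The following sketch shows one defensive way to do that. parseOcrResult and its minConfidence threshold are hypothetical, the 0 to 1 confidence range is an assumption, and the layout of individual spans is not documented here, so they are passed through untouched.

```javascript
// Hypothetical helper: parse the JSON string returned by ocrImage().
// Assumes the documented top-level keys: text, confidence, spans.
function parseOcrResult(json, minConfidence = 0.5) {
  const result = JSON.parse(json);
  if (typeof result.text !== "string" || typeof result.confidence !== "number") {
    throw new Error("Unexpected OCR result shape");
  }
  return {
    text: result.text,
    confidence: result.confidence,
    // Span layout is not specified in this reference; keep it as-is.
    spans: Array.isArray(result.spans) ? result.spans : [],
    // Assumption: confidence is a score where higher means more reliable.
    reliable: result.confidence >= minConfidence,
  };
}
```

With a real engine the call would look like parseOcrResult(engine.ocrImage(imageBytes)).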

WasmOcrConfig

new WasmOcrConfig()   // OCR configuration object (reserved for future tuning)

Enums

Align

Text/cell alignment discriminant used by textInRect and table column specs.

Align.Left   // 0
Align.Center // 1
Align.Right  // 2

PadesLevel

PAdES baseline level, used by signPdfBytesPades and WasmSignature.padesLevel.

PadesLevel.BB    // 0 — signed attrs incl. ESS signing-certificate-v2
PadesLevel.BT    // 1 — B-B + RFC 3161 signature-time-stamp
PadesLevel.BLt   // 2 — B-T + Document Security Store (DSS/VRI)
PadesLevel.BLta  // 3 — B-LT + document-scoped /DocTimeStamp
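
Each level in the listing builds on the one before it, so the numeric values can be compared directly. The sketch below is a hypothetical pair of helpers (padesLabel and meetsPadesLevel are not part of the package) that uses the numeric values shown above; the label strings are the conventional PAdES baseline names.

```javascript
// Numeric values copied from the PadesLevel listing: 0..3.
const PADES_LABELS = ["B-B", "B-T", "B-LT", "B-LTA"];

// Hypothetical helper: human-readable label for a PadesLevel value.
function padesLabel(level) {
  const label = PADES_LABELS[level];
  if (label === undefined) {
    throw new RangeError(`Unknown PAdES level: ${level}`);
  }
  return label;
}

// Hypothetical helper: levels are cumulative, so a signature at a higher
// level also satisfies every lower one.
function meetsPadesLevel(actual, required) {
  return actual >= required;
}
```

For example, meetsPadesLevel(sig.padesLevel, PadesLevel.BT) would check that a signature carries at least a signature timestamp.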

Feature Availability

Some features are gated behind Rust build features. The default pdf-oxide-wasm package enables the common set; OCR ships in the separate wasm-ocr build.

| Feature | WASM | Notes |
| --- | --- | --- |
| Text extraction | Yes | Full support |
| Structured extraction | Yes | Chars, spans, words, lines, tables |
| PDF creation | Yes | Markdown, HTML, text, images, DocumentBuilder |
| PDF editing | Yes | Metadata, rotation, dimensions, erase, pages |
| Form fields | Yes | Read, write, export, flatten, build |
| Search | Yes | Full regex support |
| Encryption | Yes | AES-256 read and write |
| Annotations | Yes | Read, flatten, redact, sanitize |
| Merge / split PDFs | Yes | Merge pages and split by bookmarks |
| Embedded files | Yes | Attach files to PDFs |
| Page labels / XMP | Yes | Read page labels and XMP metadata |
| Office round-trip | Yes | DOCX/PPTX/XLSX import and export |
| Validation | Yes | PDF/A, PDF/UA, PDF/X |
| Barcodes | Yes (barcodes) | 1D + QR as SVG or page images |
| Rendering | Yes (rendering) | Page → PNG, flatten to images |
| Digital signatures | Yes (signatures) | Sign, PAdES B-LT, verify, timestamps |
| OCR | wasm-ocr build | In-WASM tract OCR; models fetched host-side |

Error Handling

All methods that can fail throw JavaScript Error objects:

try {
  const doc = new WasmPdfDocument(new Uint8Array([0, 1, 2]));
} catch (e) {
  console.error(`Failed to open: ${e.message}`);
}
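
When many calls need the same handling, the try/catch can be folded into a small wrapper that returns a result object instead of throwing. This is a sketch of that pattern, not something the package provides; attempt is a hypothetical helper.

```javascript
// Hypothetical helper: run `fn` and convert a thrown error into a value.
function attempt(fn) {
  try {
    return { ok: true, value: fn() };
  } catch (e) {
    // Non-Error throws are stringified so `error` is always a string.
    const message = e instanceof Error ? e.message : String(e);
    return { ok: false, error: message };
  }
}
```

Usage would look like attempt(() => new WasmPdfDocument(bytes)), followed by a check of the ok flag before touching value.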

TypeScript

Full type definitions are included in the package:

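import { WasmPdfDocument, WasmPdf } from "pdf-oxide-wasm";

// `bytes` is the PDF file content, e.g. from fetch() or fs.readFile().
function readFirstPage(bytes: Uint8Array): string {
  const doc: WasmPdfDocument = new WasmPdfDocument(bytes);
  return doc.extractText(0);
}

const pdf: WasmPdf = WasmPdf.fromMarkdown("# Hello");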

Other Language Bindings

Beyond WebAssembly, PDF Oxide ships native bindings for Rust, Python, Node.js, C#, Go, Java, PHP, Ruby, C++, Swift, Kotlin, Dart, R, Julia, Zig, Scala, Clojure, Objective-C, and Elixir.

Next Steps