
Julia API Reference

PDF Oxide provides idiomatic Julia bindings (PdfOxide.jl) layered directly over the pdf_oxide C ABI via ccall — no shim crate. Native handles are wrapped in mutable structs with finalizers, C strings and buffers are copied into Julia and freed automatically, and any non-success C-ABI status throws a PdfOxideError. Page indices are 0-based.

using Pkg
Pkg.add("PdfOxide")

The native libpdf_oxide cdylib is loaded at runtime and resolved in this order: PDF_OXIDE_LIB_PATH → PDF_OXIDE_LIB_DIR → ../target/release → target/release → the system loader. Build it with the binding feature set:

cargo build --release --lib --features ocr,rendering,signatures,barcodes,tsa-client,system-fonts

using PdfOxide

pdf = from_markdown("# Hello\n\nbody\n")
doc = open_from_bytes(to_bytes(pdf))
page_count(doc)
extract_text(doc, 0)        # 0-based page index
to_markdown_all(doc)
close!(doc)

For the Rust API, see the Rust API Reference. For the Python API, see the Python API Reference. For type details, see Types & Enums.

Most handle types (PdfDocument, Pdf, DocumentEditor, DocumentBuilder, RenderedImage, Certificate, SignatureInfo, Timestamp, TsaClient, Dss, validation results, Barcode, OcrEngine, Renderer, ElementList, EmbeddedFont, PageBuilder) own native memory. Free them eagerly with close!(x) — it is idempotent and also runs at finalization.
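
Eager release pairs naturally with a try/finally wrapper. The sketch below is one possible pattern, not part of PdfOxide.jl: with_handle, FakeHandle, and release! are hypothetical names, and the stand-in handle type exists only so the example runs without the native library.

```julia
# Hypothetical helper (not part of PdfOxide.jl): run `f(handle)` and always
# release the handle afterwards, even if `f` throws.
function with_handle(f, handle; closer)
    try
        return f(handle)
    finally
        closer(handle)
    end
end

# Stand-in handle type so the sketch runs without the native library.
mutable struct FakeHandle
    open::Bool
end
release!(h::FakeHandle) = (h.open = false; nothing)

h = FakeHandle(true)
result = with_handle(h; closer = release!) do handle
    handle.open ? "still open inside the block" : "closed"
end
```

With PdfOxide loaded, the same shape would presumably be with_handle(open_document(path); closer = close!) do doc ... end, where path is whatever file is being read.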


PdfDocument

The read-side handle for opening, extracting, searching, rendering, validating, and inspecting an existing PDF.

Opening

open_document(path::AbstractString; password::Union{Nothing,AbstractString} = nothing) -> PdfDocument
open_from_bytes(data::AbstractVector{UInt8}) -> PdfDocument
open_with_password(path::AbstractString, password::AbstractString) -> PdfDocument
Function Description
open_document(path; password=nothing) Open a PDF from a filesystem path (optionally password-protected).
open_from_bytes(data) Open a PDF from an in-memory byte vector.
open_with_password(path, password) Open an encrypted PDF on disk with a password.

Document Info

Function Returns Description
page_count(d) Int Number of pages.
version(d) PdfVersion PDF version (major/minor).
is_encrypted(d) Bool Whether the document is encrypted.
has_structure_tree(d) Bool Whether the document is a Tagged PDF with a structure tree.
has_xfa(d) Bool Whether the document carries an XFA form.
authenticate(d, password) Bool Authenticate against an encrypted document’s password (a wrong password returns false, not an error).

Whole-Document Conversion

Function Returns Description
to_markdown_all(d) String Markdown for the whole document.
to_html_all(d) String HTML for the whole document.
to_plain_text_all(d) String Plain text for the whole document.
extract_all_text(d) String Whole-document auto text extraction.
classify_document(d) String Classify the whole document; returns the classifier’s JSON string.

Per-Page Text Extraction

extract_text(d::PdfDocument, page::Integer) -> String
to_plain_text(d::PdfDocument, page::Integer) -> String
to_markdown(d::PdfDocument, page::Integer) -> String
to_html(d::PdfDocument, page::Integer) -> String
extract_structured_json(d::PdfDocument, page::Integer) -> String
Function Description
extract_text(d, page) Extract plain text from a 0-based page.
to_plain_text(d, page) Render a page to layout-aware plain text.
to_markdown(d, page) Render a page to Markdown.
to_html(d, page) Render a page to HTML.
extract_structured_json(d, page) Extract a structured-document JSON model for a page.
extract_text_auto(d, page) Auto-pick the best text extraction for a page.
extract_page_auto(d, page, options="{}") Auto page extraction with a JSON options string.
classify_page(d, page) Classify a page; returns the classifier’s JSON string.

Structured Element Extraction

extract_chars(d::PdfDocument, page::Integer)      -> Vector{Char}
extract_words(d::PdfDocument, page::Integer)      -> Vector{Word}
extract_text_lines(d::PdfDocument, page::Integer) -> Vector{TextLine}
extract_tables(d::PdfDocument, page::Integer)     -> Vector{Table}
embedded_fonts(d::PdfDocument, page::Integer)     -> Vector{Font}
embedded_images(d::PdfDocument, page::Integer)    -> Vector{Image}
page_annotations(d::PdfDocument, page::Integer)   -> Vector{Annotation}
extract_paths(d::PdfDocument, page::Integer)      -> Vector{Path}
Function Description
extract_chars(d, page) Extract glyphs from a page as a Vector{Char}.
extract_words(d, page) Extract words from a page as a Vector{Word}.
extract_text_lines(d, page) Extract text lines as a Vector{TextLine}.
extract_tables(d, page) Extract tables as a Vector{Table}.
embedded_fonts(d, page) Embedded fonts as a Vector{Font}.
embedded_images(d, page) Embedded images as a Vector{Image}.
page_annotations(d, page) Annotations as a Vector{Annotation}.
extract_paths(d, page) Vector paths as a Vector{Path}.

Region-Constrained Extraction

extract_text_in_rect(d::PdfDocument, page::Integer, x, y, w, h)   -> String
extract_words_in_rect(d::PdfDocument, page::Integer, x, y, w, h)  -> Vector{Word}
extract_lines_in_rect(d::PdfDocument, page::Integer, x, y, w, h)  -> Vector{TextLine}
extract_tables_in_rect(d::PdfDocument, page::Integer, x, y, w, h) -> Vector{Table}
extract_images_in_rect(d::PdfDocument, page::Integer, x, y, w, h) -> Vector{Image}

Each restricts extraction to the rectangle (x, y, w, h) in PDF user-space points on a 0-based page.
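
Building the rectangle from page dimensions can be sketched as follows. top_band is a hypothetical helper, and it assumes (x, y) names the rectangle's lower-left corner with the user-space origin at the page's lower-left; the reference above does not state which corner (x, y) refers to, so check that before relying on it.

```julia
# Hypothetical helper: rectangle covering the top `fraction` of a page.
# Assumes (x, y) is the rectangle's lower-left corner in PDF user space
# (origin at the page's lower-left); this is an assumption, not documented.
function top_band(page_width::Real, page_height::Real, fraction::Real)
    0 < fraction <= 1 || throw(ArgumentError("fraction must be in (0, 1]"))
    h = page_height * fraction
    return (x = 0.0, y = Float64(page_height - h), w = Float64(page_width), h = Float64(h))
end

band = top_band(612, 792, 0.25)   # 612 x 792 pt page, top quarter
```

The fields could then be passed on as extract_text_in_rect(doc, 0, band.x, band.y, band.w, band.h), with page_get_width and page_get_height supplying the page dimensions.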

Search

search(d::PdfDocument, page::Integer, term::AbstractString, caseSensitive::Bool) -> Vector{SearchResult}
search_all(d::PdfDocument, term::AbstractString, caseSensitive::Bool)             -> Vector{SearchResult}
search_results_to_json(d::PdfDocument, term::AbstractString, caseSensitive::Bool) -> String
Function Description
search(d, page, term, caseSensitive) Search a single page for term.
search_all(d, term, caseSensitive) Search the whole document for term.
search_results_to_json(d, term, caseSensitive) Serialize whole-document search results to JSON.

Page Geometry

Function Returns Description
page_get_width(d, page) Float64 Page width (PDF points).
page_get_height(d, page) Float64 Page height (PDF points).
page_get_rotation(d, page) Int Absolute rotation (degrees) of a page.
page(d, index) PdfPage A 0-based page view over the document (see PdfPage).

Document Structure & Metadata

Function Returns Description
get_outline(d) String Bookmarks / table of contents (JSON).
get_page_labels(d) String Page-label ranges (JSON).
get_xmp_metadata(d) String XMP metadata (JSON).
get_source_bytes(d) Vector{UInt8} The document’s original source bytes.
plan_split_by_bookmarks(d, options="{}") String Plan a split-by-bookmarks operation (JSON).

Annotation Inspection

annotations_to_json(d::PdfDocument, page::Integer) -> String
annotation_get_color(d::PdfDocument, page::Integer, index::Integer) -> UInt32
annotation_creation_date(d::PdfDocument, page::Integer, index::Integer) -> String
annotation_modification_date(d::PdfDocument, page::Integer, index::Integer) -> String
Function Returns Description
annotations_to_json(d, page) String All annotations on a page as JSON.
annotation_get_color(d, page, index) UInt32 32-bit packed RGBA color of an annotation.
annotation_creation_date(d, page, index) String Annotation creation date.
annotation_modification_date(d, page, index) String Annotation modification date.
annotation_is_hidden(d, page, index) Bool Whether the annotation is hidden.
annotation_is_marked_deleted(d, page, index) Bool Whether the annotation is marked deleted.
annotation_is_printable(d, page, index) Bool Whether the annotation is printable.
annotation_is_read_only(d, page, index) Bool Whether the annotation is read-only.
link_annotation_uri(d, page, index) String URI of a Link annotation.
text_annotation_icon_name(d, page, index) String Icon name of a Text (sticky-note) annotation.
highlight_quad_points_count(d, page, index) Int Quad-point count of a Highlight annotation.
highlight_quad_point(d, page, index, quad_index) NTuple{8,Float64} The quad_index-th quad (8 floats).
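
A quad returned by highlight_quad_point can be reduced to an (x, y, w, h) box for use with the region functions. This is a minimal sketch: quad_bounds is a hypothetical helper, and it assumes the 8 floats are four (x, y) corner pairs, as in the PDF /QuadPoints layout.

```julia
# Hypothetical helper: axis-aligned bounding box of one highlight quad.
# Assumes the 8 floats are four (x, y) corner pairs; corner order does not
# matter when only the bounding box is needed.
function quad_bounds(quad::NTuple{8,Float64})
    xs = quad[1:2:7]   # x coordinates of the four corners
    ys = quad[2:2:8]   # y coordinates of the four corners
    return (x = minimum(xs), y = minimum(ys),
            w = maximum(xs) - minimum(xs), h = maximum(ys) - minimum(ys))
end

quad = (72.0, 700.0, 200.0, 700.0, 72.0, 712.0, 200.0, 712.0)
box = quad_bounds(quad)
```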

Font & Element JSON Helpers

Function Returns Description
fonts_to_json(d, page) String Embedded fonts on a page as JSON.
font_size(d, page, index) Float64 Font size of the index-th embedded font.
page_get_elements(d, page) ElementList Page layout-region elements as an ElementList.

Form Fields

get_form_fields(d::PdfDocument) -> Vector{FormField}
form_field_count(d::PdfDocument) -> Int
export_form_data_to_bytes(d::PdfDocument, format_type::Integer) -> Vector{UInt8}
import_form_data(d::PdfDocument, path::AbstractString) -> Int
form_import_from_file(d::PdfDocument, filename::AbstractString) -> Bool
Function Description
get_form_fields(d) All AcroForm fields as a Vector{FormField}.
form_field_count(d) Convenience field count.
export_form_data_to_bytes(d, format_type) Export form data (format_type selects FDF/XFDF).
import_form_data(d, path) Import form data from a file; returns the C status code.
form_import_from_file(d, filename) Import form data; returns true on success.

OCR

page_needs_ocr(d::PdfDocument, page::Integer) -> Bool
ocr_extract_text(d::PdfDocument, page::Integer, engine::Union{Nothing,OcrEngine} = nothing) -> String
Function Description
page_needs_ocr(d, page) Whether a page is scanned/hybrid and needs OCR.
ocr_extract_text(d, page, engine=nothing) Extract text via OCR (nothing falls back to native extraction only). See OcrEngine.

Signatures (document-level)

sign(d::PdfDocument, cert::Certificate; reason::AbstractString = "", location::AbstractString = "") -> Int
get_signature_count(d::PdfDocument) -> Int
get_signature(d::PdfDocument, index::Integer) -> SignatureInfo
verify_all_signatures(d::PdfDocument) -> Int
has_timestamp(d::PdfDocument) -> Int
document_get_dss(d::PdfDocument) -> Union{Dss,Nothing}
Function Description
sign(d, cert; reason, location) Sign the document with cert; returns a status code.
get_signature_count(d) Number of signatures in the document.
get_signature(d, index) The index-th signature as a SignatureInfo.
verify_all_signatures(d) Verify all signatures; returns a status code.
has_timestamp(d) Whether the document carries a document-level timestamp.
document_get_dss(d) Read the document /DSS into a Dss, or nothing if none.

Validation & Conversion

validate_pdf_a(d::PdfDocument, level::Integer) -> PdfAResults
validate_pdf_ua(d::PdfDocument, level::Integer) -> UaResults
validate_pdf_x(d::PdfDocument, level::Integer) -> PdfXResults
document_convert_to_pdf_a(d::PdfDocument, level::Integer) -> Bool

validatePdfA / validatePdfUa / validatePdfX are camelCase aliases of the three validators. See Validation Results.

Office Conversion

to_docx(d::PdfDocument) -> Vector{UInt8}
to_pptx(d::PdfDocument) -> Vector{UInt8}
to_xlsx(d::PdfDocument) -> Vector{UInt8}

Convert the PDF back to a DOCX / PPTX / XLSX byte buffer. Opening Office files as PDFs is done with open_from_docx_bytes / open_from_pptx_bytes / open_from_xlsx_bytes (see Office Input).

Lifecycle

Function Description
close!(d) Free the native handle now (idempotent; also runs at finalization).

PdfPage

A lightweight 0-based page view returned by page(doc, index). Methods delegate to the parent document.

p = page(doc, 0)
text(p)                       # -> String
markdown(p)                   # -> String
extract_words(p)              # -> Vector{Word}
search(p, "term", false)      # -> Vector{SearchResult}
render_page(p)                # -> RenderedImage
Method Returns Description
text(p) String Plain text for the page.
markdown(p) String Markdown for the page.
html(p) String HTML for the page.
plain_text(p) String Layout-aware plain text.
extract_chars(p) Vector{Char} Glyphs.
extract_words(p) Vector{Word} Words.
extract_text_lines(p) Vector{TextLine} Text lines.
extract_tables(p) Vector{Table} Tables.
embedded_fonts(p) Vector{Font} Embedded fonts.
embedded_images(p) Vector{Image} Embedded images.
page_annotations(p) Vector{Annotation} Annotations.
extract_paths(p) Vector{Path} Vector paths.
search(p, term, caseSensitive) Vector{SearchResult} Search the page.
render_page(p, format=0) RenderedImage Render the page (0=PNG).
render_page_zoom(p, zoom, format=0) RenderedImage Render at a zoom factor.
render_page_thumbnail(p, size, format=0) RenderedImage Render a thumbnail fitting size px.

Rendering

Render PdfDocument pages to images. format: 0=PNG (default), 1=JPEG. Coordinates are in PDF user-space points.

render_page(d::PdfDocument, pageIndex::Integer, format::Integer = 0) -> RenderedImage
render_page_zoom(d::PdfDocument, pageIndex::Integer, zoom::Real, format::Integer = 0) -> RenderedImage
render_page_thumbnail(d::PdfDocument, pageIndex::Integer, size::Integer, format::Integer = 0) -> RenderedImage
render_page_region(d::PdfDocument, page::Integer, crop_x, crop_y, crop_width, crop_height, format::Integer = 0) -> RenderedImage
render_page_fit(d::PdfDocument, page::Integer, w::Integer, h::Integer, format::Integer = 0) -> RenderedImage
render_page_raw(d::PdfDocument, page::Integer, dpi::Integer) -> Tuple{RenderedImage,Int,Int}
render_page_with_options(d::PdfDocument, page::Integer, dpi, format, bg_r, bg_g, bg_b, bg_a, transparent_background, render_annotations, jpeg_quality) -> RenderedImage
render_page_with_options_ex(d::PdfDocument, page::Integer, dpi, format, bg_r, bg_g, bg_b, bg_a, transparent_background, render_annotations, jpeg_quality, excluded_layers::AbstractVector{<:AbstractString} = String[]) -> RenderedImage
Function Description
render_page(d, page, format=0) Render a page (0=PNG).
render_page_zoom(d, page, zoom, format=0) Render at a zoom factor.
render_page_thumbnail(d, page, size, format=0) Render a thumbnail fitting size px.
render_page_region(d, page, x, y, w, h, format=0) Render a rectangular crop (user-space points).
render_page_fit(d, page, w, h, format=0) Render to fit inside w×h px, preserving aspect ratio.
render_page_raw(d, page, dpi) Render to a raw premultiplied RGBA8888 buffer; returns (image, width, height).
render_page_with_options(d, page, dpi, format, bg_r, bg_g, bg_b, bg_a, transparent_background, render_annotations, jpeg_quality) Render with the full RenderOptions surface (background channels 0–1; flags are 0/1).
render_page_with_options_ex(..., excluded_layers=String[]) As above, plus optional-content (OCG) layer filtering: layers whose /Name appears in excluded_layers are suppressed.
estimate_render_time(d, page) Estimate render time (ms) for a page.

renderPage / renderPageZoom / renderPageThumbnail are camelCase aliases of render_page / render_page_zoom / render_page_thumbnail.
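
Because render_page_raw returns premultiplied pixels, consumers that expect straight alpha need a conversion step. The sketch below shows one way to do it; unpremultiply is a hypothetical helper, and the R, G, B, A byte order within each pixel is assumed from the RGBA8888 name rather than confirmed by this reference.

```julia
# Hypothetical helper: convert premultiplied RGBA8888 bytes to straight
# (non-premultiplied) alpha. Assumes 4 bytes per pixel in R, G, B, A order.
function unpremultiply(data::AbstractVector{UInt8})
    length(data) % 4 == 0 || throw(ArgumentError("length must be a multiple of 4"))
    out = collect(UInt8, data)
    for i in 1:4:length(out)
        a = Int(out[i + 3])
        a == 0 && continue              # fully transparent: leave as-is
        for c in i:(i + 2)
            # color = premultiplied * 255 / alpha, rounded and clamped to 255
            out[c] = UInt8(min(255, div(Int(out[c]) * 255 + a ÷ 2, a)))
        end
    end
    return out
end

pixels = UInt8[64, 32, 0, 128, 255, 255, 255, 255, 0, 0, 0, 0]
straight = unpremultiply(pixels)
```

With a real document the input would come from the data field of the image returned by render_page_raw(doc, page, dpi).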

Renderer

A reusable renderer handle.

create_renderer(dpi::Integer = 150, format::Integer = 0, quality::Integer = 90, anti_alias::Bool = true) -> Renderer
close!(r::Renderer)

RenderedImage

The result of a render call. Fields: width::Int, height::Int, data::Vector{UInt8} (encoded bytes, or raw RGBA for render_page_raw).

save(img::RenderedImage, path::AbstractString)   # write to disk (format inferred)
close!(img::RenderedImage)

DocumentEditor

The edit-side handle for mutating and re-saving an existing PDF.

Opening & Source

open_editor(path::AbstractString) -> DocumentEditor
open_editor_from_bytes(data::AbstractVector{UInt8}) -> DocumentEditor
Function Returns Description
is_modified(e) Bool Whether the editor has unsaved modifications.
get_source_path(e) String Source path the editor was opened from.
page_count(e) Int Number of pages.
version(e) PdfVersion PDF version.

Info / Metadata

Function Description
get_producer(e) Producer from /Info.Producer.
set_producer(e, value) Set /Info.Producer.
get_creation_date(e) Creation date from /Info.CreationDate (raw PDF date string).
set_creation_date(e, date_str) Set /Info.CreationDate (raw PDF date string).
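
Since the creation date is a raw PDF date string, callers usually want to parse it themselves. The following is a rough sketch assuming the standard PDF form D:YYYYMMDDHHmmSS followed by an optional timezone suffix; parse_pdf_date is a hypothetical helper, and the timezone is deliberately ignored.

```julia
using Dates

# Hypothetical helper: parse the leading "D:YYYYMMDDHHmmSS" part of a raw
# PDF date string into a DateTime. The timezone suffix (e.g. +01'00') is
# ignored here, and missing trailing fields fall back to their defaults.
function parse_pdf_date(raw::AbstractString)
    m = match(r"^(?:D:)?(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?", raw)
    m === nothing && throw(ArgumentError("not a PDF date string: $raw"))
    defaults = (0, 1, 1, 0, 0, 0)   # year always matches; month and day default to 1
    parts = [c === nothing ? defaults[i] : parse(Int, c) for (i, c) in enumerate(m.captures)]
    return DateTime(parts...)
end

created = parse_pdf_date("D:20240115093000+01'00'")
```

In practice the input would be the value returned by get_creation_date(e).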

Saving

save(e::DocumentEditor, path::AbstractString)
save_to_bytes(e::DocumentEditor) -> Vector{UInt8}
save_to_bytes_with_options(e::DocumentEditor, compress::Bool, garbage_collect::Bool, linearize::Bool) -> Vector{UInt8}
extract_pages_to_bytes(e::DocumentEditor, pages::AbstractVector{<:Integer}) -> Vector{UInt8}
save_encrypted(e::DocumentEditor, path::AbstractString, user_password::AbstractString, owner_password::AbstractString)
save_encrypted_to_bytes(e::DocumentEditor, user_password::AbstractString, owner_password::AbstractString) -> Vector{UInt8}
convert_to_pdf_a(e::DocumentEditor, level::Integer)
Function Description
save(e, path) Save the edited document to a path.
save_to_bytes(e) Serialize the edited document to bytes.
save_to_bytes_with_options(e, compress, garbage_collect, linearize) Serialize with compression / GC / linearization options.
extract_pages_to_bytes(e, pages) Extract a subset of pages to a new in-memory PDF.
save_encrypted(e, path, user_pw, owner_pw) Save with AES-256 encryption to a path.
save_encrypted_to_bytes(e, user_pw, owner_pw) Save with AES-256 encryption to bytes.
convert_to_pdf_a(e, level) Convert to PDF/A in-place (level: 0=A1b…7=A3u).

Merge & Attachments

Function Description
merge_from(e, source_path) Merge pages from a PDF on disk.
merge_from_bytes(e, data) Merge pages from an in-memory PDF buffer.
embed_file(e, name, data) Embed a file attachment (name, data bytes).

Page Operations

rotate_all_pages(e::DocumentEditor, degrees::Integer)
rotate_page_by(e::DocumentEditor, page::Integer, degrees::Integer)
get_page_rotation(e::DocumentEditor, page::Integer) -> Int
set_page_rotation(e::DocumentEditor, page::Integer, degrees::Integer)
delete_page(e::DocumentEditor, page::Integer)
move_page(e::DocumentEditor, from::Integer, to::Integer)
Function Description
rotate_all_pages(e, degrees) Rotate all pages (relative).
rotate_page_by(e, page, degrees) Rotate a page additively.
get_page_rotation(e, page) Absolute rotation of a page.
set_page_rotation(e, page, degrees) Set absolute rotation of a page.
delete_page(e, page) Delete a page.
move_page(e, from, to) Move a page from one index to another.

Page Boxes & Cropping

get_page_media_box(e::DocumentEditor, page::Integer) -> NTuple{4,Float64}
get_page_crop_box(e::DocumentEditor, page::Integer) -> NTuple{4,Float64}
set_page_media_box(e::DocumentEditor, page::Integer, x, y, w, h)
set_page_crop_box(e::DocumentEditor, page::Integer, x, y, w, h)
crop_margins(e::DocumentEditor, left::Real, right::Real, top::Real, bottom::Real)
Function Description
get_page_media_box(e, page) Get the MediaBox of a page.
get_page_crop_box(e, page) Get the CropBox of a page.
set_page_media_box(e, page, x, y, w, h) Set the MediaBox of a page.
set_page_crop_box(e, page, x, y, w, h) Set the CropBox of a page.
crop_margins(e, left, right, top, bottom) Crop all pages by margins (user-space).

Erase / Whiteout

erase_region(e::DocumentEditor, page::Integer, x, y, w, h)
erase_regions(e::DocumentEditor, page::Integer, rects::AbstractVector{<:NTuple{4,<:Real}})
clear_erase_regions(e::DocumentEditor, page::Integer)
Function Description
erase_region(e, page, x, y, w, h) Erase one rectangular region.
erase_regions(e, page, rects) Erase multiple regions (rects is a vector of (x, y, w, h) tuples).
clear_erase_regions(e, page) Clear all pending erase-region entries for a page.
erase_header(d::PdfDocument, page::Integer)
erase_footer(d::PdfDocument, page::Integer)
erase_artifacts(d::PdfDocument, page::Integer)
remove_headers(d::PdfDocument, threshold::Real = 0.5)
remove_footers(d::PdfDocument, threshold::Real = 0.5)
remove_artifacts(d::PdfDocument, threshold::Real = 0.5)
Function Description
erase_header(d, page) / erase_footer(d, page) / erase_artifacts(d, page) Erase the detected header / footer / artifacts on a page.
remove_headers(d, threshold=0.5) / remove_footers(...) / remove_artifacts(...) Remove repeating headers / footers / artifacts across the document above a frequency threshold.

Annotation Flatten

flatten_annotations(e::DocumentEditor, page::Integer)
flatten_all_annotations(e::DocumentEditor)
is_page_marked_for_flatten(e::DocumentEditor, page::Integer) -> Bool
unmark_page_for_flatten(e::DocumentEditor, page::Integer)

Redaction

apply_page_redactions(e::DocumentEditor, page::Integer)
apply_all_redactions(e::DocumentEditor)
is_page_marked_for_redaction(e::DocumentEditor, page::Integer) -> Bool
unmark_page_for_redaction(e::DocumentEditor, page::Integer)
redaction_add(e::DocumentEditor, page::Integer, x1, y1, x2, y2, r, g, b)
redaction_count(e::DocumentEditor, page::Integer) -> Int
redaction_apply(e::DocumentEditor, scrub_metadata::Bool, r::Real, g::Real, b::Real) -> Int
redaction_scrub_metadata(e::DocumentEditor) -> Int
Function Description
apply_page_redactions(e, page) Apply (burn in) redactions on a page.
apply_all_redactions(e) Apply all pending redactions.
is_page_marked_for_redaction(e, page) Whether a page is marked for redaction.
unmark_page_for_redaction(e, page) Remove the redaction mark from a page.
redaction_add(e, page, x1, y1, x2, y2, r, g, b) Queue a redaction box with overlay color (DeviceRGB, 0–1).
redaction_count(e, page) Number of queued redaction regions on a page.
redaction_apply(e, scrub_metadata, r, g, b) Destructively apply all queued redactions; returns glyphs purged.
redaction_scrub_metadata(e) Strip Info/XMP/JavaScript/EmbeddedFiles; returns count removed.

Form Filling & Flatten

set_form_field_value(e::DocumentEditor, name::AbstractString, value::AbstractString)
flatten_forms(e::DocumentEditor)
flatten_forms_on_page(e::DocumentEditor, page::Integer)
flatten_warnings_count(e::DocumentEditor) -> Int
flatten_warning(e::DocumentEditor, index::Integer) -> String
import_fdf_bytes(e::DocumentEditor, data::AbstractVector{UInt8})
import_xfdf_bytes(e::DocumentEditor, data::AbstractVector{UInt8})
Function Description
set_form_field_value(e, name, value) Set a form field value (UTF-8).
flatten_forms(e) Flatten all forms (bake values into page content).
flatten_forms_on_page(e, page) Flatten forms on a single page.
flatten_warnings_count(e) Number of warnings from the last form-flatten save.
flatten_warning(e, index) The index-th flatten warning string.
import_fdf_bytes(e, data) Import FDF form data from bytes.
import_xfdf_bytes(e, data) Import XFDF form data from bytes.

Barcode Stamping

add_barcode_to_page(e::DocumentEditor, page::Integer, b::Barcode, x, y, width, height)

Stamp a Barcode onto a page at rect (x, y, width, height). See Barcodes.

Lifecycle

close!(e::DocumentEditor) frees the handle.


Pdf

The lightweight create-side handle returned by the from_* factories.

Factories

from_markdown(input::AbstractString) -> Pdf
from_html(input::AbstractString) -> Pdf
from_text(input::AbstractString) -> Pdf
from_image(path::AbstractString) -> Pdf
from_image_bytes(data::AbstractVector{UInt8}) -> Pdf
from_html_css(html::AbstractString, css::AbstractString, font_bytes::Union{Nothing,AbstractVector{UInt8}} = nothing) -> Pdf
from_html_css_with_fonts(html::AbstractString, css::AbstractString, families::AbstractVector{<:AbstractString}, fonts::AbstractVector{<:AbstractVector{UInt8}}) -> Pdf
Function Description
from_markdown(input) Build a Pdf from Markdown.
from_html(input) Build a Pdf from HTML.
from_text(input) Build a Pdf from plain text.
from_image(path) Build a Pdf from an image file.
from_image_bytes(data) Build a Pdf from in-memory image bytes.
from_html_css(html, css, font_bytes=nothing) Build from HTML + CSS with one optional embedded font.
from_html_css_with_fonts(html, css, families, fonts) Build from HTML + CSS with a multi-font cascade (families[i] names fonts[i]).

Methods

Function Returns Description
save(p, path) Write the built PDF to a path.
to_bytes(p) Vector{UInt8} Serialize the built PDF to bytes.
get_page_count(p) Int Page count of a builder Pdf.
close!(p) Free the handle.

merge_pdfs

merge_pdfs(paths::AbstractVector{<:AbstractString}) -> Vector{UInt8}

Merge the PDFs at paths (in order) into a single PDF byte buffer.

Office Input

open_from_docx_bytes(data::AbstractVector{UInt8}) -> Pdf
open_from_pptx_bytes(data::AbstractVector{UInt8}) -> Pdf
open_from_xlsx_bytes(data::AbstractVector{UInt8}) -> Pdf

Convert DOCX / PPTX / XLSX bytes into a Pdf.


DocumentBuilder

A fluent, structure-aware PDF builder. Create pages with a4_page / letter_page / page, lay out content on the returned PageBuilder, and call done to commit each page. Then produce the output with build, save, or one of the encrypted variants.

DocumentBuilder() -> DocumentBuilder

Document-Level Configuration

set_title(b::DocumentBuilder, value::AbstractString)
set_author(b::DocumentBuilder, value::AbstractString)
set_subject(b::DocumentBuilder, value::AbstractString)
set_keywords(b::DocumentBuilder, value::AbstractString)
set_creator(b::DocumentBuilder, value::AbstractString)
on_open(b::DocumentBuilder, value::AbstractString)
language(b::DocumentBuilder, value::AbstractString)
tagged_pdf_ua1(b::DocumentBuilder)
role_map(b::DocumentBuilder, custom::AbstractString, standard::AbstractString)
register_embedded_font(b::DocumentBuilder, name::AbstractString, f::EmbeddedFont)
Function Description
set_title / set_author / set_subject / set_keywords / set_creator Set the corresponding /Info metadata field.
on_open(b, value) Set a document-open JavaScript action.
language(b, value) Set the document language (e.g. "en-US").
tagged_pdf_ua1(b) Enable PDF/UA-1 tagged-PDF mode.
role_map(b, custom, standard) Map a custom structure type to a standard one.
register_embedded_font(b, name, f) Register a loaded EmbeddedFont under name (consumes the font).

Pages

a4_page(b::DocumentBuilder) -> PageBuilder
letter_page(b::DocumentBuilder) -> PageBuilder
page(b::DocumentBuilder, width::Real, height::Real) -> PageBuilder

Output

build(b::DocumentBuilder) -> Vector{UInt8}
save(b::DocumentBuilder, path::AbstractString)
save_encrypted_builder(b::DocumentBuilder, path::AbstractString, user_password::AbstractString, owner_password::AbstractString)
to_bytes_encrypted(b::DocumentBuilder, user_password::AbstractString, owner_password::AbstractString) -> Vector{UInt8}
Function Description
build(b) Build and return the PDF bytes (the builder must still be closed).
save(b, path) Build and save to a path.
save_encrypted_builder(b, path, user_pw, owner_pw) Build and save with AES-256 encryption.
to_bytes_encrypted(b, user_pw, owner_pw) Build encrypted bytes (AES-256).

EmbeddedFont

embedded_font_from_file(path::AbstractString) -> EmbeddedFont
embedded_font_from_bytes(data::AbstractVector{UInt8}; name::Union{Nothing,AbstractString} = nothing) -> EmbeddedFont
Function Description
embedded_font_from_file(path) Load a TTF/OTF font from a path.
embedded_font_from_bytes(data; name) Load a font from bytes (name may be empty to use the PostScript name).

PageBuilder

Returned by a4_page / letter_page / page. All methods mutate the page in place; call done(p) to commit it to the parent builder (this consumes the page handle).

Text & Layout

font(p::PageBuilder, name::AbstractString, size::Real)
at(p::PageBuilder, x::Real, y::Real)
heading(p::PageBuilder, level::Integer, text::AbstractString)
Method Description
font(p, name, size) Set the font + size for subsequent text.
at(p, x, y) Move the cursor to absolute (x, y) (points, from lower-left).
heading(p, level, text) Emit a heading at level (1–6).

The following string-valued methods all share the signature f(p::PageBuilder, value::AbstractString):

Method Description
text(p, value) Emit a run of body text.
paragraph(p, value) Emit a wrapped paragraph.
link_url(p, value) Make the previous text a URL link.
link_named(p, value) Link the previous text to a named destination.
link_javascript(p, value) Attach a JavaScript action to the previous text.
on_open(p, value) Set a page-open JavaScript action.
on_close(p, value) Set a page-close JavaScript action.
field_keystroke(p, value) / field_format(p, value) / field_validate(p, value) / field_calculate(p, value) Attach AcroForm field JavaScript actions.
sticky_note(p, value) Attach a sticky note to the previous content.
watermark(p, value) Add a text watermark.
stamp(p, value) Add a stamp annotation.
inline(p, value) Append an inline text run.
inline_bold(p, value) Append a bold inline run.
inline_italic(p, value) Append an italic inline run.

Zero-argument layout methods, all f(p::PageBuilder):

Method Description
horizontal_rule(p) Draw a horizontal rule.
space(p) Insert vertical space.
newline(p) Break to a new line.
new_page_same_size(p) Start a new page with the same dimensions.
watermark_confidential(p) Add a “CONFIDENTIAL” watermark.
watermark_draft(p) Add a “DRAFT” watermark.
link_page(p::PageBuilder, page_index::Integer)
sticky_note_at(p::PageBuilder, x::Real, y::Real, text::AbstractString)
freetext(p::PageBuilder, x::Real, y::Real, w::Real, h::Real, text::AbstractString)
footnote(p::PageBuilder, ref_mark::AbstractString, note_text::AbstractString)
columns(p::PageBuilder, column_count::Integer, gap_pt::Real, text::AbstractString)
inline_color(p::PageBuilder, r::Real, g::Real, b_::Real, text::AbstractString)
Method Description
link_page(p, page_index) Link the previous text to an internal page index.
sticky_note_at(p, x, y, text) Place a free-standing sticky note.
freetext(p, x, y, w, h, text) Place a free-flowing text annotation in a rect.
footnote(p, ref_mark, note_text) Add a footnote (inline superscript + body at page end).
columns(p, column_count, gap_pt, text) Lay out text across balanced columns.
inline_color(p, r, g, b_, text) Append an RGB-colored inline run (channels 0–1).

Markup Annotations

All four share f(p::PageBuilder, r::Real, g::Real, b_::Real) and apply to the previous text with an RGB color:

Method Description
highlight(p, r, g, b) Highlight the previous text.
underline(p, r, g, b) Underline the previous text.
strikeout(p, r, g, b) Strike out the previous text.
squiggly(p, r, g, b) Squiggly-underline the previous text.

Form Widgets

text_field(p::PageBuilder, name, x, y, w, h; default_value::Union{Nothing,AbstractString} = nothing)
checkbox(p::PageBuilder, name, x, y, w, h, checked::Bool)
combo_box(p::PageBuilder, name, x, y, w, h, options::AbstractVector{<:AbstractString}; selected::Union{Nothing,AbstractString} = nothing)
radio_group(p::PageBuilder, name, values, xs, ys, ws, hs; selected::Union{Nothing,AbstractString} = nothing)
push_button(p::PageBuilder, name, x, y, w, h, caption::AbstractString)
signature_field(p::PageBuilder, name, x, y, w, h)
Method Description
text_field(p, name, x, y, w, h; default_value) Add a single-line text field.
checkbox(p, name, x, y, w, h, checked) Add a checkbox with an initial ticked state.
combo_box(p, name, x, y, w, h, options; selected) Add a dropdown combo box.
radio_group(p, name, values, xs, ys, ws, hs; selected) Add a radio group (parallel arrays describe each button).
push_button(p, name, x, y, w, h, caption) Add a clickable push button.
signature_field(p, name, x, y, w, h) Add an unsigned signature placeholder.
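
The parallel arrays that radio_group expects can be generated rather than typed by hand. A minimal sketch, assuming a vertical stack of square buttons and a y axis that grows upward; stacked_radio_layout and its keyword names are hypothetical.

```julia
# Hypothetical helper: lay out radio buttons in a vertical stack and return
# the parallel arrays (values, xs, ys, ws, hs), one entry per button.
# Assumes y grows upward, so each later button sits lower on the page.
function stacked_radio_layout(values::AbstractVector{<:AbstractString};
                              x::Real, top_y::Real, side::Real = 12, gap::Real = 6)
    n = length(values)
    xs = fill(Float64(x), n)
    ys = [Float64(top_y - (i - 1) * (side + gap)) for i in 1:n]
    ws = fill(Float64(side), n)
    hs = fill(Float64(side), n)
    return (values = collect(String, values), xs = xs, ys = ys, ws = ws, hs = hs)
end

layout = stacked_radio_layout(["small", "medium", "large"]; x = 72, top_y = 700)
```

The result could then be passed as radio_group(p, name, layout.values, layout.xs, layout.ys, layout.ws, layout.hs; selected = "medium").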

Images & Barcodes

image(p::PageBuilder, bytes::AbstractVector{UInt8}, x, y, w, h)
image_with_alt(p::PageBuilder, bytes::AbstractVector{UInt8}, x, y, w, h, alt_text::AbstractString)
image_artifact(p::PageBuilder, bytes::AbstractVector{UInt8}, x, y, w, h)
barcode_1d(p::PageBuilder, barcode_type::Integer, data::AbstractString, x, y, w, h)
barcode_qr(p::PageBuilder, data::AbstractString, x, y, size::Real)
Method Description
image(p, bytes, x, y, w, h) Embed an image (raw JPEG/PNG/WebP).
image_with_alt(p, bytes, x, y, w, h, alt_text) Embed an image with accessibility alt text.
image_artifact(p, bytes, x, y, w, h) Embed a decorative image as an /Artifact.
barcode_1d(p, barcode_type, data, x, y, w, h) Place a 1-D barcode (barcode_type: 0=Code128…7=Codabar).
barcode_qr(p, data, x, y, size) Place a square QR code.

Vector Graphics

rect(p::PageBuilder, x, y, w, h)
filled_rect(p::PageBuilder, x, y, w, h, r, g, b_)
line(p::PageBuilder, x1, y1, x2, y2)
stroke_rect(p::PageBuilder, x, y, w, h, width, r, g, b_)
stroke_line(p::PageBuilder, x1, y1, x2, y2, width, r, g, b_)
stroke_rect_dashed(p::PageBuilder, x, y, w, h, width, r, g, b_, dash_array::AbstractVector{<:Real}, phase::Real)
stroke_line_dashed(p::PageBuilder, x1, y1, x2, y2, width, r, g, b_, dash_array::AbstractVector{<:Real}, phase::Real)
text_in_rect(p::PageBuilder, x, y, w, h, text::AbstractString, align::Integer)
| Method | Description |
| --- | --- |
| rect(p, x, y, w, h) | Stroked 1 pt black rectangle outline. |
| filled_rect(p, x, y, w, h, r, g, b) | Filled rectangle in RGB (0–1). |
| line(p, x1, y1, x2, y2) | 1 pt black line. |
| stroke_rect(p, x, y, w, h, width, r, g, b) | Stroked rectangle; width is the stroke width in points and r, g, b the stroke color. |
| stroke_line(p, x1, y1, x2, y2, width, r, g, b) | Stroked line; width is the stroke width in points and r, g, b the stroke color. |
| stroke_rect_dashed(...) | Dashed stroked rectangle (dash_array on/off lengths, phase offset). |
| stroke_line_dashed(...) | Dashed stroked line. |
| text_in_rect(p, x, y, w, h, text, align) | Draw text inside a rect (align: 0=Left, 1=Center, 2=Right). |
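
Colors are given as floats in the 0–1 range, while many design tools report 8-bit channels. A minimal sketch of the conversion, in plain Julia; rgb01 is a hypothetical helper, and it assumes the stroke functions use the same 0–1 range documented for filled_rect.

```julia
# Hypothetical helper (not part of PdfOxide.jl): convert 8-bit RGB channels
# (0-255) to floats in the 0-1 range.
function rgb01(r::Integer, g::Integer, b::Integer)
    all(c -> 0 <= c <= 255, (r, g, b)) ||
        throw(ArgumentError("channels must be in 0:255"))
    return (r / 255, g / 255, b / 255)
end

r, g, b = rgb01(255, 128, 0)
dash_array = [6.0, 3.0]   # on/off lengths: 6 on, 3 off
phase = 0.0               # offset into the dash pattern
# With a live PageBuilder `p`:
# stroke_line_dashed(p, 72, 500, 540, 500, 1.5, r, g, b, dash_array, phase)
```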

Tables

table(p::PageBuilder, n_columns::Integer, widths::AbstractVector{<:Real}, aligns::AbstractVector{<:Integer}, n_rows::Integer, cell_strings::AbstractMatrix{<:AbstractString}, has_header::Bool)
streaming_table_begin(p::PageBuilder, headers::AbstractVector{<:AbstractString}, widths::AbstractVector{<:Real}, aligns::AbstractVector{<:Integer}, repeat_header::Bool)
streaming_table_begin_v2(p::PageBuilder, headers, widths, aligns, repeat_header::Bool, mode::Integer, sample_rows::Integer, min_col_width_pt::Real, max_col_width_pt::Real, max_rowspan::Integer)
streaming_table_set_batch_size(p::PageBuilder, batch_size::Integer)
streaming_table_pending_row_count(p::PageBuilder) -> Int
streaming_table_batch_count(p::PageBuilder) -> Int
streaming_table_push_row(p::PageBuilder, cells::AbstractVector{<:AbstractString})
streaming_table_push_row_v2(p::PageBuilder, cells::AbstractVector{<:AbstractString}, rowspans::AbstractVector{<:Integer})
streaming_table_flush(p::PageBuilder)
streaming_table_finish(p::PageBuilder)
| Method | Description |
| --- | --- |
| table(p, n_columns, widths, aligns, n_rows, cell_strings, has_header) | Emit a buffered table (aligns: 0=Left/1=Center/2=Right; cell_strings is row-major n_rows × n_columns). |
| streaming_table_begin(p, headers, widths, aligns, repeat_header) | Open a streaming table (parallel length-n_columns arrays). |
| streaming_table_begin_v2(...) | Open a streaming table with width mode (0=Fixed, 1=Sample, 2=AutoAll) and max_rowspan. |
| streaming_table_set_batch_size(p, batch_size) | Set the batch size (passing 0 selects 256). |
| streaming_table_pending_row_count(p) | Rows pushed since the last batch boundary. |
| streaming_table_batch_count(p) | Number of complete batches so far. |
| streaming_table_push_row(p, cells) | Push one row (every cell has rowspan 1). |
| streaming_table_push_row_v2(p, cells, rowspans) | Push one row with per-cell rowspans (a rowspan of 2 or more makes the cell span that many rows). |
| streaming_table_flush(p) | Flush the current batch. |
| streaming_table_finish(p) | Finish the streaming table. |
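
Row data often arrives as a vector of rows rather than a matrix. The sketch below builds the n_rows × n_columns string matrix for table in plain Julia; rows_to_matrix is a hypothetical helper, and it assumes cells[i, j] is read as row i, column j.

```julia
# Hypothetical helper (not part of PdfOxide.jl): turn a vector of equal-length
# rows into an n_rows × n_columns string matrix.
function rows_to_matrix(rows::AbstractVector{<:AbstractVector{<:AbstractString}})
    isempty(rows) && throw(ArgumentError("need at least one row"))
    n_columns = length(first(rows))
    all(r -> length(r) == n_columns, rows) ||
        throw(ArgumentError("every row must have $n_columns cells"))
    # hcat makes each input row a matrix column; permutedims flips it back
    # so that input row i becomes matrix row i.
    return permutedims(reduce(hcat, rows))
end

rows = [["Item", "Qty"], ["Widget", "3"], ["Gadget", "12"]]
cells = rows_to_matrix(rows)
n_rows, n_columns = size(cells)
# With a live PageBuilder `p` (widths and aligns are illustrative values):
# table(p, n_columns, [200.0, 80.0], [0, 2], n_rows, cells, true)
```

Validating the row lengths up front keeps a ragged input from reaching the native call.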

Commit

done(p::PageBuilder)   # commit this page's buffered operations to its parent builder (consumes the handle)

Value Types

Immutable structs returned by extraction. Bbox has fields x, y, width, height (PDF user-space units).

| Type | Fields |
| --- | --- |
| Bbox | x, y, width, height (Float64) |
| Char | character::UInt32, bbox::Bbox, font_name::String, font_size::Float64 |
| Word | text, bbox, font_name, font_size, bold |
| TextLine | text, bbox, word_count |
| Table | row_count, col_count, has_header, cells (use cell(t, row, col) for a 0-based cell) |
| Font | name, type, encoding, embedded, subset |
| Image | width, height, bitsPerComponent, format, colorspace, data |
| Annotation | type, subtype, content, author, rect::Bbox, borderWidth |
| Path | bbox::Bbox, strokeWidth, hasStroke, hasFill, operationCount |
| SearchResult | text, page, bbox::Bbox |
| FormField | name, value, type, readonly, required |
| PdfVersion | major::Int, minor::Int |

cell(t::Table, row::Integer, col::Integer) -> String

FormField accessors

form_field_name(f::FormField) -> String
form_field_value(f::FormField) -> String
form_field_type(f::FormField) -> String
form_field_is_readonly(f::FormField) -> Bool
form_field_is_required(f::FormField) -> Bool

ElementList

element_count(l::ElementList) -> Int
element_type(l::ElementList, index::Integer) -> String
element_text(l::ElementList, index::Integer) -> String
element_rect(l::ElementList, index::Integer) -> Bbox
elements_to_json(l::ElementList) -> String
close!(l::ElementList)

Digital Signatures

Certificate

certificate_load_from_bytes(data::AbstractVector{UInt8}, password::AbstractString = "") -> Certificate
certificate_load_from_pem(cert_pem::AbstractString, key_pem::AbstractString) -> Certificate
certificate_get_subject(c::Certificate) -> String
certificate_get_issuer(c::Certificate) -> String
certificate_get_serial(c::Certificate) -> String
certificate_get_validity(c::Certificate) -> Tuple{Int,Int}
certificate_is_valid(c::Certificate) -> Bool
| Function | Description |
| --- | --- |
| certificate_load_from_bytes(data, password="") | Load signing credentials from PKCS#12 / PFX bytes. |
| certificate_load_from_pem(cert_pem, key_pem) | Load from PEM-encoded certificate + private key. |
| certificate_get_subject(c) / certificate_get_issuer(c) / certificate_get_serial(c) | Subject / issuer / serial strings. |
| certificate_get_validity(c) | Validity window as (not_before, not_after) Unix epoch seconds. |
| certificate_is_valid(c) | Whether the certificate is currently valid. |
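
The validity window comes back as raw epoch seconds. A minimal sketch of turning such a pair into UTC DateTime values with the Dates standard library; validity_window is a hypothetical helper and the sample numbers are illustrative, not taken from a real certificate.

```julia
using Dates

# Hypothetical helper (not part of PdfOxide.jl): convert a
# (not_before, not_after) pair of Unix epoch seconds to UTC DateTimes and
# report whether the instant `at` falls inside the window.
function validity_window(not_before::Integer, not_after::Integer;
                         at::DateTime = Dates.now(Dates.UTC))
    from  = unix2datetime(not_before)
    until = unix2datetime(not_after)
    return (; from, until, current = from <= at <= until)
end

# In real use the pair would come from certificate_get_validity(c).
w = validity_window(1_700_000_000, 1_800_000_000; at = DateTime(2025, 1, 1))
println(w.from, " to ", w.until, " current=", w.current)
```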

Signing

sign_bytes(pdf::AbstractVector{UInt8}, cert::Certificate, reason::AbstractString, location::AbstractString) -> Vector{UInt8}
sign_bytes_pades(pdf::AbstractVector{UInt8}, cert::Certificate, level::Integer, tsa_url::Union{Nothing,AbstractString}, reason::AbstractString, location::AbstractString; certs = Vector{UInt8}[], crls = Vector{UInt8}[], ocsps = Vector{UInt8}[]) -> Vector{UInt8}
sign_bytes_pades_opts(pdf::AbstractVector{UInt8}, cert::Certificate, level::Integer, tsa_url, reason, location; certs = Vector{UInt8}[], crls = Vector{UInt8}[], ocsps = Vector{UInt8}[]) -> Vector{UInt8}
add_timestamp(pdf_data::AbstractVector{UInt8}, sig_index::Integer, tsa_url::AbstractString) -> Vector{UInt8}
| Function | Description |
| --- | --- |
| sign_bytes(pdf, cert, reason, location) | Sign raw PDF bytes; returns the signed PDF. |
| sign_bytes_pades(pdf, cert, level, tsa_url, reason, location; certs, crls, ocsps) | PAdES baseline signing (level: 0=B-B, 1=B-T, 2=B-LT; tsa_url required for level ≥ 1). |
| sign_bytes_pades_opts(...) | Struct-options variant of sign_bytes_pades (identical behavior, builds PadesSignOptionsC). |
| add_timestamp(pdf_data, sig_index, tsa_url) | Add an RFC 3161 timestamp to a signature; returns the timestamped PDF. |
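
The level codes and the tsa_url requirement can be checked before any bytes reach the native library. The following is a sketch in plain Julia; check_pades_args and PADES_LEVELS are hypothetical names, and the URL is a placeholder.

```julia
# Names for the PAdES level codes accepted by sign_bytes_pades.
const PADES_LEVELS = Dict(0 => "B-B", 1 => "B-T", 2 => "B-LT")

# Hypothetical pre-flight check (not part of PdfOxide.jl): reject an unknown
# level, or a level >= 1 without a TSA URL, and return the level name.
function check_pades_args(level::Integer, tsa_url::Union{Nothing,AbstractString})
    haskey(PADES_LEVELS, level) ||
        throw(ArgumentError("unknown PAdES level $level"))
    if level >= 1 && (tsa_url === nothing || isempty(tsa_url))
        throw(ArgumentError("PAdES $(PADES_LEVELS[level]) needs a tsa_url"))
    end
    return PADES_LEVELS[level]
end

check_pades_args(0, nothing)                         # B-B needs no TSA
check_pades_args(1, "https://tsa.example/rfc3161")   # placeholder URL
```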

SignatureInfo

signature_get_signer_name(s::SignatureInfo) -> String
signature_get_signing_reason(s::SignatureInfo) -> String
signature_get_signing_location(s::SignatureInfo) -> String
signature_get_signing_time(s::SignatureInfo) -> Int
signature_get_certificate(s::SignatureInfo) -> Certificate
signature_get_pades_level(s::SignatureInfo) -> Int
signature_has_timestamp(s::SignatureInfo) -> Bool
signature_get_timestamp(s::SignatureInfo) -> Timestamp
signature_add_timestamp(s::SignatureInfo, ts) -> Bool
signature_verify(s::SignatureInfo) -> Int
signature_verify_detached(s::SignatureInfo, pdf::AbstractVector{UInt8}) -> Int
| Function | Description |
| --- | --- |
| signature_get_signer_name(s) / signature_get_signing_reason(s) / signature_get_signing_location(s) | Signer name / signing reason / signing location. |
| signature_get_signing_time(s) | Signing time (Unix epoch seconds). |
| signature_get_certificate(s) | Signer’s Certificate (owned). |
| signature_get_pades_level(s) | PAdES level code. |
| signature_has_timestamp(s) | Whether an embedded RFC 3161 timestamp is present. |
| signature_get_timestamp(s) | The embedded Timestamp (owned). |
| signature_add_timestamp(s, ts) | Attach a Timestamp; returns true on success. |
| signature_verify(s) | Signer-attributes crypto check (1=valid, 0=invalid, -1=unknown). |
| signature_verify_detached(s, pdf) | End-to-end verification against the full PDF bytes (same 1/0/-1 codes). |
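
Both verify functions return an integer code rather than a Bool, so a truthiness check would treat "unknown" as a failure or raise a type error. A small sketch of mapping the codes to symbols; verify_status is a hypothetical helper.

```julia
# Hypothetical helper (not part of PdfOxide.jl): map the 1 / 0 / -1 codes
# returned by signature_verify and signature_verify_detached to symbols.
function verify_status(code::Integer)
    code == 1  && return :valid
    code == 0  && return :invalid
    code == -1 && return :unknown
    throw(ArgumentError("unexpected verification code $code"))
end

# With a live SignatureInfo `s`:
# status = verify_status(signature_verify(s))
status = verify_status(-1)   # :unknown
```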

Timestamp

timestamp_parse(data::AbstractVector{UInt8}) -> Timestamp
timestamp_get_token(t::Timestamp) -> Vector{UInt8}
timestamp_get_message_imprint(t::Timestamp) -> Vector{UInt8}
timestamp_get_time(t::Timestamp) -> Int
timestamp_get_serial(t::Timestamp) -> String
timestamp_get_tsa_name(t::Timestamp) -> String
timestamp_get_policy_oid(t::Timestamp) -> String
timestamp_get_hash_algorithm(t::Timestamp) -> Int
timestamp_verify(t::Timestamp) -> Bool
| Function | Description |
| --- | --- |
| timestamp_parse(data) | Parse a DER RFC 3161 TimeStampToken (or bare TSTInfo). |
| timestamp_get_token(t) | The raw token bytes. |
| timestamp_get_message_imprint(t) | The message-imprint digest bytes. |
| timestamp_get_time(t) | Timestamp time (Unix epoch seconds). |
| timestamp_get_serial(t) | Serial number string. |
| timestamp_get_tsa_name(t) | TSA name. |
| timestamp_get_policy_oid(t) | Policy OID. |
| timestamp_get_hash_algorithm(t) | Digest algorithm code. |
| timestamp_verify(t) | Whether the token verifies. |

TsaClient

tsa_client_create(url::AbstractString; username = nothing, password = nothing, timeout::Integer = 30, hash_algo::Integer = 0, use_nonce::Bool = true, cert_req::Bool = true) -> TsaClient
tsa_request_timestamp(t::TsaClient, data::AbstractVector{UInt8}) -> Timestamp
tsa_request_timestamp_hash(t::TsaClient, hash::AbstractVector{UInt8}, hash_algo::Integer) -> Timestamp
| Function | Description |
| --- | --- |
| tsa_client_create(url; …) | Create an RFC 3161 TSA client (optional basic-auth, timeout, hash algo, nonce, cert-req). |
| tsa_request_timestamp(t, data) | Request a timestamp over data (network I/O). |
| tsa_request_timestamp_hash(t, hash, hash_algo) | Request a timestamp over a precomputed digest. |
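
For tsa_request_timestamp_hash the digest is computed by the caller. A minimal sketch using Julia's SHA standard library; the choice of SHA-256 is an assumption, and the hash_algo code that corresponds to it is not listed in this section, so it is left as a variable in the commented call.

```julia
using SHA   # Julia standard library

# Digest the payload locally; sha256 returns a 32-byte Vector{UInt8}.
data = Vector{UInt8}("bytes to be timestamped")
digest = sha256(data)
println(bytes2hex(digest))

# With a live TsaClient `tsa` and the matching algorithm code `hash_algo`:
# ts = tsa_request_timestamp_hash(tsa, digest, hash_algo)
```

Sending only the digest keeps the document content off the wire, which is the usual reason to prefer this variant over tsa_request_timestamp.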

Dss (Document Security Store)

dss_cert_count(d::Dss) -> Int
dss_crl_count(d::Dss) -> Int
dss_ocsp_count(d::Dss) -> Int
dss_vri_count(d::Dss) -> Int
dss_get_cert(d::Dss, index::Integer) -> Vector{UInt8}
dss_get_crl(d::Dss, index::Integer) -> Vector{UInt8}
dss_get_ocsp(d::Dss, index::Integer) -> Vector{UInt8}
| Function | Description |
| --- | --- |
| dss_cert_count(d) / dss_crl_count(d) / dss_ocsp_count(d) / dss_vri_count(d) | Counts of certificates / CRLs / OCSP responses / VRI entries. |
| dss_get_cert(d, index) / dss_get_crl(d, index) / dss_get_ocsp(d, index) | The index-th certificate / CRL / OCSP response bytes. |

Validation Results

validate_pdf_a, validate_pdf_ua, and validate_pdf_x return PdfAResults, UaResults, and PdfXResults respectively.

| Function | Returns | Description |
| --- | --- | --- |
| is_compliant(r::PdfAResults) | Bool | Whether the document is PDF/A compliant. |
| is_compliant(r::PdfXResults) | Bool | Whether the document is PDF/X compliant. |
| is_accessible(r::UaResults) | Bool | Whether the document is PDF/UA accessible. |
| errors(r) | Vector{String} | Error messages (PdfAResults / UaResults / PdfXResults). |
| warnings(r) | Vector{String} | Warning messages. |
| ua_stats(r::UaResults) | NamedTuple | Accessibility element counts (structure, images, tables, forms, annotations, pages). |
| pdf_a_error_count(r) / pdf_a_warning_count(r) | Int | PDF/A error / warning counts. |
| pdf_ua_error_count(r) / pdf_ua_warning_count(r) | Int | PDF/UA error / warning counts. |
| pdf_x_error_count(r) | Int | PDF/X error count. |

Barcodes

generate_qr_code(data::AbstractString, error_correction::Integer = 0, size_px::Integer = 256) -> Barcode
generate_barcode(data::AbstractString, format::Integer = 0, size_px::Integer = 256) -> Barcode
barcode_get_data(b::Barcode) -> String
barcode_get_format(b::Barcode) -> Int
barcode_get_confidence(b::Barcode) -> Float64
barcode_get_image_png(b::Barcode, size_px::Integer = 256) -> Vector{UInt8}
barcode_get_svg(b::Barcode, size_px::Integer = 256) -> String
| Function | Description |
| --- | --- |
| generate_qr_code(data, error_correction=0, size_px=256) | Generate a QR code. |
| generate_barcode(data, format=0, size_px=256) | Generate a 1D/2D barcode. |
| barcode_get_data(b) | The decoded/encoded payload string. |
| barcode_get_format(b) | The format code. |
| barcode_get_confidence(b) | Decode confidence (0.0–1.0). |
| barcode_get_image_png(b, size_px=256) | Render to PNG bytes. |
| barcode_get_svg(b, size_px=256) | Render to an SVG string. |
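
PNG bytes can be written to disk with Base's write. The sketch below adds a signature check first; is_png and save_png are hypothetical helpers, and the check only inspects the 8-byte PNG file signature, not the rest of the file.

```julia
# The fixed 8-byte signature that starts every PNG file.
const PNG_MAGIC = UInt8[0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]

# True when `bytes` starts with the PNG signature.
is_png(bytes::AbstractVector{UInt8}) =
    length(bytes) >= 8 && bytes[1:8] == PNG_MAGIC

# Hypothetical helper (not part of PdfOxide.jl): write PNG bytes to `path`
# and return the number of bytes written.
function save_png(path::AbstractString, bytes::AbstractVector{UInt8})
    is_png(bytes) || throw(ArgumentError("data does not start with a PNG signature"))
    return write(path, bytes)
end

# With a live Barcode `b`:
# save_png("code.png", barcode_get_image_png(b, 512))
```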

Stamp a barcode onto an editor page with add_barcode_to_page(e, page, b, x, y, width, height).


OCR

OcrEngine

ocr_engine_create(det_model_path::AbstractString, rec_model_path::AbstractString, dict_path::AbstractString) -> OcrEngine

Create an OCR engine from detection model, recognition model, and dictionary file paths. Use it with ocr_extract_text(doc, page, engine) and page_needs_ocr(doc, page) (see PdfDocument › OCR). close!(o::OcrEngine) frees it.

Model Prefetch

prefetch_models(languages_csv::AbstractString) -> String
prefetch_available() -> Int
model_manifest() -> String
| Function | Description |
| --- | --- |
| prefetch_models(languages_csv) | Prefetch OCR/layout models for comma-separated languages. |
| prefetch_available() | Whether model prefetch is available. |
| model_manifest() | The bundled model manifest (JSON/string). |

Global Configuration & Crypto

set_log_level(level::Integer)   # 0=Off 1=Error 2=Warn 3=Info 4=Debug 5=Trace
get_log_level() -> Int
set_max_ops_per_stream(limit::Integer) -> Int
set_preserve_unmapped_glyphs(preserve::Integer) -> Int
| Function | Description |
| --- | --- |
| set_log_level(level) | Set the global log level (0–5). |
| get_log_level() | Get the current global log level. |
| set_max_ops_per_stream(limit) | Set the per-stream content-op limit; returns the previous limit. |
| set_preserve_unmapped_glyphs(preserve) | Toggle preservation of unmapped glyphs; returns the previous setting. |
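
Bare integers are easy to mix up at call sites. One possible approach is a named lookup for the level codes; LOG_LEVELS and log_level_code are hypothetical names, and the codes are the ones listed for set_log_level.

```julia
# Names for the level codes accepted by set_log_level.
const LOG_LEVELS = (off = 0, error = 1, warn = 2, info = 3, debug = 4, trace = 5)

# Hypothetical helper (not part of PdfOxide.jl): look up a level code by name.
# An unknown name raises an error from getproperty.
log_level_code(name::Symbol) = getproperty(LOG_LEVELS, name)

# With PdfOxide loaded:
# set_log_level(log_level_code(:debug))
level = log_level_code(:debug)
```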

Crypto Policy & Inventory

crypto_active_provider() -> String
crypto_cbom() -> String
crypto_inventory() -> String
crypto_policy() -> String
crypto_set_policy(spec::AbstractString) -> Int
crypto_fips_available() -> Int
crypto_use_fips() -> Int
| Function | Description |
| --- | --- |
| crypto_active_provider() | The active crypto provider name. |
| crypto_cbom() | Cryptographic Bill of Materials (JSON). |
| crypto_inventory() | Crypto algorithm inventory (JSON). |
| crypto_policy() | The active crypto policy. |
| crypto_set_policy(spec) | Set the active crypto policy from spec; returns a status code. |
| crypto_fips_available() | Whether a FIPS provider is available. |
| crypto_use_fips() | Whether FIPS mode is active. |

Error Handling

Every non-success C-ABI status throws a PdfOxideError:

using PdfOxide

try
    doc = open_document("file.pdf")
    try
        text = extract_text(doc, 0)   # 0-based page index
    finally
        close!(doc)                   # free the native handle even if extraction throws
    end
catch e
    if e isa PdfOxideError
        @warn "PDF error" code=e.code op=e.op
    else
        rethrow()
    end
end

PdfOxideError <: Exception carries the numeric C-ABI code and the failing operation name.
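
Since an exception can skip a later close!, a do-block wrapper in the style of Base's open(...) do ... end is one way to guarantee cleanup. This is a sketch in plain Julia; with_handle is a hypothetical helper, not something PdfOxide.jl is known to export.

```julia
# Hypothetical helper (not part of PdfOxide.jl): run `f(handle)` and always
# call `closer(handle)` afterwards, whether or not `f` throws.
function with_handle(f, handle, closer)
    try
        return f(handle)
    finally
        closer(handle)
    end
end

# With PdfOxide loaded, the do-block form would read:
# text = with_handle(open_document("file.pdf"), close!) do doc
#     extract_text(doc, 0)
# end
```

The handle is released eagerly on both the success and the error path, and the finalizer remains as a fallback.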


Complete Example

using PdfOxide

# --- Creation ---
doc_bytes = let b = DocumentBuilder()
    set_title(b, "Report")
    p = letter_page(b)
    font(p, "Helvetica", 18)
    at(p, 72, 720)
    heading(p, 1, "Quarterly Report")
    paragraph(p, "Generated by PDF Oxide.")
    done(p)
    out = build(b)
    close!(b)
    out
end

# --- Extraction ---
doc = open_from_bytes(doc_bytes)
println("Pages: ", page_count(doc))
for i in 0:page_count(doc)-1
    println("Page ", i + 1, ": ", length(extract_text(doc, i)), " chars")
end

# Words + tables
words = extract_words(doc, 0)
tables = extract_tables(doc, 0)

# Search
for hit in search_all(doc, "report", false)
    println("Found on page ", hit.page, " at ", hit.bbox)
end
close!(doc)

# --- Editing ---
e = open_editor("input.pdf")
rotate_all_pages(e, 90)
set_form_field_value(e, "name", "Jane Doe")
flatten_forms(e)
save(e, "output.pdf")
close!(e)

# --- Rendering ---
doc = open_document("input.pdf")
img = render_page(doc, 0)          # PNG
save(img, "page0.png")
close!(img)
close!(doc)

Other Language Bindings

PDF Oxide ships native bindings for every major ecosystem: Rust, Python, Node.js, WASM, C#, Golang, Java, PHP, Ruby, C++, Swift, Kotlin, Dart, R, Zig, Scala, Clojure, Objective-C, and Elixir.

Next Steps