Julia API Reference
PDF Oxide provides idiomatic Julia bindings (PdfOxide.jl) layered directly over the pdf_oxide C ABI via ccall — no shim crate. Native handles are wrapped in mutable structs with finalizers, C strings and buffers are copied into Julia and freed automatically, and any non-success C-ABI status throws a PdfOxideError. Page indices are 0-based.
using Pkg
Pkg.add("PdfOxide")
The native libpdf_oxide cdylib is loaded at runtime and resolved via PDF_OXIDE_LIB_PATH → PDF_OXIDE_LIB_DIR → ../target/release → target/release → the system loader. Build it with the binding feature set:
cargo build --release --lib --features ocr,rendering,signatures,barcodes,tsa-client,system-fonts
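The lookup order above can be pictured with a small stand-alone sketch. This is not the package's loader: `candidate_library_paths` is a hypothetical helper, and it assumes that PDF_OXIDE_LIB_PATH names the library file itself, that PDF_OXIDE_LIB_DIR names a directory containing it, and that the file is called libpdf_oxide plus the platform extension (the real file name may differ, for example on Windows).

```julia
using Libdl  # standard library; Libdl.dlext is "so", "dylib", or "dll"

# Hypothetical helper (not part of PdfOxide.jl): list the locations the
# documented search order would try, highest priority first.
function candidate_library_paths(env::AbstractDict = ENV)
    libname = "libpdf_oxide." * Libdl.dlext   # assumed file name
    candidates = String[]
    # 1. Explicit path to the library file (assumed to be a full file path).
    haskey(env, "PDF_OXIDE_LIB_PATH") && push!(candidates, env["PDF_OXIDE_LIB_PATH"])
    # 2. Directory that contains the library.
    haskey(env, "PDF_OXIDE_LIB_DIR") &&
        push!(candidates, joinpath(env["PDF_OXIDE_LIB_DIR"], libname))
    # 3-4. The two relative Cargo output directories.
    push!(candidates, joinpath("..", "target", "release", libname))
    push!(candidates, joinpath("target", "release", libname))
    # 5. Bare name: leave the search to the system loader.
    push!(candidates, libname)
    return candidates
end

# Point the binding at a specific build before loading the package.
ENV["PDF_OXIDE_LIB_DIR"] = joinpath(pwd(), "target", "release")
foreach(println, candidate_library_paths())
```

Setting one of the two environment variables before `using PdfOxide` is the least surprising way to pin a particular build.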
using PdfOxide
pdf = from_markdown("# Hello\n\nbody\n")
doc = open_from_bytes(to_bytes(pdf))
page_count(doc)
extract_text(doc, 0) # 0-based page index
to_markdown_all(doc)
close!(doc)
For the Rust API, see the Rust API Reference. For the Python API, see the Python API Reference. For type details, see Types & Enums.
Most handle types (PdfDocument, Pdf, DocumentEditor, DocumentBuilder, RenderedImage, Certificate, SignatureInfo, Timestamp, TsaClient, Dss, validation results, Barcode, OcrEngine, Renderer, ElementList, EmbeddedFont, PageBuilder) own native memory. Free them eagerly with close!(x) — it is idempotent and also runs at finalization.
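Because close! is idempotent, eager cleanup can be wrapped in a try/finally without worrying about the finalizer running later. A minimal sketch, where `with_handle` is a hypothetical helper and not part of the package:

```julia
using PdfOxide

# Hypothetical helper: run `f(handle)` and always free the handle afterwards.
function with_handle(f, handle)
    try
        return f(handle)
    finally
        close!(handle)   # idempotent, so a later finalizer run is harmless
    end
end

pdf = from_markdown("# Hello\n\nbody\n")
bytes = with_handle(to_bytes, pdf)            # pdf is freed after serialization

n = with_handle(open_from_bytes(bytes)) do doc
    page_count(doc)                           # doc is freed when the block exits
end
println("pages: ", n)
```

The do-block form keeps the handle's lifetime visible in the source, which helps when many documents are processed in a loop.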
PdfDocument
The read-side handle for opening, extracting, searching, rendering, validating, and inspecting an existing PDF.
Opening
open_document(path::AbstractString; password::Union{Nothing,AbstractString} = nothing) -> PdfDocument
open_from_bytes(data::AbstractVector{UInt8}) -> PdfDocument
open_with_password(path::AbstractString, password::AbstractString) -> PdfDocument
| Function | Description |
| --- | --- |
| open_document(path; password=nothing) | Open a PDF from a filesystem path (optionally password-protected). |
| open_from_bytes(data) | Open a PDF from an in-memory byte vector. |
| open_with_password(path, password) | Open an encrypted PDF on disk with a password. |
Document Info
| Function | Returns | Description |
| --- | --- | --- |
| page_count(d) | Int | Number of pages. |
| version(d) | PdfVersion | PDF version (major/minor). |
| is_encrypted(d) | Bool | Whether the document is encrypted. |
| has_structure_tree(d) | Bool | Whether the document is a Tagged PDF with a structure tree. |
| has_xfa(d) | Bool | Whether the document carries an XFA form. |
| authenticate(d, password) | Bool | Authenticate against an encrypted document’s password (a wrong password returns false, not an error). |
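Since authenticate reports a wrong password as false instead of throwing, callers have to check the result themselves. The sketch below assumes that open_document succeeds on an encrypted file when no password is given (otherwise authenticate would have nothing to act on); `open_checked` is a hypothetical helper.

```julia
using PdfOxide

# Hypothetical helper: open `path` and authenticate only when needed.
# Assumes an encrypted file can be opened first and unlocked afterwards.
function open_checked(path::AbstractString, password::AbstractString)
    doc = open_document(path)
    if is_encrypted(doc) && !authenticate(doc, password)
        close!(doc)                       # do not leak the handle on failure
        error("wrong password for $path")
    end
    return doc
end
```

When the password is known up front, open_document(path; password = "...") or open_with_password(path, password) avoids the second step.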
Whole-Document Conversion
| Function | Returns | Description |
| --- | --- | --- |
| to_markdown_all(d) | String | Markdown for the whole document. |
| to_html_all(d) | String | HTML for the whole document. |
| to_plain_text_all(d) | String | Plain text for the whole document. |
| extract_all_text(d) | String | Whole-document auto text extraction. |
| classify_document(d) | String | Classify the whole document; returns the classifier’s JSON string. |
extract_text(d::PdfDocument, page::Integer) -> String
to_plain_text(d::PdfDocument, page::Integer) -> String
to_markdown(d::PdfDocument, page::Integer) -> String
to_html(d::PdfDocument, page::Integer) -> String
extract_structured_json(d::PdfDocument, page::Integer) -> String
| Function | Description |
| --- | --- |
| extract_text(d, page) | Extract plain text from a 0-based page. |
| to_plain_text(d, page) | Render a page to layout-aware plain text. |
| to_markdown(d, page) | Render a page to Markdown. |
| to_html(d, page) | Render a page to HTML. |
| extract_structured_json(d, page) | Extract a structured-document JSON model for a page. |
| extract_text_auto(d, page) | Auto-pick the best text extraction for a page. |
| extract_page_auto(d, page, options="{}") | Auto page extraction with a JSON options string. |
| classify_page(d, page) | Classify a page; returns the classifier’s JSON string. |
extract_chars(d::PdfDocument, page::Integer) -> Vector{Char}
extract_words(d::PdfDocument, page::Integer) -> Vector{Word}
extract_text_lines(d::PdfDocument, page::Integer) -> Vector{TextLine}
extract_tables(d::PdfDocument, page::Integer) -> Vector{Table}
embedded_fonts(d::PdfDocument, page::Integer) -> Vector{Font}
embedded_images(d::PdfDocument, page::Integer) -> Vector{Image}
page_annotations(d::PdfDocument, page::Integer) -> Vector{Annotation}
extract_paths(d::PdfDocument, page::Integer) -> Vector{Path}
| Function | Description |
| --- | --- |
| extract_chars(d, page) | Extract glyphs from a page as a Vector{Char}. |
| extract_words(d, page) | Extract words from a page as a Vector{Word}. |
| extract_text_lines(d, page) | Extract text lines as a Vector{TextLine}. |
| extract_tables(d, page) | Extract tables as a Vector{Table}. |
| embedded_fonts(d, page) | Embedded fonts as a Vector{Font}. |
| embedded_images(d, page) | Embedded images as a Vector{Image}. |
| page_annotations(d, page) | Annotations as a Vector{Annotation}. |
| extract_paths(d, page) | Vector paths as a Vector{Path}. |
extract_text_in_rect(d::PdfDocument, page::Integer, x, y, w, h) -> String
extract_words_in_rect(d::PdfDocument, page::Integer, x, y, w, h) -> Vector{Word}
extract_lines_in_rect(d::PdfDocument, page::Integer, x, y, w, h) -> Vector{TextLine}
extract_tables_in_rect(d::PdfDocument, page::Integer, x, y, w, h) -> Vector{Table}
extract_images_in_rect(d::PdfDocument, page::Integer, x, y, w, h) -> Vector{Image}
Each restricts extraction to the rectangle (x, y, w, h) in PDF user-space points on a 0-based page.
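A rectangle is easiest to build from the page dimensions. The sketch below extracts the text in the top half of a page; `top_half` and `top_half_text` are hypothetical helpers, and the rectangle assumes the usual PDF user-space origin at the lower-left corner, with (x, y) naming the rectangle's lower-left corner.

```julia
using PdfOxide

# Rectangle covering the top half of a page, as (x, y, w, h).
# ASSUMPTION: origin at the lower-left corner, y growing upwards.
top_half(width, height) = (0.0, height / 2, Float64(width), height / 2)

# Hypothetical helper: text found in the top half of a 0-based page.
function top_half_text(doc, page::Integer)
    w = page_get_width(doc, page)
    h = page_get_height(doc, page)
    return extract_text_in_rect(doc, page, top_half(w, h)...)
end
```

The same tuple can be splatted into extract_words_in_rect, extract_lines_in_rect, extract_tables_in_rect, or extract_images_in_rect, since they share the (x, y, w, h) argument order.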
Search
search(d::PdfDocument, page::Integer, term::AbstractString, caseSensitive::Bool) -> Vector{SearchResult}
search_all(d::PdfDocument, term::AbstractString, caseSensitive::Bool) -> Vector{SearchResult}
search_results_to_json(d::PdfDocument, term::AbstractString, caseSensitive::Bool) -> String
| Function | Description |
| --- | --- |
| search(d, page, term, caseSensitive) | Search a single page for term. |
| search_all(d, term, caseSensitive) | Search the whole document for term. |
| search_results_to_json(d, term, caseSensitive) | Serialize whole-document search results to JSON. |
Page Geometry
| Function | Returns | Description |
| --- | --- | --- |
| page_get_width(d, page) | Float64 | Page width (PDF points). |
| page_get_height(d, page) | Float64 | Page height (PDF points). |
| page_get_rotation(d, page) | Int | Absolute rotation (degrees) of a page. |
| page(d, index) | PdfPage | A 0-based page view over the document (see PdfPage). |

| Function | Returns | Description |
| --- | --- | --- |
| get_outline(d) | String | Bookmarks / table of contents (JSON). |
| get_page_labels(d) | String | Page-label ranges (JSON). |
| get_xmp_metadata(d) | String | XMP metadata (JSON). |
| get_source_bytes(d) | Vector{UInt8} | The document’s original source bytes. |
| plan_split_by_bookmarks(d, options="{}") | String | Plan a split-by-bookmarks operation (JSON). |
Annotation Inspection
annotations_to_json(d::PdfDocument, page::Integer) -> String
annotation_get_color(d::PdfDocument, page::Integer, index::Integer) -> UInt32
annotation_creation_date(d::PdfDocument, page::Integer, index::Integer) -> String
annotation_modification_date(d::PdfDocument, page::Integer, index::Integer) -> String
| Function | Returns | Description |
| --- | --- | --- |
| annotations_to_json(d, page) | String | All annotations on a page as JSON. |
| annotation_get_color(d, page, index) | UInt32 | 32-bit packed RGBA color of an annotation. |
| annotation_creation_date(d, page, index) | String | Annotation creation date. |
| annotation_modification_date(d, page, index) | String | Annotation modification date. |
| annotation_is_hidden(d, page, index) | Bool | Whether the annotation is hidden. |
| annotation_is_marked_deleted(d, page, index) | Bool | Whether the annotation is marked deleted. |
| annotation_is_printable(d, page, index) | Bool | Whether the annotation is printable. |
| annotation_is_read_only(d, page, index) | Bool | Whether the annotation is read-only. |
| link_annotation_uri(d, page, index) | String | URI of a Link annotation. |
| text_annotation_icon_name(d, page, index) | String | Icon name of a Text (sticky-note) annotation. |
| highlight_quad_points_count(d, page, index) | Int | Quad-point count of a Highlight annotation. |
| highlight_quad_point(d, page, index, quad_index) | NTuple{8,Float64} | The quad_index-th quad (8 floats). |
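The packed color returned by annotation_get_color can be split into channels with plain bit shifts. The byte order is not stated here, so the sketch below assumes 0xRRGGBBAA (red in the most significant byte); if the library packs channels differently, the shifts need to be reordered. `unpack_rgba` is a hypothetical helper.

```julia
# Hypothetical helper: split a packed 32-bit color into four 8-bit channels.
# ASSUMPTION: the layout is 0xRRGGBBAA; verify against the native library.
function unpack_rgba(c::UInt32)
    r = (c >> 24) % UInt8   # `% UInt8` keeps only the low 8 bits
    g = (c >> 16) % UInt8
    b = (c >> 8) % UInt8
    a = c % UInt8
    return (r, g, b, a)
end

# Normalized 0-1 floats, the range used for builder colors in this reference.
normalized(c::UInt32) = map(ch -> ch / 255, unpack_rgba(c))

println(unpack_rgba(0xff8000cc))
```

With a document open, the helper would be applied as unpack_rgba(annotation_get_color(doc, page, index)).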
Font & Element JSON Helpers
| Function | Returns | Description |
| --- | --- | --- |
| fonts_to_json(d, page) | String | Embedded fonts on a page as JSON. |
| font_size(d, page, index) | Float64 | Font size of the index-th embedded font. |
| page_get_elements(d, page) | ElementList | Page layout-region elements as an ElementList. |
get_form_fields(d::PdfDocument) -> Vector{FormField}
form_field_count(d::PdfDocument) -> Int
export_form_data_to_bytes(d::PdfDocument, format_type::Integer) -> Vector{UInt8}
import_form_data(d::PdfDocument, path::AbstractString) -> Int
form_import_from_file(d::PdfDocument, filename::AbstractString) -> Bool
| Function | Description |
| --- | --- |
| get_form_fields(d) | All AcroForm fields as a Vector{FormField}. |
| form_field_count(d) | Convenience field count. |
| export_form_data_to_bytes(d, format_type) | Export form data (format_type selects FDF/XFDF). |
| import_form_data(d, path) | Import form data from a file; returns the C status code. |
| form_import_from_file(d, filename) | Import form data; returns true on success. |
OCR
page_needs_ocr(d::PdfDocument, page::Integer) -> Bool
ocr_extract_text(d::PdfDocument, page::Integer, engine::Union{Nothing,OcrEngine} = nothing) -> String
| Function | Description |
| --- | --- |
| page_needs_ocr(d, page) | Whether a page is scanned/hybrid and needs OCR. |
| ocr_extract_text(d, page, engine=nothing) | Extract text via OCR (nothing falls back to native extraction only). See OcrEngine. |
Signatures (document-level)
sign(d::PdfDocument, cert::Certificate; reason::AbstractString = "", location::AbstractString = "") -> Int
get_signature_count(d::PdfDocument) -> Int
get_signature(d::PdfDocument, index::Integer) -> SignatureInfo
verify_all_signatures(d::PdfDocument) -> Int
has_timestamp(d::PdfDocument) -> Int
document_get_dss(d::PdfDocument) -> Union{Dss,Nothing}
| Function | Description |
| --- | --- |
| sign(d, cert; reason, location) | Sign the document with cert; returns a status code. |
| get_signature_count(d) | Number of signatures in the document. |
| get_signature(d, index) | The index-th signature as a SignatureInfo. |
| verify_all_signatures(d) | Verify all signatures; returns a status code. |
| has_timestamp(d) | Whether the document carries a document-level timestamp. |
| document_get_dss(d) | Read the document /DSS into a Dss, or nothing if none. |
Validation & Conversion
validate_pdf_a(d::PdfDocument, level::Integer) -> PdfAResults
validate_pdf_ua(d::PdfDocument, level::Integer) -> UaResults
validate_pdf_x(d::PdfDocument, level::Integer) -> PdfXResults
document_convert_to_pdf_a(d::PdfDocument, level::Integer) -> Bool
validatePdfA / validatePdfUa / validatePdfX are camelCase aliases of the three validators. See Validation Results.
Office Conversion
to_docx(d::PdfDocument) -> Vector{UInt8}
to_pptx(d::PdfDocument) -> Vector{UInt8}
to_xlsx(d::PdfDocument) -> Vector{UInt8}
Convert the PDF back to a DOCX / PPTX / XLSX byte buffer. Opening Office files as PDFs is done with open_from_docx_bytes / open_from_pptx_bytes / open_from_xlsx_bytes (see Office Input).
Lifecycle
| Function | Description |
| --- | --- |
| close!(d) | Free the native handle now (idempotent; also runs at finalization). |
PdfPage
A lightweight 0-based page view returned by page(doc, index). Methods delegate to the parent document.
p = page(doc, 0)
text(p) # -> String
markdown(p) # -> String
extract_words(p) # -> Vector{Word}
search(p, "term", false) # -> Vector{SearchResult}
render_page(p) # -> RenderedImage
| Method | Returns | Description |
| --- | --- | --- |
| text(p) | String | Plain text for the page. |
| markdown(p) | String | Markdown for the page. |
| html(p) | String | HTML for the page. |
| plain_text(p) | String | Layout-aware plain text. |
| extract_chars(p) | Vector{Char} | Glyphs. |
| extract_words(p) | Vector{Word} | Words. |
| extract_text_lines(p) | Vector{TextLine} | Text lines. |
| extract_tables(p) | Vector{Table} | Tables. |
| embedded_fonts(p) | Vector{Font} | Embedded fonts. |
| embedded_images(p) | Vector{Image} | Embedded images. |
| page_annotations(p) | Vector{Annotation} | Annotations. |
| extract_paths(p) | Vector{Path} | Vector paths. |
| search(p, term, caseSensitive) | Vector{SearchResult} | Search the page. |
| render_page(p, format=0) | RenderedImage | Render the page (0=PNG). |
| render_page_zoom(p, zoom, format=0) | RenderedImage | Render at a zoom factor. |
| render_page_thumbnail(p, size, format=0) | RenderedImage | Render a thumbnail fitting size px. |
Rendering
Render PdfDocument pages to images. format: 0=PNG (default), 1=JPEG. Coordinates are in PDF user-space points.
render_page(d::PdfDocument, pageIndex::Integer, format::Integer = 0) -> RenderedImage
render_page_zoom(d::PdfDocument, pageIndex::Integer, zoom::Real, format::Integer = 0) -> RenderedImage
render_page_thumbnail(d::PdfDocument, pageIndex::Integer, size::Integer, format::Integer = 0) -> RenderedImage
render_page_region(d::PdfDocument, page::Integer, crop_x, crop_y, crop_width, crop_height, format::Integer = 0) -> RenderedImage
render_page_fit(d::PdfDocument, page::Integer, w::Integer, h::Integer, format::Integer = 0) -> RenderedImage
render_page_raw(d::PdfDocument, page::Integer, dpi::Integer) -> Tuple{RenderedImage,Int,Int}
render_page_with_options(d::PdfDocument, page::Integer, dpi, format, bg_r, bg_g, bg_b, bg_a, transparent_background, render_annotations, jpeg_quality) -> RenderedImage
render_page_with_options_ex(d::PdfDocument, page::Integer, dpi, format, bg_r, bg_g, bg_b, bg_a, transparent_background, render_annotations, jpeg_quality, excluded_layers::AbstractVector{<:AbstractString} = String[]) -> RenderedImage
| Function | Description |
| --- | --- |
| render_page(d, page, format=0) | Render a page (0=PNG). |
| render_page_zoom(d, page, zoom, format=0) | Render at a zoom factor. |
| render_page_thumbnail(d, page, size, format=0) | Render a thumbnail fitting size px. |
| render_page_region(d, page, x, y, w, h, format=0) | Render a rectangular crop (user-space points). |
| render_page_fit(d, page, w, h, format=0) | Render to fit inside w×h px, preserving aspect ratio. |
| render_page_raw(d, page, dpi) | Render to a raw premultiplied RGBA8888 buffer; returns (image, width, height). |
| render_page_with_options(d, page, dpi, format, bg_r, bg_g, bg_b, bg_a, transparent_background, render_annotations, jpeg_quality) | Render with the full RenderOptions surface (background channels 0–1; flags are 0/1). |
| render_page_with_options_ex(..., excluded_layers=String[]) | As above, plus OCG layer filtering: suppress the layers whose /Name appears in excluded_layers. |
| estimate_render_time(d, page) | Estimate render time (ms) for a page. |
renderPage / renderPageZoom / renderPageThumbnail are camelCase aliases of render_page / render_page_zoom / render_page_thumbnail.
Renderer
A reusable renderer handle.
create_renderer(dpi::Integer = 150, format::Integer = 0, quality::Integer = 90, anti_alias::Bool = true) -> Renderer
close!(r::Renderer)
RenderedImage
The result of a render call. Fields: width::Int, height::Int, data::Vector{UInt8} (encoded bytes, or raw RGBA for render_page_raw).
save(img::RenderedImage, path::AbstractString) # write to disk (format inferred)
close!(img::RenderedImage)
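A short sketch of the two common render paths: an encoded PNG saved to disk, and a raw RGBA buffer. It only uses functions listed in this section, and presumably needs a native library built with the rendering feature. The output path is a temporary file chosen for illustration, and the expected raw size assumes tightly packed rows with no padding.

```julia
using PdfOxide

pdf = from_markdown("# Hello\n\nbody\n")
doc = open_from_bytes(to_bytes(pdf))
close!(pdf)

# Encoded PNG (format 0, the default) written to disk.
png_path = joinpath(tempdir(), "page0.png")
png = render_page(doc, 0)
save(png, png_path)
close!(png)

# Raw render at 72 dpi: RGBA8888 means 4 bytes per pixel.
raw, w, h = render_page_raw(doc, 0, 72)
expected = w * h * 4        # assumes no row padding in the buffer
println("raw bytes: ", length(raw.data), " (expected ", expected, ")")
close!(raw)

close!(doc)
```

Closing each RenderedImage as soon as it has been saved keeps peak memory low when rendering many pages.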
DocumentEditor
The edit-side handle for mutating and re-saving an existing PDF.
Opening & Source
open_editor(path::AbstractString) -> DocumentEditor
open_editor_from_bytes(data::AbstractVector{UInt8}) -> DocumentEditor
| Function | Returns | Description |
| --- | --- | --- |
| is_modified(e) | Bool | Whether the editor has unsaved modifications. |
| get_source_path(e) | String | Source path the editor was opened from. |
| page_count(e) | Int | Number of pages. |
| version(e) | PdfVersion | PDF version. |

| Function | Description |
| --- | --- |
| get_producer(e) | Producer from /Info.Producer. |
| set_producer(e, value) | Set /Info.Producer. |
| get_creation_date(e) | Creation date from /Info.CreationDate (raw PDF date string). |
| set_creation_date(e, date_str) | Set /Info.CreationDate (raw PDF date string). |
Saving
save(e::DocumentEditor, path::AbstractString)
save_to_bytes(e::DocumentEditor) -> Vector{UInt8}
save_to_bytes_with_options(e::DocumentEditor, compress::Bool, garbage_collect::Bool, linearize::Bool) -> Vector{UInt8}
extract_pages_to_bytes(e::DocumentEditor, pages::AbstractVector{<:Integer}) -> Vector{UInt8}
save_encrypted(e::DocumentEditor, path::AbstractString, user_password::AbstractString, owner_password::AbstractString)
save_encrypted_to_bytes(e::DocumentEditor, user_password::AbstractString, owner_password::AbstractString) -> Vector{UInt8}
convert_to_pdf_a(e::DocumentEditor, level::Integer)
| Function | Description |
| --- | --- |
| save(e, path) | Save the edited document to a path. |
| save_to_bytes(e) | Serialize the edited document to bytes. |
| save_to_bytes_with_options(e, compress, garbage_collect, linearize) | Serialize with compression / GC / linearization options. |
| extract_pages_to_bytes(e, pages) | Extract a subset of pages to a new in-memory PDF. |
| save_encrypted(e, path, user_pw, owner_pw) | Save with AES-256 encryption to a path. |
| save_encrypted_to_bytes(e, user_pw, owner_pw) | Save with AES-256 encryption to bytes. |
| convert_to_pdf_a(e, level) | Convert to PDF/A in-place (level: 0=A1b…7=A3u). |
Merge & Attachments
| Function | Description |
| --- | --- |
| merge_from(e, source_path) | Merge pages from a PDF on disk. |
| merge_from_bytes(e, data) | Merge pages from an in-memory PDF buffer. |
| embed_file(e, name, data) | Embed a file attachment (name, data bytes). |
Page Operations
rotate_all_pages(e::DocumentEditor, degrees::Integer)
rotate_page_by(e::DocumentEditor, page::Integer, degrees::Integer)
get_page_rotation(e::DocumentEditor, page::Integer) -> Int
set_page_rotation(e::DocumentEditor, page::Integer, degrees::Integer)
delete_page(e::DocumentEditor, page::Integer)
move_page(e::DocumentEditor, from::Integer, to::Integer)
| Function | Description |
| --- | --- |
| rotate_all_pages(e, degrees) | Rotate all pages (relative). |
| rotate_page_by(e, page, degrees) | Rotate a page additively. |
| get_page_rotation(e, page) | Absolute rotation of a page. |
| set_page_rotation(e, page, degrees) | Set absolute rotation of a page. |
| delete_page(e, page) | Delete a page. |
| move_page(e, from, to) | Move a page from one index to another. |
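A minimal edit-and-save round trip, as a sketch: it builds a one-page PDF in memory, rotates the page, and serializes the result with explicit save options. Only functions from this section and the Pdf factories are used; the option values are arbitrary choices for illustration.

```julia
using PdfOxide

pdf = from_markdown("# One\n\nfirst page\n")
editor = open_editor_from_bytes(to_bytes(pdf))
close!(pdf)

rotate_page_by(editor, 0, 90)             # relative: adds 90 degrees to page 0
rotation = get_page_rotation(editor, 0)   # absolute rotation after the change
println("page 0 rotation: ", rotation, ", modified: ", is_modified(editor))

# Positional flags: compress = true, garbage_collect = true, linearize = false
out = save_to_bytes_with_options(editor, true, true, false)
close!(editor)

println("saved ", length(out), " bytes")
```

set_page_rotation would be the choice when the final orientation matters more than the delta.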
Page Boxes & Cropping
get_page_media_box(e::DocumentEditor, page::Integer) -> NTuple{4,Float64}
get_page_crop_box(e::DocumentEditor, page::Integer) -> NTuple{4,Float64}
set_page_media_box(e::DocumentEditor, page::Integer, x, y, w, h)
set_page_crop_box(e::DocumentEditor, page::Integer, x, y, w, h)
crop_margins(e::DocumentEditor, left::Real, right::Real, top::Real, bottom::Real)
| Function | Description |
| --- | --- |
| get_page_media_box(e, page) | Get the MediaBox of a page. |
| get_page_crop_box(e, page) | Get the CropBox of a page. |
| set_page_media_box(e, page, x, y, w, h) | Set the MediaBox of a page. |
| set_page_crop_box(e, page, x, y, w, h) | Set the CropBox of a page. |
| crop_margins(e, left, right, top, bottom) | Crop all pages by margins (user-space). |
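For a single page, a crop box can be derived from the media box. The sketch below assumes the getters return the box as (x, y, w, h), matching the setter's argument order; if they return corner coordinates instead, the arithmetic has to change. `inset` and `crop_page_inset!` are hypothetical helpers.

```julia
using PdfOxide

# Shrink a box given as (x, y, w, h) by `margin` points on every side.
inset(box::NTuple{4,<:Real}, margin::Real) =
    (box[1] + margin, box[2] + margin, box[3] - 2margin, box[4] - 2margin)

# Hypothetical helper: set the CropBox of one page to the inset MediaBox.
function crop_page_inset!(editor, page::Integer, margin::Real)
    media = get_page_media_box(editor, page)   # ASSUMED layout: (x, y, w, h)
    set_page_crop_box(editor, page, inset(media, margin)...)
    return get_page_crop_box(editor, page)
end
```

crop_margins covers the common case of applying the same margins to every page in one call.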
Erase / Whiteout
erase_region(e::DocumentEditor, page::Integer, x, y, w, h)
erase_regions(e::DocumentEditor, page::Integer, rects::AbstractVector{<:NTuple{4,<:Real}})
clear_erase_regions(e::DocumentEditor, page::Integer)
| Function | Description |
| --- | --- |
| erase_region(e, page, x, y, w, h) | Erase one rectangular region. |
| erase_regions(e, page, rects) | Erase multiple regions (rects is a vector of (x, y, w, h) tuples). |
| clear_erase_regions(e, page) | Clear all pending erase-region entries for a page. |
erase_header(d::PdfDocument, page::Integer)
erase_footer(d::PdfDocument, page::Integer)
erase_artifacts(d::PdfDocument, page::Integer)
remove_headers(d::PdfDocument, threshold::Real = 0.5)
remove_footers(d::PdfDocument, threshold::Real = 0.5)
remove_artifacts(d::PdfDocument, threshold::Real = 0.5)
| Function | Description |
| --- | --- |
| erase_header(d, page) / erase_footer(d, page) / erase_artifacts(d, page) | Erase the detected header / footer / artifacts on a page. |
| remove_headers(d, threshold=0.5) / remove_footers(...) / remove_artifacts(...) | Remove repeating headers / footers / artifacts across the document above a frequency threshold. |
Annotation Flatten
flatten_annotations(e::DocumentEditor, page::Integer)
flatten_all_annotations(e::DocumentEditor)
is_page_marked_for_flatten(e::DocumentEditor, page::Integer) -> Bool
unmark_page_for_flatten(e::DocumentEditor, page::Integer)
Redaction
apply_page_redactions(e::DocumentEditor, page::Integer)
apply_all_redactions(e::DocumentEditor)
is_page_marked_for_redaction(e::DocumentEditor, page::Integer) -> Bool
unmark_page_for_redaction(e::DocumentEditor, page::Integer)
redaction_add(e::DocumentEditor, page::Integer, x1, y1, x2, y2, r, g, b)
redaction_count(e::DocumentEditor, page::Integer) -> Int
redaction_apply(e::DocumentEditor, scrub_metadata::Bool, r::Real, g::Real, b::Real) -> Int
redaction_scrub_metadata(e::DocumentEditor) -> Int
| Function | Description |
| --- | --- |
| apply_page_redactions(e, page) | Apply (burn in) redactions on a page. |
| apply_all_redactions(e) | Apply all pending redactions. |
| is_page_marked_for_redaction(e, page) | Whether a page is marked for redaction. |
| unmark_page_for_redaction(e, page) | Remove the redaction mark from a page. |
| redaction_add(e, page, x1, y1, x2, y2, r, g, b) | Queue a redaction box with overlay color (DeviceRGB, 0–1). |
| redaction_count(e, page) | Number of queued redaction regions on a page. |
| redaction_apply(e, scrub_metadata, r, g, b) | Destructively apply all queued redactions; returns glyphs purged. |
| redaction_scrub_metadata(e) | Strip Info/XMP/JavaScript/EmbeddedFiles; returns count removed. |
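The queue-then-apply flow can be sketched as follows. `redact_box!` is a hypothetical helper; the role of the r, g, b arguments to redaction_apply is not spelled out in this reference, so they are simply passed as black here on the assumption that they describe a fill color.

```julia
using PdfOxide

# Hypothetical helper: queue one black redaction box on a page, then apply
# every queued redaction and scrub document metadata in the same pass.
function redact_box!(editor, page::Integer, x1, y1, x2, y2)
    redaction_add(editor, page, x1, y1, x2, y2, 0.0, 0.0, 0.0)  # black overlay
    queued = redaction_count(editor, page)
    # scrub_metadata = true; color arguments assumed to be a fill color
    purged = redaction_apply(editor, true, 0.0, 0.0, 0.0)
    return (queued = queued, purged = purged)
end
```

Because redaction_apply is destructive, it is worth keeping the original bytes until the redacted output has been reviewed.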
set_form_field_value(e::DocumentEditor, name::AbstractString, value::AbstractString)
flatten_forms(e::DocumentEditor)
flatten_forms_on_page(e::DocumentEditor, page::Integer)
flatten_warnings_count(e::DocumentEditor) -> Int
flatten_warning(e::DocumentEditor, index::Integer) -> String
import_fdf_bytes(e::DocumentEditor, data::AbstractVector{UInt8})
import_xfdf_bytes(e::DocumentEditor, data::AbstractVector{UInt8})
| Function | Description |
| --- | --- |
| set_form_field_value(e, name, value) | Set a form field value (UTF-8). |
| flatten_forms(e) | Flatten all forms (bake values into page content). |
| flatten_forms_on_page(e, page) | Flatten forms on a single page. |
| flatten_warnings_count(e) | Number of warnings from the last form-flatten save. |
| flatten_warning(e, index) | The index-th flatten warning string. |
| import_fdf_bytes(e, data) | Import FDF form data from bytes. |
| import_xfdf_bytes(e, data) | Import XFDF form data from bytes. |
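Filling and flattening usually go together, and the warnings only become meaningful after a save. A sketch under two assumptions: warning indices are 0-based like page indices, and `fill_and_flatten` is a hypothetical helper name.

```julia
using PdfOxide

# Hypothetical helper: set field values, flatten, save, and collect warnings.
function fill_and_flatten(editor, values::AbstractDict)
    for (name, value) in values
        set_form_field_value(editor, name, value)
    end
    flatten_forms(editor)
    bytes = save_to_bytes(editor)   # warnings refer to this flatten save
    n = flatten_warnings_count(editor)
    # ASSUMPTION: warning indices run from 0 to n - 1.
    warnings = String[flatten_warning(editor, i) for i in 0:(n - 1)]
    return bytes, warnings
end
```

A call might look like fill_and_flatten(editor, Dict("name" => "Ada", "city" => "London")), with field names taken from get_form_fields on the read side.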
Barcode Stamping
add_barcode_to_page(e::DocumentEditor, page::Integer, b::Barcode, x, y, width, height)
Stamp a Barcode onto a page at rect (x, y, width, height). See Barcodes.
Lifecycle
close!(e::DocumentEditor) frees the handle.
Pdf
The lightweight create-side handle returned by the from_* factories.
Factories
from_markdown(input::AbstractString) -> Pdf
from_html(input::AbstractString) -> Pdf
from_text(input::AbstractString) -> Pdf
from_image(path::AbstractString) -> Pdf
from_image_bytes(data::AbstractVector{UInt8}) -> Pdf
from_html_css(html::AbstractString, css::AbstractString, font_bytes::Union{Nothing,AbstractVector{UInt8}} = nothing) -> Pdf
from_html_css_with_fonts(html::AbstractString, css::AbstractString, families::AbstractVector{<:AbstractString}, fonts::AbstractVector{<:AbstractVector{UInt8}}) -> Pdf
| Function | Description |
| --- | --- |
| from_markdown(input) | Build a Pdf from Markdown. |
| from_html(input) | Build a Pdf from HTML. |
| from_text(input) | Build a Pdf from plain text. |
| from_image(path) | Build a Pdf from an image file. |
| from_image_bytes(data) | Build a Pdf from in-memory image bytes. |
| from_html_css(html, css, font_bytes=nothing) | Build from HTML + CSS with one optional embedded font. |
| from_html_css_with_fonts(html, css, families, fonts) | Build from HTML + CSS with a multi-font cascade (families[i] names fonts[i]). |
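The parallel families/fonts arguments are easiest to keep aligned when both are derived from one list of pairs. In this sketch `build_report` is a hypothetical helper, and the family names and font paths in the usage note are placeholders, not files shipped with the package.

```julia
using PdfOxide

# Hypothetical helper: `font_files` maps a CSS family name to a TTF/OTF path.
function build_report(font_files::AbstractVector{<:Pair}, out_path::AbstractString)
    families = [String(first(p)) for p in font_files]   # families[i] names ...
    fonts    = [read(last(p)) for p in font_files]      # ... fonts[i] (raw bytes)

    html = "<h1>Report</h1><p>Body text and <code>code</code>.</p>"
    css  = "h1, p { font-family: 'Body'; } code { font-family: 'Mono'; }"

    pdf = from_html_css_with_fonts(html, css, families, fonts)
    try
        save(pdf, out_path)
    finally
        close!(pdf)
    end
    return out_path
end
```

It would be called as build_report(["Body" => "fonts/Body-Regular.ttf", "Mono" => "fonts/Mono-Regular.ttf"], "report.pdf") once real font files are available.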
Methods
| Function | Returns | Description |
| --- | --- | --- |
| save(p, path) | (none) | Write the built PDF to a path. |
| to_bytes(p) | Vector{UInt8} | Serialize the built PDF to bytes. |
| get_page_count(p) | Int | Page count of a builder Pdf. |
| close!(p) | (none) | Free the handle. |
merge_pdfs
merge_pdfs(paths::AbstractVector{<:AbstractString}) -> Vector{UInt8}
Merge the PDFs at paths (in order) into a single PDF byte buffer.
Office Input
open_from_docx_bytes(data::AbstractVector{UInt8}) -> Pdf
open_from_pptx_bytes(data::AbstractVector{UInt8}) -> Pdf
open_from_xlsx_bytes(data::AbstractVector{UInt8}) -> Pdf
Convert DOCX / PPTX / XLSX bytes into a Pdf.
DocumentBuilder
A fluent, structure-aware PDF builder. Create pages with a4_page / letter_page / page, lay out content on the returned PageBuilder, call done to commit each page, then produce the output with build, save, or one of the encrypted variants (save_encrypted_builder, to_bytes_encrypted).
DocumentBuilder() -> DocumentBuilder
Document-Level Configuration
set_title(b::DocumentBuilder, value::AbstractString)
set_author(b::DocumentBuilder, value::AbstractString)
set_subject(b::DocumentBuilder, value::AbstractString)
set_keywords(b::DocumentBuilder, value::AbstractString)
set_creator(b::DocumentBuilder, value::AbstractString)
on_open(b::DocumentBuilder, value::AbstractString)
language(b::DocumentBuilder, value::AbstractString)
tagged_pdf_ua1(b::DocumentBuilder)
role_map(b::DocumentBuilder, custom::AbstractString, standard::AbstractString)
register_embedded_font(b::DocumentBuilder, name::AbstractString, f::EmbeddedFont)
| Function | Description |
| --- | --- |
| set_title / set_author / set_subject / set_keywords / set_creator | Set the corresponding /Info metadata field. |
| on_open(b, value) | Set a document-open JavaScript action. |
| language(b, value) | Set the document language (e.g. "en-US"). |
| tagged_pdf_ua1(b) | Enable PDF/UA-1 tagged-PDF mode. |
| role_map(b, custom, standard) | Map a custom structure type to a standard one. |
| register_embedded_font(b, name, f) | Register a loaded EmbeddedFont under name (consumes the font). |
Pages
a4_page(b::DocumentBuilder) -> PageBuilder
letter_page(b::DocumentBuilder) -> PageBuilder
page(b::DocumentBuilder, width::Real, height::Real) -> PageBuilder
Output
build(b::DocumentBuilder) -> Vector{UInt8}
save(b::DocumentBuilder, path::AbstractString)
save_encrypted_builder(b::DocumentBuilder, path::AbstractString, user_password::AbstractString, owner_password::AbstractString)
to_bytes_encrypted(b::DocumentBuilder, user_password::AbstractString, owner_password::AbstractString) -> Vector{UInt8}
| Function | Description |
| --- | --- |
| build(b) | Build and return the PDF bytes (the builder must still be closed). |
| save(b, path) | Build and save to a path. |
| save_encrypted_builder(b, path, user_pw, owner_pw) | Build and save with AES-256 encryption. |
| to_bytes_encrypted(b, user_pw, owner_pw) | Build encrypted bytes (AES-256). |
EmbeddedFont
embedded_font_from_file(path::AbstractString) -> EmbeddedFont
embedded_font_from_bytes(data::AbstractVector{UInt8}; name::Union{Nothing,AbstractString} = nothing) -> EmbeddedFont
| Function | Description |
| --- | --- |
| embedded_font_from_file(path) | Load a TTF/OTF font from a path. |
| embedded_font_from_bytes(data; name) | Load a font from bytes (name may be empty to use the PostScript name). |
PageBuilder
Returned by a4_page / letter_page / page. All methods mutate the page in place; call done(p) to commit it to the parent builder (this consumes the page handle).
Text & Layout
font(p::PageBuilder, name::AbstractString, size::Real)
at(p::PageBuilder, x::Real, y::Real)
heading(p::PageBuilder, level::Integer, text::AbstractString)
| Method | Description |
| --- | --- |
| font(p, name, size) | Set the font + size for subsequent text. |
| at(p, x, y) | Move the cursor to absolute (x, y) (points, from lower-left). |
| heading(p, level, text) | Emit a heading at level (1–6). |
The following string-valued methods all share the signature f(p::PageBuilder, value::AbstractString):
| Method | Description |
| --- | --- |
| text(p, value) | Emit a run of body text. |
| paragraph(p, value) | Emit a wrapped paragraph. |
| link_url(p, value) | Make the previous text a URL link. |
| link_named(p, value) | Link the previous text to a named destination. |
| link_javascript(p, value) | Attach a JavaScript action to the previous text. |
| on_open(p, value) | Set a page-open JavaScript action. |
| on_close(p, value) | Set a page-close JavaScript action. |
| field_keystroke(p, value) / field_format(p, value) / field_validate(p, value) / field_calculate(p, value) | Attach AcroForm field JavaScript actions. |
| sticky_note(p, value) | Attach a sticky note to the previous content. |
| watermark(p, value) | Add a text watermark. |
| stamp(p, value) | Add a stamp annotation. |
| inline(p, value) | Append an inline text run. |
| inline_bold(p, value) | Append a bold inline run. |
| inline_italic(p, value) | Append an italic inline run. |
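Put together, a one-page document might be built as in the sketch below. The font name "Helvetica" is an assumption (the set of accepted font names is not listed here), and the coordinates are arbitrary points on an A4 page measured from the lower-left corner.

```julia
using PdfOxide

b = DocumentBuilder()
set_title(b, "Builder demo")
language(b, "en-US")

p = a4_page(b)
font(p, "Helvetica", 12)                 # ASSUMPTION: this font name is accepted
at(p, 72, 770)                           # cursor position in points
heading(p, 1, "Builder demo")
paragraph(p, "A wrapped paragraph of body text.")
text(p, "Project page")
link_url(p, "https://example.com")       # turns the previous text into a link
done(p)                                  # commits the page; p is consumed

bytes = build(b)
close!(b)                                # build does not free the builder

println("built ", length(bytes), " bytes")
```

Methods such as link_url act on the most recently emitted text, so the order of calls matters.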
Zero-argument layout methods, all f(p::PageBuilder):
| Method | Description |
| --- | --- |
| horizontal_rule(p) | Draw a horizontal rule. |
| space(p) | Insert vertical space. |
| newline(p) | Break to a new line. |
| new_page_same_size(p) | Start a new page with the same dimensions. |
| watermark_confidential(p) | Add a “CONFIDENTIAL” watermark. |
| watermark_draft(p) | Add a “DRAFT” watermark. |
link_page(p::PageBuilder, page_index::Integer)
sticky_note_at(p::PageBuilder, x::Real, y::Real, text::AbstractString)
freetext(p::PageBuilder, x::Real, y::Real, w::Real, h::Real, text::AbstractString)
footnote(p::PageBuilder, ref_mark::AbstractString, note_text::AbstractString)
columns(p::PageBuilder, column_count::Integer, gap_pt::Real, text::AbstractString)
inline_color(p::PageBuilder, r::Real, g::Real, b_::Real, text::AbstractString)
| Method | Description |
| --- | --- |
| link_page(p, page_index) | Link the previous text to an internal page index. |
| sticky_note_at(p, x, y, text) | Place a free-standing sticky note. |
| freetext(p, x, y, w, h, text) | Place a free-flowing text annotation in a rect. |
| footnote(p, ref_mark, note_text) | Add a footnote (inline superscript + body at page end). |
| columns(p, column_count, gap_pt, text) | Lay out text across balanced columns. |
| inline_color(p, r, g, b_, text) | Append an RGB-colored inline run (channels 0–1). |
Markup Annotations
All four share f(p::PageBuilder, r::Real, g::Real, b_::Real) and apply to the previous text with an RGB color:
| Method | Description |
| --- | --- |
| highlight(p, r, g, b) | Highlight the previous text. |
| underline(p, r, g, b) | Underline the previous text. |
| strikeout(p, r, g, b) | Strike out the previous text. |
| squiggly(p, r, g, b) | Squiggly-underline the previous text. |
text_field(p::PageBuilder, name, x, y, w, h; default_value::Union{Nothing,AbstractString} = nothing)
checkbox(p::PageBuilder, name, x, y, w, h, checked::Bool)
combo_box(p::PageBuilder, name, x, y, w, h, options::AbstractVector{<:AbstractString}; selected::Union{Nothing,AbstractString} = nothing)
radio_group(p::PageBuilder, name, values, xs, ys, ws, hs; selected::Union{Nothing,AbstractString} = nothing)
push_button(p::PageBuilder, name, x, y, w, h, caption::AbstractString)
signature_field(p::PageBuilder, name, x, y, w, h)
| Method | Description |
| --- | --- |
| text_field(p, name, x, y, w, h; default_value) | Add a single-line text field. |
| checkbox(p, name, x, y, w, h, checked) | Add a checkbox with an initial ticked state. |
| combo_box(p, name, x, y, w, h, options; selected) | Add a dropdown combo box. |
| radio_group(p, name, values, xs, ys, ws, hs; selected) | Add a radio group (parallel arrays describe each button). |
| push_button(p, name, x, y, w, h, caption) | Add a clickable push button. |
| signature_field(p, name, x, y, w, h) | Add an unsigned signature placeholder. |
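The parallel-array convention of radio_group is the least obvious of these signatures: element i of each array describes button i. A sketch with arbitrary field names and coordinates (points on a US Letter page):

```julia
using PdfOxide

b = DocumentBuilder()
p = letter_page(b)

text_field(p, "name", 72, 700, 200, 20; default_value = "Ada")
checkbox(p, "subscribe", 72, 670, 12, 12, true)
combo_box(p, "plan", 72, 640, 120, 20, ["free", "pro"]; selected = "free")

# One entry per button: choices[i] sits at (xs[i], ys[i]), sized ws[i] x hs[i].
choices = ["yes", "no"]
xs = [72.0, 132.0]
ys = [610.0, 610.0]
ws = [12.0, 12.0]
hs = [12.0, 12.0]
radio_group(p, "agree", choices, xs, ys, ws, hs; selected = "yes")

done(p)
bytes = build(b)
close!(b)
```

All five arrays must have the same length; a mismatch would describe an incomplete button.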
Images & Barcodes
image(p::PageBuilder, bytes::AbstractVector{UInt8}, x, y, w, h)
image_with_alt(p::PageBuilder, bytes::AbstractVector{UInt8}, x, y, w, h, alt_text::AbstractString)
image_artifact(p::PageBuilder, bytes::AbstractVector{UInt8}, x, y, w, h)
barcode_1d(p::PageBuilder, barcode_type::Integer, data::AbstractString, x, y, w, h)
barcode_qr(p::PageBuilder, data::AbstractString, x, y, size::Real)
| Method | Description |
| --- | --- |
| image(p, bytes, x, y, w, h) | Embed an image (raw JPEG/PNG/WebP). |
| image_with_alt(p, bytes, x, y, w, h, alt_text) | Embed an image with accessibility alt text. |
| image_artifact(p, bytes, x, y, w, h) | Embed a decorative image as an /Artifact. |
| barcode_1d(p, barcode_type, data, x, y, w, h) | Place a 1-D barcode (barcode_type: 0=Code128…7=Codabar). |
| barcode_qr(p, data, x, y, size) | Place a square QR code. |
Vector Graphics
rect(p::PageBuilder, x, y, w, h)
filled_rect(p::PageBuilder, x, y, w, h, r, g, b_)
line(p::PageBuilder, x1, y1, x2, y2)
stroke_rect(p::PageBuilder, x, y, w, h, width, r, g, b_)
stroke_line(p::PageBuilder, x1, y1, x2, y2, width, r, g, b_)
stroke_rect_dashed(p::PageBuilder, x, y, w, h, width, r, g, b_, dash_array::AbstractVector{<:Real}, phase::Real)
stroke_line_dashed(p::PageBuilder, x1, y1, x2, y2, width, r, g, b_, dash_array::AbstractVector{<:Real}, phase::Real)
text_in_rect(p::PageBuilder, x, y, w, h, text::AbstractString, align::Integer)
| Method | Description |
| --- | --- |
| `rect(p, x, y, w, h)` | Stroked 1pt black rectangle outline. |
| `filled_rect(p, x, y, w, h, r, g, b)` | Filled rectangle in RGB (0–1). |
| `line(p, x1, y1, x2, y2)` | 1pt black line. |
| `stroke_rect(p, x, y, w, h, width, r, g, b)` | Stroked rectangle with a line `width` in points and an RGB color. |
| `stroke_line(p, x1, y1, x2, y2, width, r, g, b)` | Stroked line with a line `width` in points and an RGB color. |
| `stroke_rect_dashed(...)` | Dashed stroked rectangle (dash_array on/off lengths, phase offset). |
| `stroke_line_dashed(...)` | Dashed stroked line. |
| `text_in_rect(p, x, y, w, h, text, align)` | Draw text inside a rect (align: 0=Left, 1=Center, 2=Right). |
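Colors are given as components in the 0–1 range, while design tools usually hand out hex strings. A minimal conversion sketch follows; hex_to_rgb01 is a hypothetical helper, not a PdfOxide.jl function.

```julia
# Hypothetical helper (not part of PdfOxide.jl): parse "#RRGGBB" into an
# (r, g, b) tuple of Float64 components in the 0–1 range.
function hex_to_rgb01(hex::AbstractString)
    s = lstrip(hex, '#')
    length(s) == 6 || throw(ArgumentError("expected #RRGGBB, got $hex"))
    # Each colour channel is two hex digits; divide by 255 to normalise.
    r, g, b = (parse(Int, s[i:i+1]; base = 16) / 255 for i in (1, 3, 5))
    return (r, g, b)
end

fill_color = hex_to_rgb01("#FF0080")

# With a PageBuilder `p` in scope, the tuple can be splatted into the call:
#   filled_rect(p, 72, 700, 200, 40, fill_color...)
```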
Tables
table(p::PageBuilder, n_columns::Integer, widths::AbstractVector{<:Real}, aligns::AbstractVector{<:Integer}, n_rows::Integer, cell_strings::AbstractMatrix{<:AbstractString}, has_header::Bool)
streaming_table_begin(p::PageBuilder, headers::AbstractVector{<:AbstractString}, widths::AbstractVector{<:Real}, aligns::AbstractVector{<:Integer}, repeat_header::Bool)
streaming_table_begin_v2(p::PageBuilder, headers, widths, aligns, repeat_header::Bool, mode::Integer, sample_rows::Integer, min_col_width_pt::Real, max_col_width_pt::Real, max_rowspan::Integer)
streaming_table_set_batch_size(p::PageBuilder, batch_size::Integer)
streaming_table_pending_row_count(p::PageBuilder) -> Int
streaming_table_batch_count(p::PageBuilder) -> Int
streaming_table_push_row(p::PageBuilder, cells::AbstractVector{<:AbstractString})
streaming_table_push_row_v2(p::PageBuilder, cells::AbstractVector{<:AbstractString}, rowspans::AbstractVector{<:Integer})
streaming_table_flush(p::PageBuilder)
streaming_table_finish(p::PageBuilder)
| Method | Description |
| --- | --- |
| `table(p, n_columns, widths, aligns, n_rows, cell_strings, has_header)` | Emit a buffered table (aligns: 0=Left/1=Center/2=Right; cell_strings is row-major n_rows × n_columns). |
| `streaming_table_begin(p, headers, widths, aligns, repeat_header)` | Open a streaming table (parallel length-n_columns arrays). |
| `streaming_table_begin_v2(...)` | Open a streaming table with width mode (0=Fixed, 1=Sample, 2=AutoAll) and max_rowspan. |
| `streaming_table_set_batch_size(p, batch_size)` | Set the batch size (0 selects the default, 256). |
| `streaming_table_pending_row_count(p)` | Rows pushed since the last batch boundary. |
| `streaming_table_batch_count(p)` | Number of complete batches so far. |
| `streaming_table_push_row(p, cells)` | Push one row (rowspan=1). |
| `streaming_table_push_row_v2(p, cells, rowspans)` | Push one row with per-cell rowspans (a value ≥ 2 spans multiple rows). |
| `streaming_table_flush(p)` | Flush the current batch. |
| `streaming_table_finish(p)` | Finish the streaming table. |
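Table data often arrives as a vector of rows, whereas table expects an n_rows × n_columns matrix. The sketch below shows one way to build that matrix; rows_to_cells is a hypothetical helper, and it assumes cells[i, j] addresses row i, column j (how the binding flattens the matrix for the C ABI is not covered here).

```julia
# Hypothetical helper (not part of PdfOxide.jl): turn a vector of rows into an
# n_rows × n_columns Matrix{String}, so that cells[i, j] is row i, column j.
function rows_to_cells(rows::AbstractVector{<:AbstractVector{<:AbstractString}})
    isempty(rows) && throw(ArgumentError("at least one row is required"))
    n_columns = length(first(rows))
    all(r -> length(r) == n_columns, rows) ||
        throw(ArgumentError("all rows must have $n_columns cells"))
    return [String(rows[i][j]) for i in eachindex(rows), j in 1:n_columns]
end

rows = [["Item", "Qty"], ["Apple", "3"], ["Pear", "5"]]
cells = rows_to_cells(rows)

# With a PageBuilder `p` in scope, the dimensions come from the matrix itself
# (the widths and aligns values below are illustrative):
#   table(p, size(cells, 2), [200.0, 80.0], [0, 2], size(cells, 1), cells, true)
```

Validating row lengths up front keeps a ragged input from surfacing later as a native-side error.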
Commit
done(p::PageBuilder) # commit this page's buffered operations to its parent builder (consumes the handle)
Value Types
Immutable structs returned by extraction. Bbox has fields x, y, width, height (PDF user-space units).
| Type | Fields |
| --- | --- |
| `Bbox` | x, y, width, height (Float64) |
| `Char` | character::UInt32, bbox::Bbox, font_name::String, font_size::Float64 |
| `Word` | text, bbox, font_name, font_size, bold |
| `TextLine` | text, bbox, word_count |
| `Table` | row_count, col_count, has_header, cells (use cell(t, row, col) for a 0-based cell) |
| `Font` | name, type, encoding, embedded, subset |
| `Image` | width, height, bitsPerComponent, format, colorspace, data |
| `Annotation` | type, subtype, content, author, rect::Bbox, borderWidth |
| `Path` | bbox::Bbox, strokeWidth, hasStroke, hasFill, operationCount |
| `SearchResult` | text, page, bbox::Bbox |
| `FormField` | name, value, type, readonly, required |
| `PdfVersion` | major::Int, minor::Int |
cell(t::Table, row::Integer, col::Integer) -> String
form_field_name(f::FormField) -> String
form_field_value(f::FormField) -> String
form_field_type(f::FormField) -> String
form_field_is_readonly(f::FormField) -> Bool
form_field_is_required(f::FormField) -> Bool
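Because the value types expose plain fields, region filtering can be written against field names alone. The sketch below is illustrative: box_contains is a hypothetical helper, the NamedTuples stand in for Bbox and Word values so the block runs without the native library, and it assumes a box spans x to x + width and y to y + height.

```julia
# Hypothetical helper (not part of PdfOxide.jl): true when `inner` lies fully
# inside `outer`. Works on anything with x, y, width and height fields.
box_contains(outer, inner) =
    inner.x >= outer.x && inner.y >= outer.y &&
    inner.x + inner.width <= outer.x + outer.width &&
    inner.y + inner.height <= outer.y + outer.height

# Stand-in data shaped like the Bbox and Word fields listed above.
region = (x = 50.0, y = 700.0, width = 300.0, height = 60.0)
sample_words = [
    (text = "Quarterly", bbox = (x = 72.0, y = 720.0, width = 80.0, height = 18.0)),
    (text = "footer", bbox = (x = 72.0, y = 40.0, width = 40.0, height = 10.0)),
]
in_region = [w.text for w in sample_words if box_contains(region, w.bbox)]

# With an open document `doc`, the same predicate would apply to real words:
#   filter(w -> box_contains(region, w.bbox), extract_words(doc, 0))
```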
ElementList
element_count(l::ElementList) -> Int
element_type(l::ElementList, index::Integer) -> String
element_text(l::ElementList, index::Integer) -> String
element_rect(l::ElementList, index::Integer) -> Bbox
elements_to_json(l::ElementList) -> String
close!(l::ElementList)
Digital Signatures
Certificate
certificate_load_from_bytes(data::AbstractVector{UInt8}, password::AbstractString = "") -> Certificate
certificate_load_from_pem(cert_pem::AbstractString, key_pem::AbstractString) -> Certificate
certificate_get_subject(c::Certificate) -> String
certificate_get_issuer(c::Certificate) -> String
certificate_get_serial(c::Certificate) -> String
certificate_get_validity(c::Certificate) -> Tuple{Int,Int}
certificate_is_valid(c::Certificate) -> Bool
| Function | Description |
| --- | --- |
| `certificate_load_from_bytes(data, password="")` | Load signing credentials from PKCS#12 / PFX bytes. |
| `certificate_load_from_pem(cert_pem, key_pem)` | Load from PEM-encoded certificate + private key. |
| `certificate_get_subject(c)` / `certificate_get_issuer(c)` / `certificate_get_serial(c)` | Subject / issuer / serial strings. |
| `certificate_get_validity(c)` | Validity window as (not_before, not_after) Unix epoch seconds. |
| `certificate_is_valid(c)` | Whether the certificate is currently valid. |
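The validity window is returned as raw epoch seconds, so it usually needs converting before display or comparison. A minimal sketch using the Dates standard library follows; validity_window and days_until_expiry are hypothetical helper names, not PdfOxide.jl functions.

```julia
using Dates

# Hypothetical helpers (not part of PdfOxide.jl). unix2datetime returns UTC.
validity_window(v::Tuple{<:Integer,<:Integer}) =
    (unix2datetime(v[1]), unix2datetime(v[2]))

# Whole calendar days between `moment` and the not_after date (negative once expired).
function days_until_expiry(v::Tuple{<:Integer,<:Integer}, moment::DateTime)
    _, not_after = validity_window(v)
    return Dates.value(Date(not_after) - Date(moment))
end

sample = (0, 10 * 86_400)   # stand-in for certificate_get_validity(c)
not_before, not_after = validity_window(sample)

# With a loaded certificate `c`:
#   days_until_expiry(certificate_get_validity(c), now(UTC))
```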
Signing
sign_bytes(pdf::AbstractVector{UInt8}, cert::Certificate, reason::AbstractString, location::AbstractString) -> Vector{UInt8}
sign_bytes_pades(pdf::AbstractVector{UInt8}, cert::Certificate, level::Integer, tsa_url::Union{Nothing,AbstractString}, reason::AbstractString, location::AbstractString; certs = Vector{UInt8}[], crls = Vector{UInt8}[], ocsps = Vector{UInt8}[]) -> Vector{UInt8}
sign_bytes_pades_opts(pdf::AbstractVector{UInt8}, cert::Certificate, level::Integer, tsa_url, reason, location; certs = Vector{UInt8}[], crls = Vector{UInt8}[], ocsps = Vector{UInt8}[]) -> Vector{UInt8}
add_timestamp(pdf_data::AbstractVector{UInt8}, sig_index::Integer, tsa_url::AbstractString) -> Vector{UInt8}
| Function | Description |
| --- | --- |
| `sign_bytes(pdf, cert, reason, location)` | Sign raw PDF bytes; returns the signed PDF. |
| `sign_bytes_pades(pdf, cert, level, tsa_url, reason, location; certs, crls, ocsps)` | PAdES baseline signing (level: 0=B-B, 1=B-T, 2=B-LT; tsa_url required for level ≥ 1). |
| `sign_bytes_pades_opts(...)` | Struct-options variant of sign_bytes_pades (identical behavior, builds PadesSignOptionsC). |
| `add_timestamp(pdf_data, sig_index, tsa_url)` | Add an RFC 3161 timestamp to a signature; returns the timestamped PDF. |
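The level codes and the tsa_url requirement can be checked before any native call is made. The sketch below is one possible pre-flight check; the PADES_* constants and check_pades_args are hypothetical names introduced here, mirroring only the codes listed in the table.

```julia
# Hypothetical constants and helper (not part of PdfOxide.jl), mirroring the
# level codes in the table above.
const PADES_B_B = 0
const PADES_B_T = 1
const PADES_B_LT = 2

function check_pades_args(level::Integer, tsa_url::Union{Nothing,AbstractString})
    level in (PADES_B_B, PADES_B_T, PADES_B_LT) ||
        throw(ArgumentError("unknown PAdES level code $level"))
    # Levels B-T and B-LT need a timestamp authority.
    if level >= PADES_B_T && (tsa_url === nothing || isempty(tsa_url))
        throw(ArgumentError("tsa_url is required for PAdES level $level"))
    end
    return nothing
end

check_pades_args(PADES_B_B, nothing)                        # passes
check_pades_args(PADES_B_T, "https://tsa.example.invalid")  # passes (placeholder URL)

# Then, with `pdf` bytes and a Certificate `cert` in scope:
#   sign_bytes_pades(pdf, cert, PADES_B_T, tsa_url, "Approved", "Berlin")
```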
SignatureInfo
signature_get_signer_name(s::SignatureInfo) -> String
signature_get_signing_reason(s::SignatureInfo) -> String
signature_get_signing_location(s::SignatureInfo) -> String
signature_get_signing_time(s::SignatureInfo) -> Int
signature_get_certificate(s::SignatureInfo) -> Certificate
signature_get_pades_level(s::SignatureInfo) -> Int
signature_has_timestamp(s::SignatureInfo) -> Bool
signature_get_timestamp(s::SignatureInfo) -> Timestamp
signature_add_timestamp(s::SignatureInfo, ts) -> Bool
signature_verify(s::SignatureInfo) -> Int
signature_verify_detached(s::SignatureInfo, pdf::AbstractVector{UInt8}) -> Int
| Function | Description |
| --- | --- |
| `signature_get_signer_name(s)` / `signature_get_signing_reason(s)` / `signature_get_signing_location(s)` | Signer name / reason / location. |
| `signature_get_signing_time(s)` | Signing time (Unix epoch seconds). |
| `signature_get_certificate(s)` | Signer's Certificate (owned). |
| `signature_get_pades_level(s)` | PAdES level code. |
| `signature_has_timestamp(s)` | Whether an embedded RFC 3161 timestamp is present. |
| `signature_get_timestamp(s)` | The embedded Timestamp (owned). |
| `signature_add_timestamp(s, ts)` | Attach a Timestamp; returns true on success. |
| `signature_verify(s)` | Signer-attributes crypto check (1=valid, 0=invalid, -1=unknown). |
| `signature_verify_detached(s, pdf)` | End-to-end verify against full PDF bytes (1/0/-1). |
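The two verify functions return tri-state integer codes rather than Bool, so callers have to handle the unknown case explicitly. A minimal sketch of one way to do that follows; verify_status and is_trusted are hypothetical helpers, and treating every code other than 1 or 0 as unknown is an assumption.

```julia
# Hypothetical helpers (not part of PdfOxide.jl).
function verify_status(code::Integer)
    code == 1 && return :valid
    code == 0 && return :invalid
    return :unknown   # -1 per the table; other codes are treated alike (assumption)
end

# Example policy: trust a signature only when both checks report valid.
is_trusted(attr_code::Integer, detached_code::Integer) =
    verify_status(attr_code) === :valid && verify_status(detached_code) === :valid

# With a SignatureInfo `s` and the full PDF bytes `pdf` in scope:
#   is_trusted(signature_verify(s), signature_verify_detached(s, pdf))
```

Mapping unknown to "not trusted" fails closed, which is usually the safer default for signature checks.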
Timestamp
timestamp_parse(data::AbstractVector{UInt8}) -> Timestamp
timestamp_get_token(t::Timestamp) -> Vector{UInt8}
timestamp_get_message_imprint(t::Timestamp) -> Vector{UInt8}
timestamp_get_time(t::Timestamp) -> Int
timestamp_get_serial(t::Timestamp) -> String
timestamp_get_tsa_name(t::Timestamp) -> String
timestamp_get_policy_oid(t::Timestamp) -> String
timestamp_get_hash_algorithm(t::Timestamp) -> Int
timestamp_verify(t::Timestamp) -> Bool
| Function | Description |
| --- | --- |
| `timestamp_parse(data)` | Parse a DER RFC 3161 TimeStampToken (or bare TSTInfo). |
| `timestamp_get_token(t)` | The raw token bytes. |
| `timestamp_get_message_imprint(t)` | The message-imprint digest bytes. |
| `timestamp_get_time(t)` | Timestamp time (Unix epoch seconds). |
| `timestamp_get_serial(t)` | Serial number string. |
| `timestamp_get_tsa_name(t)` | TSA name. |
| `timestamp_get_policy_oid(t)` | Policy OID. |
| `timestamp_get_hash_algorithm(t)` | Digest algorithm code. |
| `timestamp_verify(t)` | Whether the token verifies. |
TsaClient
tsa_client_create(url::AbstractString; username = nothing, password = nothing, timeout::Integer = 30, hash_algo::Integer = 0, use_nonce::Bool = true, cert_req::Bool = true) -> TsaClient
tsa_request_timestamp(t::TsaClient, data::AbstractVector{UInt8}) -> Timestamp
tsa_request_timestamp_hash(t::TsaClient, hash::AbstractVector{UInt8}, hash_algo::Integer) -> Timestamp
| Function | Description |
| --- | --- |
| `tsa_client_create(url; …)` | Create an RFC 3161 TSA client (optional basic-auth, timeout, hash algo, nonce, cert-req). |
| `tsa_request_timestamp(t, data)` | Request a timestamp over data (network I/O). |
| `tsa_request_timestamp_hash(t, hash, hash_algo)` | Request a timestamp over a precomputed digest. |
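For the precomputed-digest variant, the digest can be produced with Julia's SHA standard library. The sketch below assumes SHA-256 purely for illustration; which hash_algo code corresponds to SHA-256 is not listed in this section and has to be taken from the enum documentation.

```julia
using SHA

# Sample payload standing in for the bytes to be timestamped.
payload = Vector{UInt8}("abc")

# sha256 returns the raw 32-byte digest as a Vector{UInt8}.
digest = sha256(payload)
println(bytes2hex(digest))

# With a TsaClient `client` in scope, and `hash_algo` set to the code that
# denotes SHA-256 (not shown in this section):
#   ts = tsa_request_timestamp_hash(client, digest, hash_algo)
```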
Dss (Document Security Store)
dss_cert_count(d::Dss) -> Int
dss_crl_count(d::Dss) -> Int
dss_ocsp_count(d::Dss) -> Int
dss_vri_count(d::Dss) -> Int
dss_get_cert(d::Dss, index::Integer) -> Vector{UInt8}
dss_get_crl(d::Dss, index::Integer) -> Vector{UInt8}
dss_get_ocsp(d::Dss, index::Integer) -> Vector{UInt8}
| Function | Description |
| --- | --- |
| `dss_cert_count(d)` / `dss_crl_count(d)` / `dss_ocsp_count(d)` / `dss_vri_count(d)` | Counts of certificates / CRLs / OCSP responses / VRI entries. |
| `dss_get_cert(d, index)` / `dss_get_crl(d, index)` / `dss_get_ocsp(d, index)` | The index-th certificate / CRL / OCSP response bytes. |
Validation Results
validate_pdf_a / validate_pdf_ua / validate_pdf_x return PdfAResults / UaResults / PdfXResults, respectively.
| Function | Returns | Description |
| --- | --- | --- |
| `is_compliant(r::PdfAResults)` | Bool | Whether the document is PDF/A compliant. |
| `is_compliant(r::PdfXResults)` | Bool | Whether the document is PDF/X compliant. |
| `is_accessible(r::UaResults)` | Bool | Whether the document is PDF/UA accessible. |
| `errors(r)` | Vector{String} | Error messages (PdfAResults / UaResults / PdfXResults). |
| `warnings(r)` | Vector{String} | Warning messages. |
| `ua_stats(r::UaResults)` | NamedTuple | Accessibility element counts (structure, images, tables, forms, annotations, pages). |
| `pdf_a_error_count(r)` / `pdf_a_warning_count(r)` | Int | PDF/A error / warning counts. |
| `pdf_ua_error_count(r)` / `pdf_ua_warning_count(r)` | Int | PDF/UA error / warning counts. |
| `pdf_x_error_count(r)` | Int | PDF/X error count. |
Barcodes
generate_qr_code(data::AbstractString, error_correction::Integer = 0, size_px::Integer = 256) -> Barcode
generate_barcode(data::AbstractString, format::Integer = 0, size_px::Integer = 256) -> Barcode
barcode_get_data(b::Barcode) -> String
barcode_get_format(b::Barcode) -> Int
barcode_get_confidence(b::Barcode) -> Float64
barcode_get_image_png(b::Barcode, size_px::Integer = 256) -> Vector{UInt8}
barcode_get_svg(b::Barcode, size_px::Integer = 256) -> String
| Function | Description |
| --- | --- |
| `generate_qr_code(data, error_correction=0, size_px=256)` | Generate a QR code. |
| `generate_barcode(data, format=0, size_px=256)` | Generate a 1D/2D barcode. |
| `barcode_get_data(b)` | The decoded/encoded payload string. |
| `barcode_get_format(b)` | The format code. |
| `barcode_get_confidence(b)` | Decode confidence (0.0–1.0). |
| `barcode_get_image_png(b, size_px=256)` | Render to PNG bytes. |
| `barcode_get_svg(b, size_px=256)` | Render to an SVG string. |
Stamp a barcode onto an editor page with add_barcode_to_page(e, page, b, x, y, width, height).
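PNG bytes returned by barcode_get_image_png can be written to disk with Base.write. The sketch below adds a sanity check on the standard 8-byte PNG signature first; looks_like_png and write_png are hypothetical helpers, not PdfOxide.jl functions.

```julia
# Hypothetical helpers (not part of PdfOxide.jl).
# Every PNG file starts with this fixed 8-byte signature.
const PNG_MAGIC = UInt8[0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]

looks_like_png(bytes::AbstractVector{UInt8}) =
    length(bytes) >= 8 && bytes[begin:begin+7] == PNG_MAGIC

# Write the bytes to `path` and return the number of bytes written.
function write_png(path::AbstractString, bytes::AbstractVector{UInt8})
    looks_like_png(bytes) || throw(ArgumentError("data does not start with the PNG signature"))
    return write(path, bytes)
end

# With a Barcode `b` in scope:
#   write_png("qr.png", barcode_get_image_png(b, 512))
```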
OCR
OcrEngine
ocr_engine_create(det_model_path::AbstractString, rec_model_path::AbstractString, dict_path::AbstractString) -> OcrEngine
Create an OCR engine from detection model, recognition model, and dictionary file paths. Use it with ocr_extract_text(doc, page, engine) and page_needs_ocr(doc, page) (see PdfDocument › OCR). close!(o::OcrEngine) frees it.
Model Prefetch
prefetch_models(languages_csv::AbstractString) -> String
prefetch_available() -> Int
model_manifest() -> String
| Function | Description |
| --- | --- |
| `prefetch_models(languages_csv)` | Prefetch OCR/layout models for comma-separated languages. |
| `prefetch_available()` | Whether model prefetch is available. |
| `model_manifest()` | The bundled model manifest (JSON/string). |
Global Configuration & Crypto
set_log_level(level::Integer) # 0=Off 1=Error 2=Warn 3=Info 4=Debug 5=Trace
get_log_level() -> Int
set_max_ops_per_stream(limit::Integer) -> Int
set_preserve_unmapped_glyphs(preserve::Integer) -> Int
| Function | Description |
| --- | --- |
| `set_log_level(level)` | Set the global log level (0–5). |
| `get_log_level()` | Get the current global log level. |
| `set_max_ops_per_stream(limit)` | Set the per-stream content-op limit; returns the previous limit. |
| `set_preserve_unmapped_glyphs(preserve)` | Toggle preservation of unmapped glyphs; returns the previous setting. |
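Setters that return the previous value lend themselves to a scoped "set, run, restore" pattern. The sketch below shows that pattern in plain Julia; with_setting is a hypothetical helper, and fake_set_limit is a stand-in with the same return-the-previous-value contract so the block runs without the native library.

```julia
# Hypothetical helper (not part of PdfOxide.jl): apply a setting for the
# duration of `f`, then restore whatever value the setter reported as previous.
function with_setting(f::Function, setter::Function, value)
    previous = setter(value)
    try
        return f()
    finally
        setter(previous)   # restore even if `f` throws
    end
end

# Stand-in setter that returns the value it replaced.
const _limit = Ref(1_000)
function fake_set_limit(new_limit::Integer)
    old = _limit[]
    _limit[] = new_limit
    return old
end

inside = with_setting(fake_set_limit, 50_000) do
    _limit[]   # value observed while the override is active
end

# With the library loaded, the same shape would apply to a real setter:
#   with_setting(set_max_ops_per_stream, 5_000_000) do
#       extract_text(doc, 0)
#   end
```

These settings are global, so an override like this affects every document processed while the block runs.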
Crypto Policy & Inventory
crypto_active_provider() -> String
crypto_cbom() -> String
crypto_inventory() -> String
crypto_policy() -> String
crypto_set_policy(spec::AbstractString) -> Int
crypto_fips_available() -> Int
crypto_use_fips() -> Int
| Function | Description |
| --- | --- |
| `crypto_active_provider()` | The active crypto provider name. |
| `crypto_cbom()` | Cryptographic Bill of Materials (JSON). |
| `crypto_inventory()` | Crypto algorithm inventory (JSON). |
| `crypto_policy()` | The active crypto policy. |
| `crypto_set_policy(spec)` | Set the active crypto policy from spec; returns a status code. |
| `crypto_fips_available()` | Whether a FIPS provider is available. |
| `crypto_use_fips()` | Whether FIPS mode is active. |
Error Handling
Every non-success C-ABI status throws a PdfOxideError:
using PdfOxide
try
    doc = open_document("file.pdf")
    try
        text = extract_text(doc, 0)
    finally
        # Free the native handle eagerly, even if extraction throws.
        close!(doc)
    end
catch e
    if e isa PdfOxideError
        @warn "PDF error" code=e.code op=e.op
    else
        rethrow()
    end
end
PdfOxideError <: Exception carries the numeric C-ABI code and the failing operation name.
Complete Example
using PdfOxide
# --- Creation ---
doc_bytes = let b = DocumentBuilder()
    set_title(b, "Report")
    p = letter_page(b)
    font(p, "Helvetica", 18)
    at(p, 72, 720)
    heading(p, 1, "Quarterly Report")
    paragraph(p, "Generated by PDF Oxide.")
    done(p)
    out = build(b)
    close!(b)
    out
end
# --- Extraction ---
doc = open_from_bytes(doc_bytes)
println("Pages: ", page_count(doc))
for i in 0:page_count(doc)-1
    println("Page ", i + 1, ": ", length(extract_text(doc, i)), " chars")
end
# Words + tables
words = extract_words(doc, 0)
tables = extract_tables(doc, 0)
# Search
for hit in search_all(doc, "report", false)
    println("Found on page ", hit.page, " at ", hit.bbox)
end
close!(doc)
# --- Editing ---
e = open_editor("input.pdf")
rotate_all_pages(e, 90)
set_form_field_value(e, "name", "Jane Doe")
flatten_forms(e)
save(e, "output.pdf")
close!(e)
# --- Rendering ---
doc = open_document("input.pdf")
img = render_page(doc, 0) # PNG
save(img, "page0.png")
close!(img)
close!(doc)
Other Language Bindings
PDF Oxide ships native bindings for every major ecosystem: Rust, Python, Node.js, WASM, C#, Golang, Java, PHP, Ruby, C++, Swift, Kotlin, Dart, R, Zig, Scala, Clojure, Objective-C, and Elixir.
Next Steps