What is the fastest Python PDF library?

PDF Oxide is the fastest Python PDF library, with 0.8ms mean text extraction time — 5.8× faster than PyMuPDF (4.6ms) and 15× faster than pypdf (12.1ms). Benchmarked on 3,830 real-world PDFs with 100% pass rate.

Is PDF Oxide free for commercial use?

Yes. PDF Oxide is MIT licensed — free for all uses including commercial products, SaaS, and proprietary software. No license fees, no sales calls, no AGPL restrictions.

Can PDF Oxide handle scanned PDFs with OCR?

Yes. PDF Oxide includes built-in OCR via PaddleOCR and ONNX Runtime. No Tesseract installation needed — just pip install pdf_oxide and use extract_text_ocr(). Supports PP-OCRv3, v4, and v5 models.

Does PDF Oxide support XFA forms?

Yes. PDF Oxide is the only Python PDF library that can detect, analyze, and extract data from XFA forms (XML Forms Architecture). PyMuPDF, pypdf, pdfplumber, and pdfminer cannot read XFA form data.

How does PDF Oxide compare to PyMuPDF?

PDF Oxide is 5.8× faster than PyMuPDF (0.8ms vs 4.6ms mean), has a 100% pass rate vs 99.3%, and is MIT licensed vs PyMuPDF's AGPL-3.0. PDF Oxide also has built-in Markdown/HTML output and XFA form support that PyMuPDF lacks.

Can PDF Oxide convert PDF to Markdown?

Yes. PDF Oxide has built-in PDF to Markdown conversion with heading detection, table preservation, and list formatting — ideal for LLM and RAG pipelines. No separate package needed, unlike PyMuPDF which requires pymupdf4llm (69× slower).

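Julia API Reference

PDF Oxide provides idiomatic Julia bindings (PdfOxide.jl), implemented directly on top of the pdf_oxide C ABI via ccall, with no shim crate in between. Native handles are wrapped in mutable structs with finalizers, C strings and buffers are copied to the Julia side and then freed automatically, and a PdfOxideError is thrown whenever the C ABI returns a non-success status. Page indices are 0-based.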

using Pkg
Pkg.add("PdfOxide")

The native libpdf_oxide cdylib is loaded at runtime and resolved in this order: PDF_OXIDE_LIB_PATH → PDF_OXIDE_LIB_DIR → ../target/release → target/release → system loader. Build it with the feature set the bindings expect:

cargo build --release --lib --features ocr,rendering,signatures,barcodes,tsa-client,system-fonts
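The lookup order described above can be sketched in plain Julia. This is only an illustration, not the code PdfOxide.jl actually runs: the helper name `candidate_library_paths`, the default file name built from `Libdl.dlext`, and the choice to leave the two `target/release` entries relative are all assumptions.

```julia
using Libdl  # stdlib; dlext is the platform's shared-library extension

# Hypothetical helper, not part of PdfOxide.jl: list the locations that would
# be tried, in the documented order. `env` defaults to the process environment.
function candidate_library_paths(env = ENV; libname = "libpdf_oxide." * Libdl.dlext)
    candidates = String[]
    # 1. Explicit path to the library file.
    haskey(env, "PDF_OXIDE_LIB_PATH") && push!(candidates, env["PDF_OXIDE_LIB_PATH"])
    # 2. Directory that contains the library.
    haskey(env, "PDF_OXIDE_LIB_DIR") &&
        push!(candidates, joinpath(env["PDF_OXIDE_LIB_DIR"], libname))
    # 3-4. Cargo output directories (the reference does not say what base
    #      directory these are relative to, so they stay relative here).
    push!(candidates, joinpath("..", "target", "release", libname))
    push!(candidates, joinpath("target", "release", libname))
    # 5. Bare name: resolution is left to the system loader.
    push!(candidates, libname)
    return candidates
end
```

Calling `filter(isfile, candidate_library_paths())` is one way to check which build output would be found first on a given machine.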

using PdfOxide

pdf = from_markdown("# Hello\n\nbody\n")
doc = open_from_bytes(to_bytes(pdf))
page_count(doc)
extract_text(doc, 0)        # 0-based page index
to_markdown_all(doc)
close!(doc)

For the Rust API, see the Rust API Reference; for the Python API, see the Python API Reference. For type details, see Types and Enums.

Most handle types (PdfDocument, Pdf, DocumentEditor, DocumentBuilder, RenderedImage, Certificate, SignatureInfo, Timestamp, TsaClient, Dss, the various validation results, Barcode, OcrEngine, Renderer, ElementList, EmbeddedFont, PageBuilder) own native memory. Release them promptly with close!(x); the call is idempotent and also runs automatically at finalization.
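Because finalizers run at an unspecified time, it can help to tie the release of a handle to a scope. The sketch below shows one way to do that with `try`/`finally`; `with_handle` is a hypothetical helper, not something PdfOxide.jl is known to export.

```julia
# Hypothetical helper, not part of PdfOxide.jl: run `f` on a handle and
# always call `release` afterwards, even if `f` throws.
function with_handle(f, handle, release)
    try
        return f(handle)
    finally
        release(handle)
    end
end

# Intended use with the documented API (requires PdfOxide to be installed):
#   n = with_handle(open_document("report.pdf"), close!) do doc
#       page_count(doc)
#   end
```

Since close! is documented as idempotent, a later finalizer run on the same handle should be harmless.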

PdfDocument

A read-side handle for opening an existing PDF to extract, search, render, validate, and inspect it.

Opening

open_document(path::AbstractString; password::Union{Nothing,AbstractString} = nothing) -> PdfDocument
open_from_bytes(data::AbstractVector{UInt8}) -> PdfDocument
open_with_password(path::AbstractString, password::AbstractString) -> PdfDocument

Function	Description
`open_document(path; password=nothing)`	Opens a PDF from a filesystem path (password-protected files are supported).
`open_from_bytes(data)`	Opens a PDF from an in-memory byte vector.
`open_with_password(path, password)`	Opens an encrypted PDF on disk with a password.

Document information

Function	Returns	Description
`page_count(d)`	`Int`	Number of pages.
`version(d)`	`PdfVersion`	PDF version (`major`/`minor`).
`is_encrypted(d)`	`Bool`	Whether the document is encrypted.
`has_structure_tree(d)`	`Bool`	Whether the document is a tagged PDF with a structure tree.
`has_xfa(d)`	`Bool`	Whether the document contains an XFA form.
`authenticate(d, password)`	`Bool`	Checks a password against an encrypted document (a wrong password returns `false` rather than raising an error).

Whole-document conversion

Function	Returns	Description
`to_markdown_all(d)`	`String`	Markdown for the whole document.
`to_html_all(d)`	`String`	HTML for the whole document.
`to_plain_text_all(d)`	`String`	Plain text for the whole document.
`extract_all_text(d)`	`String`	Automatic text extraction for the whole document.
`classify_document(d)`	`String`	Classifies the whole document and returns the classifier's JSON string.

Per-page text extraction

extract_text(d::PdfDocument, page::Integer) -> String
to_plain_text(d::PdfDocument, page::Integer) -> String
to_markdown(d::PdfDocument, page::Integer) -> String
to_html(d::PdfDocument, page::Integer) -> String
extract_structured_json(d::PdfDocument, page::Integer) -> String

Function	Description
`extract_text(d, page)`	Extracts plain text from a 0-based page.
`to_plain_text(d, page)`	Renders the page as layout-aware plain text.
`to_markdown(d, page)`	Renders the page as Markdown.
`to_html(d, page)`	Renders the page as HTML.
`extract_structured_json(d, page)`	Extracts the page's structured document JSON model.
`extract_text_auto(d, page)`	Automatically picks the best text extraction method for the page.
`extract_page_auto(d, page, options="{}")`	Automatic page extraction driven by a JSON options string.
`classify_page(d, page)`	Classifies the page and returns the classifier's JSON string.

Structured element extraction

extract_chars(d::PdfDocument, page::Integer)      -> Vector{Char}
extract_words(d::PdfDocument, page::Integer)      -> Vector{Word}
extract_text_lines(d::PdfDocument, page::Integer) -> Vector{TextLine}
extract_tables(d::PdfDocument, page::Integer)     -> Vector{Table}
embedded_fonts(d::PdfDocument, page::Integer)     -> Vector{Font}
embedded_images(d::PdfDocument, page::Integer)    -> Vector{Image}
page_annotations(d::PdfDocument, page::Integer)   -> Vector{Annotation}
extract_paths(d::PdfDocument, page::Integer)      -> Vector{Path}

Function	Description
`extract_chars(d, page)`	Extracts the page's glyphs as a `Vector{Char}`.
`extract_words(d, page)`	Extracts the page's words as a `Vector{Word}`.
`extract_text_lines(d, page)`	Extracts text lines as a `Vector{TextLine}`.
`extract_tables(d, page)`	Extracts tables as a `Vector{Table}`.
`embedded_fonts(d, page)`	Returns embedded fonts as a `Vector{Font}`.
`embedded_images(d, page)`	Returns embedded images as a `Vector{Image}`.
`page_annotations(d, page)`	Returns annotations as a `Vector{Annotation}`.
`extract_paths(d, page)`	Returns vector paths as a `Vector{Path}`.

Region-scoped extraction

extract_text_in_rect(d::PdfDocument, page::Integer, x, y, w, h)   -> String
extract_words_in_rect(d::PdfDocument, page::Integer, x, y, w, h)  -> Vector{Word}
extract_lines_in_rect(d::PdfDocument, page::Integer, x, y, w, h)  -> Vector{TextLine}
extract_tables_in_rect(d::PdfDocument, page::Integer, x, y, w, h) -> Vector{Table}
extract_images_in_rect(d::PdfDocument, page::Integer, x, y, w, h) -> Vector{Image}

Each of these restricts extraction to the rectangle (x, y, w, h), given in PDF user-space points, on a 0-based page.
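Rectangles are expressed in points, and default PDF user space uses 72 units per inch. A small conversion layer, sketched below, lets you think in millimetres or inches instead. The helper names are hypothetical, and the sketch deliberately says nothing about where the rectangle's origin sits, since the reference does not state it.

```julia
# Default PDF user space has 72 units per inch, so 1 pt = 1/72 in.
const POINTS_PER_INCH = 72.0
const MM_PER_INCH = 25.4

inch_to_pt(inches::Real) = inches * POINTS_PER_INCH
mm_to_pt(mm::Real) = mm / MM_PER_INCH * POINTS_PER_INCH

# Hypothetical helper: an (x, y, w, h) tuple in points from millimetre inputs.
rect_mm(x, y, w, h) = map(mm_to_pt, (x, y, w, h))

# Intended use with the documented API (requires PdfOxide to be installed):
#   extract_text_in_rect(doc, 0, rect_mm(20, 250, 170, 30)...)
```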

Search

search(d::PdfDocument, page::Integer, term::AbstractString, caseSensitive::Bool) -> Vector{SearchResult}
search_all(d::PdfDocument, term::AbstractString, caseSensitive::Bool)             -> Vector{SearchResult}
search_results_to_json(d::PdfDocument, term::AbstractString, caseSensitive::Bool) -> String

Function	Description
`search(d, page, term, caseSensitive)`	Searches for `term` within a single page.
`search_all(d, term, caseSensitive)`	Searches for `term` across the whole document.
`search_results_to_json(d, term, caseSensitive)`	Serializes whole-document search results to JSON.

Page geometry

Function	Returns	Description
`page_get_width(d, page)`	`Float64`	Page width (in PDF points).
`page_get_height(d, page)`	`Float64`	Page height (in PDF points).
`page_get_rotation(d, page)`	`Int`	Absolute page rotation (degrees).
`page(d, index)`	`PdfPage`	A 0-based page view over the document (see PdfPage).

Document structure and metadata

Function	Returns	Description
`get_outline(d)`	`String`	Bookmarks / table of contents (JSON).
`get_page_labels(d)`	`String`	Page label ranges (JSON).
`get_xmp_metadata(d)`	`String`	XMP metadata (JSON).
`get_source_bytes(d)`	`Vector{UInt8}`	The document's original source bytes.
`plan_split_by_bookmarks(d, options="{}")`	`String`	Plans a split-by-bookmark operation (JSON).

Inspecting annotations

annotations_to_json(d::PdfDocument, page::Integer) -> String
annotation_get_color(d::PdfDocument, page::Integer, index::Integer) -> UInt32
annotation_creation_date(d::PdfDocument, page::Integer, index::Integer) -> String
annotation_modification_date(d::PdfDocument, page::Integer, index::Integer) -> String

Function	Returns	Description
`annotations_to_json(d, page)`	`String`	All annotations on the page as JSON.
`annotation_get_color(d, page, index)`	`UInt32`	The annotation's packed 32-bit RGBA color.
`annotation_creation_date(d, page, index)`	`String`	The annotation's creation date.
`annotation_modification_date(d, page, index)`	`String`	The annotation's modification date.
`annotation_is_hidden(d, page, index)`	`Bool`	Whether the annotation is hidden.
`annotation_is_marked_deleted(d, page, index)`	`Bool`	Whether the annotation is marked as deleted.
`annotation_is_printable(d, page, index)`	`Bool`	Whether the annotation is printable.
`annotation_is_read_only(d, page, index)`	`Bool`	Whether the annotation is read-only.
`link_annotation_uri(d, page, index)`	`String`	URI of a Link annotation.
`text_annotation_icon_name(d, page, index)`	`String`	Icon name of a Text (sticky note) annotation.
`highlight_quad_points_count(d, page, index)`	`Int`	Number of quad points on a Highlight annotation.
`highlight_quad_point(d, page, index, quad_index)`	`NTuple{8,Float64}`	The `quad_index`-th quad (8 floats).
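A quad is often easier to work with as an axis-aligned rectangle. The sketch below reduces the 8 floats to `(x, y, width, height)`. It assumes the tuple is laid out as `(x1, y1, x2, y2, x3, y3, x4, y4)`, which is how PDF stores /QuadPoints; the reference does not confirm that the binding keeps this layout, and `quad_bounds` is a hypothetical helper.

```julia
# Assumption: the 8 values are (x1, y1, x2, y2, x3, y3, x4, y4), the corner
# layout PDF uses for /QuadPoints. Returns (x, y, width, height).
function quad_bounds(q::NTuple{8,<:Real})
    xs = q[1:2:7]   # odd positions: x coordinates
    ys = q[2:2:8]   # even positions: y coordinates
    xmin, xmax = extrema(xs)
    ymin, ymax = extrema(ys)
    return (xmin, ymin, xmax - xmin, ymax - ymin)
end

# Intended use with the documented API (requires PdfOxide to be installed):
#   quad_bounds(highlight_quad_point(doc, page, index, quad_index))
```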

Font and element JSON helpers

Function	Returns	Description
`fonts_to_json(d, page)`	`String`	Embedded fonts on the page as JSON.
`font_size(d, page, index)`	`Float64`	Size of the `index`-th embedded font.
`page_get_elements(d, page)`	`ElementList`	The page's layout region elements as an `ElementList`.

Form fields

get_form_fields(d::PdfDocument) -> Vector{FormField}
form_field_count(d::PdfDocument) -> Int
export_form_data_to_bytes(d::PdfDocument, format_type::Integer) -> Vector{UInt8}
import_form_data(d::PdfDocument, path::AbstractString) -> Int
form_import_from_file(d::PdfDocument, filename::AbstractString) -> Bool

Function	Description
`get_form_fields(d)`	Returns all AcroForm fields as a `Vector{FormField}`.
`form_field_count(d)`	Convenience function that returns the number of fields.
`export_form_data_to_bytes(d, format_type)`	Exports form data (`format_type` selects FDF or XFDF).
`import_form_data(d, path)`	Imports form data from a file and returns the C status code.
`form_import_from_file(d, filename)`	Imports form data and returns `true` on success.

OCR

page_needs_ocr(d::PdfDocument, page::Integer) -> Bool
ocr_extract_text(d::PdfDocument, page::Integer, engine::Union{Nothing,OcrEngine} = nothing) -> String

Function	Description
`page_needs_ocr(d, page)`	Whether the page is scanned or hybrid and needs OCR.
`ocr_extract_text(d, page, engine=nothing)`	Extracts text via OCR (with `nothing`, it falls back to native extraction only). See OcrEngine.

Signatures (document level)

sign(d::PdfDocument, cert::Certificate; reason::AbstractString = "", location::AbstractString = "") -> Int
get_signature_count(d::PdfDocument) -> Int
get_signature(d::PdfDocument, index::Integer) -> SignatureInfo
verify_all_signatures(d::PdfDocument) -> Int
has_timestamp(d::PdfDocument) -> Int
document_get_dss(d::PdfDocument) -> Union{Dss,Nothing}

Function	Description
`sign(d, cert; reason, location)`	Signs the document with `cert` and returns a status code.
`get_signature_count(d)`	Number of signatures in the document.
`get_signature(d, index)`	Returns the `index`-th signature as a `SignatureInfo`.
`verify_all_signatures(d)`	Verifies all signatures and returns a status code.
`has_timestamp(d)`	Whether the document has a document-level timestamp.
`document_get_dss(d)`	Reads the document's `/DSS` as a `Dss` (`nothing` if absent).

Validation and conversion

validate_pdf_a(d::PdfDocument, level::Integer) -> PdfAResults
validate_pdf_ua(d::PdfDocument, level::Integer) -> UaResults
validate_pdf_x(d::PdfDocument, level::Integer) -> PdfXResults
document_convert_to_pdf_a(d::PdfDocument, level::Integer) -> Bool

validatePdfA / validatePdfUa / validatePdfX are camelCase aliases for the three validation functions. See Validation results.

Office conversion

to_docx(d::PdfDocument) -> Vector{UInt8}
to_pptx(d::PdfDocument) -> Vector{UInt8}
to_xlsx(d::PdfDocument) -> Vector{UInt8}

Converts a PDF to a DOCX / PPTX / XLSX byte buffer. To open an Office file as a PDF, use open_from_docx_bytes / open_from_pptx_bytes / open_from_xlsx_bytes (see Office input).

Lifecycle

Function	Description
`close!(d)`	Releases the native handle immediately (idempotent; also runs at finalization).

PdfPage

A lightweight 0-based page view returned by page(doc, index). Its methods delegate to the parent document.

p = page(doc, 0)
text(p)                       # -> String
markdown(p)                   # -> String
extract_words(p)              # -> Vector{Word}
search(p, "term", false)      # -> Vector{SearchResult}
render_page(p)                # -> RenderedImage

Method	Returns	Description
`text(p)`	`String`	Plain text of the page.
`markdown(p)`	`String`	Markdown of the page.
`html(p)`	`String`	HTML of the page.
`plain_text(p)`	`String`	Layout-aware plain text.
`extract_chars(p)`	`Vector{Char}`	Glyphs.
`extract_words(p)`	`Vector{Word}`	Words.
`extract_text_lines(p)`	`Vector{TextLine}`	Text lines.
`extract_tables(p)`	`Vector{Table}`	Tables.
`embedded_fonts(p)`	`Vector{Font}`	Embedded fonts.
`embedded_images(p)`	`Vector{Image}`	Embedded images.
`page_annotations(p)`	`Vector{Annotation}`	Annotations.
`extract_paths(p)`	`Vector{Path}`	Vector paths.
`search(p, term, caseSensitive)`	`Vector{SearchResult}`	Searches the page.
`render_page(p, format=0)`	`RenderedImage`	Renders the page (`0`=PNG).
`render_page_zoom(p, zoom, format=0)`	`RenderedImage`	Renders at a given zoom factor.
`render_page_thumbnail(p, size, format=0)`	`RenderedImage`	Renders a thumbnail that fits within `size` px.

Rendering

Renders pages of a PdfDocument to images. format: 0=PNG (default), 1=JPEG. Coordinates are in PDF user-space points.

render_page(d::PdfDocument, pageIndex::Integer, format::Integer = 0) -> RenderedImage
render_page_zoom(d::PdfDocument, pageIndex::Integer, zoom::Real, format::Integer = 0) -> RenderedImage
render_page_thumbnail(d::PdfDocument, pageIndex::Integer, size::Integer, format::Integer = 0) -> RenderedImage
render_page_region(d::PdfDocument, page::Integer, crop_x, crop_y, crop_width, crop_height, format::Integer = 0) -> RenderedImage
render_page_fit(d::PdfDocument, page::Integer, w::Integer, h::Integer, format::Integer = 0) -> RenderedImage
render_page_raw(d::PdfDocument, page::Integer, dpi::Integer) -> Tuple{RenderedImage,Int,Int}
render_page_with_options(d::PdfDocument, page::Integer, dpi, format, bg_r, bg_g, bg_b, bg_a, transparent_background, render_annotations, jpeg_quality) -> RenderedImage
render_page_with_options_ex(d::PdfDocument, page::Integer, dpi, format, bg_r, bg_g, bg_b, bg_a, transparent_background, render_annotations, jpeg_quality, excluded_layers::AbstractVector{<:AbstractString} = String[]) -> RenderedImage

Function	Description
`render_page(d, page, format=0)`	Renders the page (`0`=PNG).
`render_page_zoom(d, page, zoom, format=0)`	Renders at a given zoom factor.
`render_page_thumbnail(d, page, size, format=0)`	Renders a thumbnail that fits within `size` px.
`render_page_region(d, page, x, y, w, h, format=0)`	Renders a rectangular crop (in user-space points).
`render_page_fit(d, page, w, h, format=0)`	Renders to fit within `w`×`h` px while preserving the aspect ratio.
`render_page_raw(d, page, dpi)`	Renders to a raw premultiplied RGBA8888 buffer and returns `(image, width, height)`.
`render_page_with_options(d, page, dpi, format, bg_r, bg_g, bg_b, bg_a, transparent_background, render_annotations, jpeg_quality)`	Renders with the full set of RenderOptions (background channels are 0 to 1, flags are 0/1).
`render_page_with_options_ex(...; excluded_layers)`	Same as above, plus OCG layer filtering: the specified `/Name` layers are excluded.
`estimate_render_time(d, page)`	Estimates the page's render time (milliseconds).

renderPage / renderPageZoom / renderPageThumbnail are camelCase aliases for render_page / render_page_zoom / render_page_thumbnail.
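The integer format codes (0 = PNG, 1 = JPEG) are easy to mistype, so one option is to derive them from the output file name. The constants and `format_for` below are hypothetical conveniences layered on the documented codes, not part of the binding.

```julia
# Format codes as documented for the rendering functions: 0 = PNG, 1 = JPEG.
const FORMAT_PNG = 0
const FORMAT_JPEG = 1

# Hypothetical helper: choose a format code from an output file name.
function format_for(path::AbstractString)
    ext = lowercase(splitext(path)[2])
    ext in (".jpg", ".jpeg") && return FORMAT_JPEG
    ext == ".png" && return FORMAT_PNG
    throw(ArgumentError("unsupported image extension: $ext"))
end

# Intended use with the documented API (requires PdfOxide to be installed):
#   img = render_page(doc, 0, format_for("page0.jpg"))
#   save(img, "page0.jpg"); close!(img)
```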

Renderer

A reusable renderer handle.

create_renderer(dpi::Integer = 150, format::Integer = 0, quality::Integer = 90, anti_alias::Bool = true) -> Renderer
close!(r::Renderer)

RenderedImage

The result of a rendering call. Fields: width::Int, height::Int, data::Vector{UInt8} (encoded bytes, or raw RGBA for render_page_raw).

save(img::RenderedImage, path::AbstractString)   # write to disk (format inferred)
close!(img::RenderedImage)

DocumentEditor

An edit-side handle for modifying an existing PDF and saving it again.

Opening and source

open_editor(path::AbstractString) -> DocumentEditor
open_editor_from_bytes(data::AbstractVector{UInt8}) -> DocumentEditor

Function	Returns	Description
`is_modified(e)`	`Bool`	Whether the editor has unsaved changes.
`get_source_path(e)`	`String`	The source path the editor was opened from.
`page_count(e)`	`Int`	Number of pages.
`version(e)`	`PdfVersion`	PDF version.

Info and metadata

Function	Description
`get_producer(e)`	Returns the producer from `/Info.Producer`.
`set_producer(e, value)`	Sets `/Info.Producer`.
`get_creation_date(e)`	Returns the creation date from `/Info.CreationDate` (raw PDF date string).
`set_creation_date(e, date_str)`	Sets `/Info.CreationDate` (raw PDF date string).

Saving

save(e::DocumentEditor, path::AbstractString)
save_to_bytes(e::DocumentEditor) -> Vector{UInt8}
save_to_bytes_with_options(e::DocumentEditor, compress::Bool, garbage_collect::Bool, linearize::Bool) -> Vector{UInt8}
extract_pages_to_bytes(e::DocumentEditor, pages::AbstractVector{<:Integer}) -> Vector{UInt8}
save_encrypted(e::DocumentEditor, path::AbstractString, user_password::AbstractString, owner_password::AbstractString)
save_encrypted_to_bytes(e::DocumentEditor, user_password::AbstractString, owner_password::AbstractString) -> Vector{UInt8}
convert_to_pdf_a(e::DocumentEditor, level::Integer)

Function	Description
`save(e, path)`	Saves the edited document to a path.
`save_to_bytes(e)`	Serializes the edited document to bytes.
`save_to_bytes_with_options(e, compress, garbage_collect, linearize)`	Serializes with compression, garbage-collection, and linearization options.
`extract_pages_to_bytes(e, pages)`	Extracts a subset of pages as a new in-memory PDF.
`save_encrypted(e, path, user_pw, owner_pw)`	Saves to a path with AES-256 encryption.
`save_encrypted_to_bytes(e, user_pw, owner_pw)`	Saves to bytes with AES-256 encryption.
`convert_to_pdf_a(e, level)`	Converts to PDF/A in place (`level`: 0=A1b…7=A3u).

Merging and attachments

Function	Description
`merge_from(e, source_path)`	Merges pages from a PDF on disk.
`merge_from_bytes(e, data)`	Merges pages from an in-memory PDF buffer.
`embed_file(e, name, data)`	Embeds a file attachment (`name`, `data` bytes).

Page operations

rotate_all_pages(e::DocumentEditor, degrees::Integer)
rotate_page_by(e::DocumentEditor, page::Integer, degrees::Integer)
get_page_rotation(e::DocumentEditor, page::Integer) -> Int
set_page_rotation(e::DocumentEditor, page::Integer, degrees::Integer)
delete_page(e::DocumentEditor, page::Integer)
move_page(e::DocumentEditor, from::Integer, to::Integer)

Function	Description
`rotate_all_pages(e, degrees)`	Rotates every page (relative).
`rotate_page_by(e, page, degrees)`	Rotates a page additively.
`get_page_rotation(e, page)`	Returns the page's absolute rotation.
`set_page_rotation(e, page, degrees)`	Sets the page's absolute rotation.
`delete_page(e, page)`	Deletes a page.
`move_page(e, from, to)`	Moves a page from one index to another.

Page boxes and cropping

get_page_media_box(e::DocumentEditor, page::Integer) -> NTuple{4,Float64}
get_page_crop_box(e::DocumentEditor, page::Integer) -> NTuple{4,Float64}
set_page_media_box(e::DocumentEditor, page::Integer, x, y, w, h)
set_page_crop_box(e::DocumentEditor, page::Integer, x, y, w, h)
crop_margins(e::DocumentEditor, left::Real, right::Real, top::Real, bottom::Real)

Function	Description
`get_page_media_box(e, page)`	Returns the page's MediaBox.
`get_page_crop_box(e, page)`	Returns the page's CropBox.
`set_page_media_box(e, page, x, y, w, h)`	Sets the page's MediaBox.
`set_page_crop_box(e, page, x, y, w, h)`	Sets the page's CropBox.
`crop_margins(e, left, right, top, bottom)`	Crops every page by the given margins (user-space units).

Erasing and whiteout

erase_region(e::DocumentEditor, page::Integer, x, y, w, h)
erase_regions(e::DocumentEditor, page::Integer, rects::AbstractVector{<:NTuple{4,<:Real}})
clear_erase_regions(e::DocumentEditor, page::Integer)

Function	Description
`erase_region(e, page, x, y, w, h)`	Erases a single rectangular region.
`erase_regions(e, page, rects)`	Erases multiple regions (`rects` is a vector of `(x, y, w, h)` tuples).
`clear_erase_regions(e, page)`	Clears all pending erase-region entries for the page.

Header, footer, and artifact removal

erase_header(d::PdfDocument, page::Integer)
erase_footer(d::PdfDocument, page::Integer)
erase_artifacts(d::PdfDocument, page::Integer)
remove_headers(d::PdfDocument, threshold::Real = 0.5)
remove_footers(d::PdfDocument, threshold::Real = 0.5)
remove_artifacts(d::PdfDocument, threshold::Real = 0.5)

Function	Description
`erase_header(d, page)` / `erase_footer(d, page)` / `erase_artifacts(d, page)`	Erases the header, footer, or artifacts detected on the page.
`remove_headers(d, threshold=0.5)` / `remove_footers(...)` / `remove_artifacts(...)`	Removes repeating headers, footers, or artifacts that exceed the frequency threshold from the whole document.

Flattening annotations

flatten_annotations(e::DocumentEditor, page::Integer)
flatten_all_annotations(e::DocumentEditor)
is_page_marked_for_flatten(e::DocumentEditor, page::Integer) -> Bool
unmark_page_for_flatten(e::DocumentEditor, page::Integer)

Redaction

apply_page_redactions(e::DocumentEditor, page::Integer)
apply_all_redactions(e::DocumentEditor)
is_page_marked_for_redaction(e::DocumentEditor, page::Integer) -> Bool
unmark_page_for_redaction(e::DocumentEditor, page::Integer)
redaction_add(e::DocumentEditor, page::Integer, x1, y1, x2, y2, r, g, b)
redaction_count(e::DocumentEditor, page::Integer) -> Int
redaction_apply(e::DocumentEditor, scrub_metadata::Bool, r::Real, g::Real, b::Real) -> Int
redaction_scrub_metadata(e::DocumentEditor) -> Int

Function	Description
`apply_page_redactions(e, page)`	Applies (burns in) the redactions on a page.
`apply_all_redactions(e)`	Applies all pending redactions.
`is_page_marked_for_redaction(e, page)`	Whether the page is marked for redaction.
`unmark_page_for_redaction(e, page)`	Removes the redaction mark from the page.
`redaction_add(e, page, x1, y1, x2, y2, r, g, b)`	Queues a redaction box with an overlay color (DeviceRGB, 0 to 1).
`redaction_count(e, page)`	Number of redaction regions queued on the page.
`redaction_apply(e, scrub_metadata, r, g, b)`	Destructively applies all queued redactions and returns the number of glyphs removed.
`redaction_scrub_metadata(e)`	Removes Info, XMP, JavaScript, and attachments, and returns the number of items removed.

Form filling and flattening

set_form_field_value(e::DocumentEditor, name::AbstractString, value::AbstractString)
flatten_forms(e::DocumentEditor)
flatten_forms_on_page(e::DocumentEditor, page::Integer)
flatten_warnings_count(e::DocumentEditor) -> Int
flatten_warning(e::DocumentEditor, index::Integer) -> String
import_fdf_bytes(e::DocumentEditor, data::AbstractVector{UInt8})
import_xfdf_bytes(e::DocumentEditor, data::AbstractVector{UInt8})

Function	Description
`set_form_field_value(e, name, value)`	Sets a form field's value (UTF-8).
`flatten_forms(e)`	Flattens all forms (burns the values into the page content).
`flatten_forms_on_page(e, page)`	Flattens the forms on a single page.
`flatten_warnings_count(e)`	Number of warnings raised by the most recent form-flattening save.
`flatten_warning(e, index)`	The `index`-th flattening warning string.
`import_fdf_bytes(e, data)`	Imports FDF form data from bytes.
`import_xfdf_bytes(e, data)`	Imports XFDF form data from bytes.

Barcode stamping

add_barcode_to_page(e::DocumentEditor, page::Integer, b::Barcode, x, y, width, height)

Stamps a Barcode onto the rectangle (x, y, width, height) on a page. See Barcodes.

Lifecycle

close!(e::DocumentEditor) releases the handle.

Pdf

A lightweight creation-side handle returned by the from_* factory functions.

Factories

from_markdown(input::AbstractString) -> Pdf
from_html(input::AbstractString) -> Pdf
from_text(input::AbstractString) -> Pdf
from_image(path::AbstractString) -> Pdf
from_image_bytes(data::AbstractVector{UInt8}) -> Pdf
from_html_css(html::AbstractString, css::AbstractString, font_bytes::Union{Nothing,AbstractVector{UInt8}} = nothing) -> Pdf
from_html_css_with_fonts(html::AbstractString, css::AbstractString, families::AbstractVector{<:AbstractString}, fonts::AbstractVector{<:AbstractVector{UInt8}}) -> Pdf

Function	Description
`from_markdown(input)`	Builds a `Pdf` from Markdown.
`from_html(input)`	Builds a `Pdf` from HTML.
`from_text(input)`	Builds a `Pdf` from plain text.
`from_image(path)`	Builds a `Pdf` from an image file.
`from_image_bytes(data)`	Builds a `Pdf` from in-memory image bytes.
`from_html_css(html, css, font_bytes=nothing)`	Builds from HTML + CSS, optionally with one embedded font.
`from_html_css_with_fonts(html, css, families, fonts)`	Builds from HTML + CSS with a multi-font cascade (`families[i]` names `fonts[i]`).

Methods

Function	Returns	Description
`save(p, path)`	—	Writes the built PDF to a path.
`to_bytes(p)`	`Vector{UInt8}`	Serializes the built PDF to bytes.
`get_page_count(p)`	`Int`	Page count of the builder `Pdf`.
`close!(p)`	—	Releases the handle.

merge_pdfs

merge_pdfs(paths::AbstractVector{<:AbstractString}) -> Vector{UInt8}

Merges the PDFs in paths (in the given order) into a single PDF byte buffer.

Office input

open_from_docx_bytes(data::AbstractVector{UInt8}) -> Pdf
open_from_pptx_bytes(data::AbstractVector{UInt8}) -> Pdf
open_from_xlsx_bytes(data::AbstractVector{UInt8}) -> Pdf

Converts DOCX / PPTX / XLSX bytes into a Pdf.

DocumentBuilder

A structure-aware, fluent PDF builder. Create a page with a4_page / letter_page / page, place content on the returned PageBuilder, call done on each page to commit it, and finally call build / save or one of the encrypted variants.

DocumentBuilder() -> DocumentBuilder

Document-level settings

set_title(b::DocumentBuilder, value::AbstractString)
set_author(b::DocumentBuilder, value::AbstractString)
set_subject(b::DocumentBuilder, value::AbstractString)
set_keywords(b::DocumentBuilder, value::AbstractString)
set_creator(b::DocumentBuilder, value::AbstractString)
on_open(b::DocumentBuilder, value::AbstractString)
language(b::DocumentBuilder, value::AbstractString)
tagged_pdf_ua1(b::DocumentBuilder)
role_map(b::DocumentBuilder, custom::AbstractString, standard::AbstractString)
register_embedded_font(b::DocumentBuilder, name::AbstractString, f::EmbeddedFont)

Function	Description
`set_title` / `set_author` / `set_subject` / `set_keywords` / `set_creator`	Sets the corresponding `/Info` metadata field.
`on_open(b, value)`	Sets the document-open JavaScript action.
`language(b, value)`	Sets the document language (for example `"en-US"`).
`tagged_pdf_ua1(b)`	Enables PDF/UA-1 tagged PDF mode.
`role_map(b, custom, standard)`	Maps a custom structure type to a standard one.
`register_embedded_font(b, name, f)`	Registers a loaded `EmbeddedFont` under `name` (consumes the font).

Pages

a4_page(b::DocumentBuilder) -> PageBuilder
letter_page(b::DocumentBuilder) -> PageBuilder
page(b::DocumentBuilder, width::Real, height::Real) -> PageBuilder

Output

build(b::DocumentBuilder) -> Vector{UInt8}
save(b::DocumentBuilder, path::AbstractString)
save_encrypted_builder(b::DocumentBuilder, path::AbstractString, user_password::AbstractString, owner_password::AbstractString)
to_bytes_encrypted(b::DocumentBuilder, user_password::AbstractString, owner_password::AbstractString) -> Vector{UInt8}

Function	Description
`build(b)`	Builds and returns the PDF bytes (the builder must still be closed afterwards).
`save(b, path)`	Builds and saves to a path.
`save_encrypted_builder(b, path, user_pw, owner_pw)`	Builds and saves with AES-256 encryption.
`to_bytes_encrypted(b, user_pw, owner_pw)`	Builds encrypted bytes (AES-256).

EmbeddedFont

embedded_font_from_file(path::AbstractString) -> EmbeddedFont
embedded_font_from_bytes(data::AbstractVector{UInt8}; name::Union{Nothing,AbstractString} = nothing) -> EmbeddedFont

Function	Description
`embedded_font_from_file(path)`	Loads a TTF/OTF font from a path.
`embedded_font_from_bytes(data; name)`	Loads a font from bytes (if `name` is left empty, the PostScript name is used).

PageBuilder

Returned by a4_page / letter_page / page. Every method mutates the page in place, and calling done(p) commits it to the parent builder (this consumes the handle).

Text and layout

font(p::PageBuilder, name::AbstractString, size::Real)
at(p::PageBuilder, x::Real, y::Real)
heading(p::PageBuilder, level::Integer, text::AbstractString)

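Method	Description
`font(p, name, size)`	Sets the font and size used for subsequent text.
`at(p, x, y)`	Moves the cursor to the absolute position `(x, y)` (in points, measured from the bottom-left).
`heading(p, level, text)`	Emits a heading of the given `level` (1 to 6).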
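Because `at` measures from the bottom-left corner, layouts designed top-down need their y coordinate flipped. A minimal sketch follows; it assumes an A4 page is the ISO 216 size of 210 × 297 mm (the reference does not state the exact dimensions `a4_page` uses), and `y_from_top` is a hypothetical helper.

```julia
# ISO 216 A4 is 210 mm x 297 mm; 1 in = 25.4 mm = 72 pt.
# Assumption: a4_page creates a page of this size.
const A4_WIDTH_PT = 210 / 25.4 * 72
const A4_HEIGHT_PT = 297 / 25.4 * 72

# Hypothetical helper: convert a distance from the top edge into the
# bottom-left-based y coordinate that `at` expects.
y_from_top(page_height::Real, distance_from_top::Real) = page_height - distance_from_top

# Intended use with the documented API (requires PdfOxide to be installed):
#   at(p, 72, y_from_top(A4_HEIGHT_PT, 72))   # one inch in from the left and top
```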

Every string-taking method below shares the form f(p::PageBuilder, value::AbstractString).

Method	Description
`text(p, value)`	Emits a run of body text.
`paragraph(p, value)`	Emits a wrapped paragraph.
`link_url(p, value)`	Turns the preceding text into a URL link.
`link_named(p, value)`	Links the preceding text to a named destination.
`link_javascript(p, value)`	Attaches a JavaScript action to the preceding text.
`on_open(p, value)`	Sets the page-open JavaScript action.
`on_close(p, value)`	Sets the page-close JavaScript action.
`field_keystroke(p, value)` / `field_format(p, value)` / `field_validate(p, value)` / `field_calculate(p, value)`	Attaches a JavaScript action to an AcroForm field.
`sticky_note(p, value)`	Attaches a sticky note to the preceding content.
`watermark(p, value)`	Adds a text watermark.
`stamp(p, value)`	Adds a stamp annotation.
`inline(p, value)`	Adds an inline text run.
`inline_bold(p, value)`	Adds a bold inline run.
`inline_italic(p, value)`	Adds an italic inline run.

Layout methods that take no arguments (each is f(p::PageBuilder)).

Method	Description
`horizontal_rule(p)`	Draws a horizontal rule.
`space(p)`	Inserts vertical space.
`newline(p)`	Starts a new line.
`new_page_same_size(p)`	Starts a new page of the same size.
`watermark_confidential(p)`	Adds a "CONFIDENTIAL" watermark.
`watermark_draft(p)`	Adds a "DRAFT" watermark.

link_page(p::PageBuilder, page_index::Integer)
sticky_note_at(p::PageBuilder, x::Real, y::Real, text::AbstractString)
freetext(p::PageBuilder, x::Real, y::Real, w::Real, h::Real, text::AbstractString)
footnote(p::PageBuilder, ref_mark::AbstractString, note_text::AbstractString)
columns(p::PageBuilder, column_count::Integer, gap_pt::Real, text::AbstractString)
inline_color(p::PageBuilder, r::Real, g::Real, b_::Real, text::AbstractString)

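Method	Description
`link_page(p, page_index)`	Links the preceding text to an internal page index.
`sticky_note_at(p, x, y, text)`	Places a standalone sticky note.
`freetext(p, x, y, w, h, text)`	Places a free-text annotation inside a rectangle.
`footnote(p, ref_mark, note_text)`	Adds a footnote (an inline superscript reference plus the note body at the bottom of the page).
`columns(p, column_count, gap_pt, text)`	Lays out text in balanced columns.
`inline_color(p, r, g, b_, text)`	Adds an inline run in an RGB color (channels are 0 to 1).

Markup annotations

The following four all share f(p::PageBuilder, r::Real, g::Real, b_::Real) and apply to the preceding text in the given RGB color.

Method	Description
`highlight(p, r, g, b)`	Highlights the preceding text.
`underline(p, r, g, b)`	Underlines the preceding text.
`strikeout(p, r, g, b)`	Strikes out the preceding text.
`squiggly(p, r, g, b)`	Draws a squiggly underline under the preceding text.

Form widgets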

text_field(p::PageBuilder, name, x, y, w, h; default_value::Union{Nothing,AbstractString} = nothing)
checkbox(p::PageBuilder, name, x, y, w, h, checked::Bool)
combo_box(p::PageBuilder, name, x, y, w, h, options::AbstractVector{<:AbstractString}; selected::Union{Nothing,AbstractString} = nothing)
radio_group(p::PageBuilder, name, values, xs, ys, ws, hs; selected::Union{Nothing,AbstractString} = nothing)
push_button(p::PageBuilder, name, x, y, w, h, caption::AbstractString)
signature_field(p::PageBuilder, name, x, y, w, h)

Method	Description
`text_field(p, name, x, y, w, h; default_value)`	Adds a single-line text field.
`checkbox(p, name, x, y, w, h, checked)`	Adds a checkbox with the given initial checked state.
`combo_box(p, name, x, y, w, h, options; selected)`	Adds a drop-down combo box.
`radio_group(p, name, values, xs, ys, ws, hs; selected)`	Adds a radio group (each button is described by parallel arrays).
`push_button(p, name, x, y, w, h, caption)`	Adds a clickable push button.
`signature_field(p, name, x, y, w, h)`	Adds an unsigned signature placeholder.

Images and barcodes

image(p::PageBuilder, bytes::AbstractVector{UInt8}, x, y, w, h)
image_with_alt(p::PageBuilder, bytes::AbstractVector{UInt8}, x, y, w, h, alt_text::AbstractString)
image_artifact(p::PageBuilder, bytes::AbstractVector{UInt8}, x, y, w, h)
barcode_1d(p::PageBuilder, barcode_type::Integer, data::AbstractString, x, y, w, h)
barcode_qr(p::PageBuilder, data::AbstractString, x, y, size::Real)

Method	Description
`image(p, bytes, x, y, w, h)`	Embeds an image (raw JPEG/PNG/WebP).
`image_with_alt(p, bytes, x, y, w, h, alt_text)`	Embeds an image with alternative text for accessibility.
`image_artifact(p, bytes, x, y, w, h)`	Embeds a decorative image as an `/Artifact`.
`barcode_1d(p, barcode_type, data, x, y, w, h)`	Places a 1D barcode (`barcode_type`: 0=Code128…7=Codabar).
`barcode_qr(p, data, x, y, size)`	Places a square QR code.

Vector graphics

rect(p::PageBuilder, x, y, w, h)
filled_rect(p::PageBuilder, x, y, w, h, r, g, b_)
line(p::PageBuilder, x1, y1, x2, y2)
stroke_rect(p::PageBuilder, x, y, w, h, width, r, g, b_)
stroke_line(p::PageBuilder, x1, y1, x2, y2, width, r, g, b_)
stroke_rect_dashed(p::PageBuilder, x, y, w, h, width, r, g, b_, dash_array::AbstractVector{<:Real}, phase::Real)
stroke_line_dashed(p::PageBuilder, x1, y1, x2, y2, width, r, g, b_, dash_array::AbstractVector{<:Real}, phase::Real)
text_in_rect(p::PageBuilder, x, y, w, h, text::AbstractString, align::Integer)

Method	Description
`rect(p, x, y, w, h)`	Rectangle outline with a 1pt black stroke.
`filled_rect(p, x, y, w, h, r, g, b)`	Rectangle filled with an RGB color (0 to 1).
`line(p, x1, y1, x2, y2)`	1pt black line.
`stroke_rect(p, x, y, w, h, width, r, g, b)`	Stroked rectangle, `width`pt, RGB.
`stroke_line(p, x1, y1, x2, y2, width, r, g, b)`	Stroked line, `width`pt, RGB.
`stroke_rect_dashed(...)`	Dashed stroked rectangle (`dash_array` holds the dash and gap lengths, `phase` is the offset).
`stroke_line_dashed(...)`	Dashed stroked line.
`text_in_rect(p, x, y, w, h, text, align)`	Draws text inside a rectangle (`align`: 0=Left, 1=Center, 2=Right).

Tables

table(p::PageBuilder, n_columns::Integer, widths::AbstractVector{<:Real}, aligns::AbstractVector{<:Integer}, n_rows::Integer, cell_strings::AbstractMatrix{<:AbstractString}, has_header::Bool)
streaming_table_begin(p::PageBuilder, headers::AbstractVector{<:AbstractString}, widths::AbstractVector{<:Real}, aligns::AbstractVector{<:Integer}, repeat_header::Bool)
streaming_table_begin_v2(p::PageBuilder, headers, widths, aligns, repeat_header::Bool, mode::Integer, sample_rows::Integer, min_col_width_pt::Real, max_col_width_pt::Real, max_rowspan::Integer)
streaming_table_set_batch_size(p::PageBuilder, batch_size::Integer)
streaming_table_pending_row_count(p::PageBuilder) -> Int
streaming_table_batch_count(p::PageBuilder) -> Int
streaming_table_push_row(p::PageBuilder, cells::AbstractVector{<:AbstractString})
streaming_table_push_row_v2(p::PageBuilder, cells::AbstractVector{<:AbstractString}, rowspans::AbstractVector{<:Integer})
streaming_table_flush(p::PageBuilder)
streaming_table_finish(p::PageBuilder)

Method	Description
`table(p, n_columns, widths, aligns, n_rows, cell_strings, has_header)`	Emits a buffered table (`aligns`: 0=Left/1=Center/2=Right; `cell_strings` is row-major, `n_rows × n_columns`).
`streaming_table_begin(p, headers, widths, aligns, repeat_header)`	Opens a streaming table (parallel arrays of length `n_columns`).
`streaming_table_begin_v2(...)`	Opens a streaming table with a width `mode` (0=Fixed, 1=Sample, 2=AutoAll) and `max_rowspan`.
`streaming_table_set_batch_size(p, batch_size)`	Sets the batch size (passing 0 selects 256).
`streaming_table_pending_row_count(p)`	Number of rows pushed since the last batch boundary.
`streaming_table_batch_count(p)`	Number of batches completed so far.
`streaming_table_push_row(p, cells)`	Pushes one row (rowspan=1).
`streaming_table_push_row_v2(p, cells, rowspans)`	Pushes one row with per-cell rowspans (spans of 2 or more).
`streaming_table_flush(p)`	Flushes the current batch.
`streaming_table_finish(p)`	Finishes the streaming table.
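Because `table` takes `cell_strings` as a matrix, rows collected as separate vectors have to be stacked first. The following is a minimal sketch of that step, assuming the rows arrive as equal-length string vectors. `cell_matrix` is a hypothetical helper and not part of PdfOxide, and the widths and alignments in the commented call are sample values only.

```julia
# Hypothetical helper (not part of PdfOxide): stack equal-length rows into an
# n_rows × n_columns Matrix{String}, so that cells[row, col] addresses one cell.
function cell_matrix(rows::AbstractVector{<:AbstractVector{<:AbstractString}})
    isempty(rows) && throw(ArgumentError("at least one row is required"))
    n_columns = length(first(rows))
    all(r -> length(r) == n_columns, rows) ||
        throw(ArgumentError("all rows must have $n_columns cells"))
    # Fill element by element so the row/column orientation is explicit.
    cells = Matrix{String}(undef, length(rows), n_columns)
    for (i, row) in enumerate(rows), (j, cell) in enumerate(row)
        cells[i, j] = String(cell)
    end
    return cells
end

rows = [["Item", "Qty"], ["Widget", "3"], ["Gadget", "12"]]
cells = cell_matrix(rows)
n_rows, n_columns = size(cells)

# With a PageBuilder `p` in hand, the documented call would then look like:
# table(p, n_columns, [200.0, 80.0], [0, 2], n_rows, cells, true)
```

Checking the row lengths up front keeps a ragged input from reaching the native layer, where the mismatch would be harder to diagnose.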

Commit

done(p::PageBuilder)   # commit this page's buffered operations to its parent builder (consumes the handle)

Value types

Immutable structs returned by extraction. Bbox has the fields x, y, width, and height (in PDF user-space units).

Type	Fields
`Bbox`	`x`, `y`, `width`, `height` (`Float64`)
`Char`	`character::UInt32`, `bbox::Bbox`, `font_name::String`, `font_size::Float64`
`Word`	`text`, `bbox`, `font_name`, `font_size`, `bold`
`TextLine`	`text`, `bbox`, `word_count`
`Table`	`row_count`, `col_count`, `has_header`, `cells` (use `cell(t, row, col)` for 0-based cell access)
`Font`	`name`, `type`, `encoding`, `embedded`, `subset`
`Image`	`width`, `height`, `bitsPerComponent`, `format`, `colorspace`, `data`
`Annotation`	`type`, `subtype`, `content`, `author`, `rect::Bbox`, `borderWidth`
`Path`	`bbox::Bbox`, `strokeWidth`, `hasStroke`, `hasFill`, `operationCount`
`SearchResult`	`text`, `page`, `bbox::Bbox`
`FormField`	`name`, `value`, `type`, `readonly`, `required`
`PdfVersion`	`major::Int`, `minor::Int`

cell(t::Table, row::Integer, col::Integer) -> String

FormField accessors

form_field_name(f::FormField) -> String
form_field_value(f::FormField) -> String
form_field_type(f::FormField) -> String
form_field_is_readonly(f::FormField) -> Bool
form_field_is_required(f::FormField) -> Bool

ElementList

element_count(l::ElementList) -> Int
element_type(l::ElementList, index::Integer) -> String
element_text(l::ElementList, index::Integer) -> String
element_rect(l::ElementList, index::Integer) -> Bbox
elements_to_json(l::ElementList) -> String
close!(l::ElementList)

Digital signatures

Certificate

certificate_load_from_bytes(data::AbstractVector{UInt8}, password::AbstractString = "") -> Certificate
certificate_load_from_pem(cert_pem::AbstractString, key_pem::AbstractString) -> Certificate
certificate_get_subject(c::Certificate) -> String
certificate_get_issuer(c::Certificate) -> String
certificate_get_serial(c::Certificate) -> String
certificate_get_validity(c::Certificate) -> Tuple{Int,Int}
certificate_is_valid(c::Certificate) -> Bool

Function	Description
`certificate_load_from_bytes(data, password="")`	Loads signing credentials from PKCS#12 / PFX bytes.
`certificate_load_from_pem(cert_pem, key_pem)`	Loads from a PEM-encoded certificate plus private key.
`certificate_get_subject(c)` / `certificate_get_issuer(c)` / `certificate_get_serial(c)`	Subject / issuer / serial number strings.
`certificate_get_validity(c)`	Returns the validity period as `(not_before, not_after)` in Unix epoch seconds.
`certificate_is_valid(c)`	Whether the certificate is currently valid.
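The validity tuple is a pair of raw epoch seconds, so displaying it usually means converting to dates first. Here is a minimal sketch using Julia's `Dates` standard library. `validity_window` is a hypothetical helper, and the tuple passed to it is sample data standing in for the result of `certificate_get_validity(c)`.

```julia
using Dates

# Hypothetical helper (not part of PdfOxide): convert a (not_before, not_after)
# pair of Unix epoch seconds into UTC DateTime values and report whether the
# instant `at` falls inside that window.
function validity_window(validity::Tuple{Integer,Integer}; at::DateTime = now(UTC))
    not_before = unix2datetime(validity[1])
    not_after = unix2datetime(validity[2])
    return (not_before = not_before,
            not_after = not_after,
            in_window = not_before <= at <= not_after)
end

# Sample values standing in for certificate_get_validity(cert).
window = validity_window((1_700_000_000, 1_731_536_000); at = DateTime(2024, 6, 1))
println(window.not_before, " to ", window.not_after,
        " (in window: ", window.in_window, ")")
```

This only checks the date range. Whether `certificate_is_valid` considers anything beyond the date range is not stated here, so the two results should not be assumed to agree.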

Signing

sign_bytes(pdf::AbstractVector{UInt8}, cert::Certificate, reason::AbstractString, location::AbstractString) -> Vector{UInt8}
sign_bytes_pades(pdf::AbstractVector{UInt8}, cert::Certificate, level::Integer, tsa_url::Union{Nothing,AbstractString}, reason::AbstractString, location::AbstractString; certs = Vector{UInt8}[], crls = Vector{UInt8}[], ocsps = Vector{UInt8}[]) -> Vector{UInt8}
sign_bytes_pades_opts(pdf::AbstractVector{UInt8}, cert::Certificate, level::Integer, tsa_url, reason, location; certs = Vector{UInt8}[], crls = Vector{UInt8}[], ocsps = Vector{UInt8}[]) -> Vector{UInt8}
add_timestamp(pdf_data::AbstractVector{UInt8}, sig_index::Integer, tsa_url::AbstractString) -> Vector{UInt8}

Function	Description
`sign_bytes(pdf, cert, reason, location)`	Signs raw PDF bytes and returns the signed PDF.
`sign_bytes_pades(pdf, cert, level, tsa_url, reason, location; certs, crls, ocsps)`	PAdES baseline signature (`level`: 0=B-B, 1=B-T, 2=B-LT; `tsa_url` is required when `level` is 1 or higher).
`sign_bytes_pades_opts(...)`	Struct-options variant of `sign_bytes_pades` (identical behavior; builds a `PadesSignOptionsC` internally).
`add_timestamp(pdf_data, sig_index, tsa_url)`	Adds an RFC 3161 timestamp to a signature and returns the timestamped PDF.
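The `level` / `tsa_url` rule for `sign_bytes_pades` can be checked before any bytes are handed to the signer. The sketch below encodes only what the table states (three level codes, and a TSA URL required from level 1 upward). `check_pades_args` and `PADES_LEVEL_NAMES` are hypothetical names, not part of PdfOxide, and the URL in the usage line is a placeholder.

```julia
# Level codes as documented for sign_bytes_pades: 0 = B-B, 1 = B-T, 2 = B-LT.
const PADES_LEVEL_NAMES = Dict(0 => "B-B", 1 => "B-T", 2 => "B-LT")

# Hypothetical pre-flight check (not part of PdfOxide): reject unknown levels
# and a missing TSA URL for levels that need a timestamp.
function check_pades_args(level::Integer, tsa_url::Union{Nothing,AbstractString})
    haskey(PADES_LEVEL_NAMES, level) ||
        throw(ArgumentError("unknown PAdES level $level"))
    if level >= 1 && (tsa_url === nothing || isempty(tsa_url))
        throw(ArgumentError("PAdES $(PADES_LEVEL_NAMES[level]) requires a tsa_url"))
    end
    return PADES_LEVEL_NAMES[level]
end

println(check_pades_args(0, nothing))
println(check_pades_args(1, "https://tsa.example.com"))
```

Failing early with an `ArgumentError` gives a clearer message than whatever status code the native call might return for the same mistake.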

SignatureInfo

signature_get_signer_name(s::SignatureInfo) -> String
signature_get_signing_reason(s::SignatureInfo) -> String
signature_get_signing_location(s::SignatureInfo) -> String
signature_get_signing_time(s::SignatureInfo) -> Int
signature_get_certificate(s::SignatureInfo) -> Certificate
signature_get_pades_level(s::SignatureInfo) -> Int
signature_has_timestamp(s::SignatureInfo) -> Bool
signature_get_timestamp(s::SignatureInfo) -> Timestamp
signature_add_timestamp(s::SignatureInfo, ts) -> Bool
signature_verify(s::SignatureInfo) -> Int
signature_verify_detached(s::SignatureInfo, pdf::AbstractVector{UInt8}) -> Int

Function	Description
`signature_get_signer_name(s)` / `signature_get_signing_reason(s)` / `signature_get_signing_location(s)`	Signer name / reason / location.
`signature_get_signing_time(s)`	Signing time (Unix epoch seconds).
`signature_get_certificate(s)`	The signer's `Certificate` (owned handle).
`signature_get_pades_level(s)`	PAdES level code.
`signature_has_timestamp(s)`	Whether the signature carries an embedded RFC 3161 timestamp.
`signature_get_timestamp(s)`	The embedded `Timestamp` (owned handle).
`signature_add_timestamp(s, ts)`	Attaches a `Timestamp`; returns `true` on success.
`signature_verify(s)`	Cryptographic check of the signer attributes (`1`=valid, `0`=invalid, `-1`=unknown).
`signature_verify_detached(s, pdf)`	End-to-end verification against the full PDF bytes (`1`/`0`/`-1`).
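Both verification functions return a tri-state integer rather than a `Bool`, so a plain truthiness test would lump "unknown" together with one of the other outcomes. A minimal sketch of naming the three codes follows. `verification_status` is a hypothetical helper, not part of PdfOxide, and the codes in the loop are sample inputs.

```julia
# Hypothetical helper (not part of PdfOxide): name the documented tri-state
# codes returned by signature_verify / signature_verify_detached.
function verification_status(code::Integer)
    code == 1 && return :valid
    code == 0 && return :invalid
    code == -1 && return :unknown
    throw(ArgumentError("unexpected verification code $code"))
end

# Sample codes standing in for real verification results.
for code in (1, 0, -1)
    println(code, " => ", verification_status(code))
end
```

Treating `:unknown` as its own branch lets calling code decide explicitly whether an unverifiable signature should be accepted, rejected, or escalated.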

Timestamp

timestamp_parse(data::AbstractVector{UInt8}) -> Timestamp
timestamp_get_token(t::Timestamp) -> Vector{UInt8}
timestamp_get_message_imprint(t::Timestamp) -> Vector{UInt8}
timestamp_get_time(t::Timestamp) -> Int
timestamp_get_serial(t::Timestamp) -> String
timestamp_get_tsa_name(t::Timestamp) -> String
timestamp_get_policy_oid(t::Timestamp) -> String
timestamp_get_hash_algorithm(t::Timestamp) -> Int
timestamp_verify(t::Timestamp) -> Bool

Function	Description
`timestamp_parse(data)`	Parses a DER-encoded RFC 3161 TimeStampToken (or a bare TSTInfo).
`timestamp_get_token(t)`	Raw token bytes.
`timestamp_get_message_imprint(t)`	Message imprint digest bytes.
`timestamp_get_time(t)`	Timestamp time (Unix epoch seconds).
`timestamp_get_serial(t)`	Serial number string.
`timestamp_get_tsa_name(t)`	TSA name.
`timestamp_get_policy_oid(t)`	Policy OID.
`timestamp_get_hash_algorithm(t)`	Digest algorithm code.
`timestamp_verify(t)`	Whether the token passes verification.

TsaClient

tsa_client_create(url::AbstractString; username = nothing, password = nothing, timeout::Integer = 30, hash_algo::Integer = 0, use_nonce::Bool = true, cert_req::Bool = true) -> TsaClient
tsa_request_timestamp(t::TsaClient, data::AbstractVector{UInt8}) -> Timestamp
tsa_request_timestamp_hash(t::TsaClient, hash::AbstractVector{UInt8}, hash_algo::Integer) -> Timestamp

Function	Description
`tsa_client_create(url; …)`	Creates an RFC 3161 TSA client (Basic auth, timeout, hash algorithm, nonce, and certificate request are all optional).
`tsa_request_timestamp(t, data)`	Requests a timestamp over `data` (network I/O).
`tsa_request_timestamp_hash(t, hash, hash_algo)`	Requests a timestamp over a precomputed digest.

Dss (Document Security Store)

dss_cert_count(d::Dss) -> Int
dss_crl_count(d::Dss) -> Int
dss_ocsp_count(d::Dss) -> Int
dss_vri_count(d::Dss) -> Int
dss_get_cert(d::Dss, index::Integer) -> Vector{UInt8}
dss_get_crl(d::Dss, index::Integer) -> Vector{UInt8}
dss_get_ocsp(d::Dss, index::Integer) -> Vector{UInt8}

Function	Description
`dss_cert_count(d)` / `dss_crl_count(d)` / `dss_ocsp_count(d)` / `dss_vri_count(d)`	Number of certificates / CRLs / OCSP responses / VRI entries.
`dss_get_cert(d, index)` / `dss_get_crl(d, index)` / `dss_get_ocsp(d, index)`	Bytes of the certificate / CRL / OCSP response at `index`.

Validation results

validate_pdf_a / validate_pdf_ua / validate_pdf_x return PdfAResults / UaResults / PdfXResults, respectively.

Function	Returns	Description
`is_compliant(r::PdfAResults)`	`Bool`	Whether the document conforms to PDF/A.
`is_compliant(r::PdfXResults)`	`Bool`	Whether the document conforms to PDF/X.
`is_accessible(r::UaResults)`	`Bool`	Whether the document meets the PDF/UA accessibility requirements.
`errors(r)`	`Vector{String}`	Error messages (`PdfAResults` / `UaResults` / `PdfXResults`).
`warnings(r)`	`Vector{String}`	Warning messages.
`ua_stats(r::UaResults)`	`NamedTuple`	Accessibility element counts `(structure, images, tables, forms, annotations, pages)`.
`pdf_a_error_count(r)` / `pdf_a_warning_count(r)`	`Int`	Number of PDF/A errors / warnings.
`pdf_ua_error_count(r)` / `pdf_ua_warning_count(r)`	`Int`	Number of PDF/UA errors / warnings.
`pdf_x_error_count(r)`	`Int`	Number of PDF/X errors.

Barcodes

generate_qr_code(data::AbstractString, error_correction::Integer = 0, size_px::Integer = 256) -> Barcode
generate_barcode(data::AbstractString, format::Integer = 0, size_px::Integer = 256) -> Barcode
barcode_get_data(b::Barcode) -> String
barcode_get_format(b::Barcode) -> Int
barcode_get_confidence(b::Barcode) -> Float64
barcode_get_image_png(b::Barcode, size_px::Integer = 256) -> Vector{UInt8}
barcode_get_svg(b::Barcode, size_px::Integer = 256) -> String

Function	Description
`generate_qr_code(data, error_correction=0, size_px=256)`	Generates a QR code.
`generate_barcode(data, format=0, size_px=256)`	Generates a 1D/2D barcode.
`barcode_get_data(b)`	Decoded / encoded payload string.
`barcode_get_format(b)`	Format code.
`barcode_get_confidence(b)`	Decoding confidence (0.0 to 1.0).
`barcode_get_image_png(b, size_px=256)`	Renders to PNG bytes.
`barcode_get_svg(b, size_px=256)`	Renders to an SVG string.

Use add_barcode_to_page(e, page, b, x, y, width, height) to stamp a barcode onto an editor page.

OCR

OcrEngine

ocr_engine_create(det_model_path::AbstractString, rec_model_path::AbstractString, dict_path::AbstractString) -> OcrEngine

Creates an OCR engine from the paths to the detection model, the recognition model, and the dictionary file. Use it with ocr_extract_text(doc, page, engine) and page_needs_ocr(doc, page) (see PdfDocument › OCR). Release it with close!(o::OcrEngine).

Model prefetching

prefetch_models(languages_csv::AbstractString) -> String
prefetch_available() -> Int
model_manifest() -> String

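Function	Description
`prefetch_models(languages_csv)`	Prefetches OCR/layout models for a comma-separated list of languages.
`prefetch_available()`	Whether model prefetching is available.
`model_manifest()`	The bundled model manifest (JSON string).

Global settings and cryptography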

set_log_level(level::Integer)   # 0=Off 1=Error 2=Warn 3=Info 4=Debug 5=Trace
get_log_level() -> Int
set_max_ops_per_stream(limit::Integer) -> Int
set_preserve_unmapped_glyphs(preserve::Integer) -> Int

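Function	Description
`set_log_level(level)`	Sets the global log level (0 to 5).
`get_log_level()`	Gets the current global log level.
`set_max_ops_per_stream(limit)`	Sets the upper limit on content operations per stream and returns the previous limit.
`set_preserve_unmapped_glyphs(preserve)`	Toggles preservation of unmapped glyphs and returns the previous setting.

Crypto policy and inventory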

crypto_active_provider() -> String
crypto_cbom() -> String
crypto_inventory() -> String
crypto_policy() -> String
crypto_set_policy(spec::AbstractString) -> Int
crypto_fips_available() -> Int
crypto_use_fips() -> Int

Function	Description
`crypto_active_provider()`	Name of the currently active crypto provider.
`crypto_cbom()`	Cryptographic Bill of Materials (CBOM, JSON).
`crypto_inventory()`	Inventory of cryptographic algorithms (JSON).
`crypto_policy()`	The currently active crypto policy.
`crypto_set_policy(spec)`	Sets the active crypto policy from `spec` and returns a status code.
`crypto_fips_available()`	Whether a FIPS provider is available.
`crypto_use_fips()`	Whether FIPS mode is enabled.

Error handling

Whenever the C ABI returns a non-success status, a PdfOxideError is thrown.

using PdfOxide

try
    doc = open_document("file.pdf")
    text = extract_text(doc, 0)
    close!(doc)
catch e
    if e isa PdfOxideError
        @warn "PDF error" code=e.code op=e.op
    else
        rethrow()
    end
end

PdfOxideError <: Exception carries the numeric C ABI code and the name of the failed operation.

Complete example
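When an operation throws partway through, an explicit `close!` placed after it is skipped, and the handle is only released once its finalizer runs. One way to make the release deterministic is a small `try`/`finally` wrapper, sketched below. `with_handle`, `FakeHandle`, and `release!` are hypothetical names used so the sketch runs without the native library; only `open_document`, `extract_text`, and `close!` in the trailing comment come from this reference.

```julia
# Hypothetical helper (not part of PdfOxide): run `f` on a handle and always
# call `release` afterwards, even when `f` throws.
function with_handle(f, handle, release)
    try
        return f(handle)
    finally
        release(handle)
    end
end

# Stand-in resource so the sketch runs without a PDF on disk.
mutable struct FakeHandle
    is_open::Bool
end
release!(h::FakeHandle) = (h.is_open = false)

h = FakeHandle(true)
result = with_handle(h, release!) do handle
    handle.is_open ? "work done" : "handle was closed"
end
println(result, "; still open: ", h.is_open)

# With the real bindings this might read, for example:
# text = with_handle(open_document("file.pdf"), close!) do doc
#     extract_text(doc, 0)
# end
```

The exception still propagates to the caller, so a surrounding `catch` that inspects `PdfOxideError` keeps working unchanged.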

using PdfOxide

# --- Creation ---
doc_bytes = let b = DocumentBuilder()
    set_title(b, "Report")
    p = letter_page(b)
    font(p, "Helvetica", 18)
    at(p, 72, 720)
    heading(p, 1, "Quarterly Report")
    paragraph(p, "Generated by PDF Oxide.")
    done(p)
    out = build(b)
    close!(b)
    out
end

# --- Extraction ---
doc = open_from_bytes(doc_bytes)
println("Pages: ", page_count(doc))
for i in 0:page_count(doc)-1
    println("Page ", i + 1, ": ", length(extract_text(doc, i)), " chars")
end

# Words + tables
words = extract_words(doc, 0)
tables = extract_tables(doc, 0)

# Search
for hit in search_all(doc, "report", false)
    println("Found on page ", hit.page, " at ", hit.bbox)
end
close!(doc)

# --- Editing ---
e = open_editor("input.pdf")
rotate_all_pages(e, 90)
set_form_field_value(e, "name", "Jane Doe")
flatten_forms(e)
save(e, "output.pdf")
close!(e)

# --- Rendering ---
doc = open_document("input.pdf")
img = render_page(doc, 0)          # PNG
save(img, "page0.png")
close!(img)
close!(doc)

Bindings for other languages

PDF Oxide provides native bindings for every major ecosystem: Rust, Python, Node.js, WASM, C#, Golang, Java, PHP, Ruby, C++, Swift, Kotlin, Dart, R, Zig, Scala, Clojure, Objective-C, Elixir.

Next steps

Types and enums: all shared types and enums
Page API reference: consistent per-page iteration across bindings
Getting started with Julia: tutorial