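Node.js API Reference
The pdf-oxide npm package ships a native N-API addon with complete TypeScript type definitions. Prebuilt platform binaries are distributed through per-platform subpackages, with a thin TypeScript wrapper on top that provides the class-based API, manager API, builder API, and streaming API described below.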
npm install pdf-oxide
For a WASM build that targets browsers, Deno, Bun, and edge runtimes, see the WASM API reference. For other languages, see Python, Golang, C#, or Rust.
Package exports
import {
  // Core
  PdfDocument, Pdf, DocumentEditor,
  // Write-side fluent API
  DocumentBuilder, PageBuilder, EmbeddedFont, PdfBuilder,
  // Managers
  ExtractionManager, SearchManager, MetadataManager, OutlineManager,
  AnnotationManager, EditingManager, FormFieldManager, XfaManager,
  RenderingManager, ThumbnailManager, RenderOptions,
  SecurityManager, ComplianceManager, AccessibilityManager,
  OptimizationManager, LayerManager, BarcodeManager, BatchManager,
  CacheManager, EnterpriseManager, HybridMLManager,
  ResultAccessorsManager, OcrManager, SignatureManager,
  // Builders
  AnnotationBuilder, MetadataBuilder, SearchOptionsBuilder,
  ConversionOptionsBuilder, StreamingTable,
  // Streams & workers
  SearchStream, ExtractionStream, MetadataStream,
  createSearchStream, createExtractionStream, createMetadataStream,
  WorkerPool, workerPool,
  // Signatures / timestamps
  Timestamp, TimestampHashAlgorithm, TsaClient, signPdfBytesPades,
  // Value types & enums
  Rect, Point, Color, PageSize,
  // Errors
  PdfError, PdfException,
} from "pdf-oxide";
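Every native-backed class exposes close() (and, where needed, Symbol.dispose); on Node.js 22+ you can declare instances with a using declaration to clean up resources automatically.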
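Where a using declaration is not available, the same guarantee can be had with try/finally. The sketch below is one way to do that; withResource and the Closable interface are hypothetical names introduced here for illustration, not part of pdf-oxide. It is written against a structural interface so it runs without the native addon.

```typescript
// Structural stand-in for any native-backed class that exposes close()
// (PdfDocument, Pdf, DocumentEditor, ...). Hypothetical, not a pdf-oxide export.
interface Closable {
  close(): void;
}

// Hypothetical helper: runs `work` with the resource and always calls close(),
// whether `work` returns normally or throws.
function withResource<T extends Closable, R>(resource: T, work: (r: T) => R): R {
  try {
    return work(resource);
  } finally {
    resource.close();
  }
}
```

With the real package, a call might look like withResource(PdfDocument.open("report.pdf"), (doc) => doc.pageCount()), where the path is a placeholder.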
Top-level functions
getVersion(): string // binding version string
getPdfOxideVersion(): string // underlying Rust core version
getActiveCryptoProvider(): string // active crypto backend name
isFipsCryptoAvailable(): boolean
useFipsCryptoProvider(): void // switch to the FIPS provider
// Runtime crypto governance (#230)
setCryptoPolicy(spec: string): void // "compat" | "strict" | "fips-strict"[;…]
cryptoPolicy(): string // active policy as its grammar string
cryptoInventory(): string[] // algorithm tokens used this process
cryptoCbom(): string // CycloneDX 1.6 crypto BOM (JSON)
// OCR model provisioning (process-wide, no handle)
prefetchModels(languages?: string[]): string // download models, returns cache dir
modelManifest(): string // air-gapped model manifest (JSON)
prefetchAvailable(): boolean // true if built with `ocr` feature
// Barcode helpers
generateBarcodeSvg(data: string, format?: number, sizePx?: number): string
generateQrCodeSvg(data: string, errorCorrection?: number, sizePx?: number): string
// PAdES signing of raw bytes (#235)
signPdfBytesPades(
  pdfData: Buffer | Uint8Array,
  cert: { certPem: string; keyPem: string } | { pkcs12: Buffer | Uint8Array; password: string },
  level: PadesLevelValue,
  opts?: { tsaUrl?: string; reason?: string; location?: string; revocation?: object }
): Buffer
// Stream factories (see Streams)
createSearchStream(manager, term, options?): SearchStream
createExtractionStream(manager, startPage, endPage, type?, options?): ExtractionStream
createMetadataStream(manager, startPage, endPage): MetadataStream
// Error utilities
wrapError(error): PdfError
wrapMethod(fn): Function
wrapAsyncMethod(fn): Function
mapFfiErrorCode(code: number): typeof PdfError
PdfDocument
Read-only access to a PDF document, plus conversion and rendering.
Constructors
PdfDocument.open(path: string): PdfDocument
PdfDocument.openFromBuffer(data: Buffer | Uint8Array): PdfDocument
PdfDocument.openWithPassword(path: string, password: string): PdfDocument
PdfDocument.openFromDocxBytes(data: Buffer | Uint8Array): PdfDocument // Office → PDF in memory
PdfDocument.openFromPptxBytes(data: Buffer | Uint8Array): PdfDocument
PdfDocument.openFromXlsxBytes(data: Buffer | Uint8Array): PdfDocument
Document info
pageCount(): number
getPageCount(): number
getVersion(): { major: number; minor: number }
hasStructureTree(): boolean
hasXFA(): boolean
hasDocumentTimestamp(): boolean
getDocumentSecurityStore(): DocumentSecurityStore | null
getPageWidth(pageIndex: number): number
getPageHeight(pageIndex: number): number
getPageRotation(pageIndex: number): number
close(): void
Pages
page(index: number): Page // lightweight lazy page handle (see Page)
Text extraction
extractText(pageIndex: number): string
extractTextAsync(pageIndex: number): Promise<string>
extractAllText(): string
extractWords(pageIndex: number): Word[]
extractTextLines(pageIndex: number): TextLine[]
extractTables(pageIndex: number): Table[]
extractPaths(pageIndex: number): Path[]
classifyPage(pageIndex: number): string // page content classification
classifyDocument(): string // whole-document classification
extractTextAuto(pageIndex: number): string // auto OCR-or-text per page
extractPageAuto(pageIndex: number, optionsJson?: string): string
ocrExtractText(pageIndex: number, engineHandle: any): string
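Most of the extraction methods above take a page index, so per-page processing is a loop bounded by pageCount(). The sketch below illustrates that pattern under the assumption that page indices are 0-based, which the listing does not state. PageTextSource and extractPages are hypothetical names; the interface mirrors only the two signatures it uses.

```typescript
// Minimal structural view of PdfDocument, limited to signatures listed above.
interface PageTextSource {
  pageCount(): number;
  extractText(pageIndex: number): string;
}

// Hypothetical helper: returns one string per page.
// Assumption: page indices run from 0 to pageCount() - 1.
function extractPages(doc: PageTextSource): string[] {
  const pages: string[] = [];
  for (let i = 0; i < doc.pageCount(); i++) {
    pages.push(doc.extractText(i));
  }
  return pages;
}
```

A real PdfDocument should satisfy PageTextSource structurally, so it could be passed in directly.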
Conversion
toMarkdown(pageIndex: number, options?: ConversionOptions): string
toMarkdownAsync(pageIndex: number, options?: ConversionOptions): Promise<string>
toMarkdownAll(options?: ConversionOptions): string
toHtml(pageIndex: number, options?: ConversionOptions): string
toHtmlAll(options?: ConversionOptions): string
toPlainText(pageIndex: number): string
toPlainTextAll(): string
toDocxBytes(): Buffer // PDF → DOCX (round-trip)
toPptxBytes(): Buffer
toXlsxBytes(): Buffer
Resources and structure
getFormFields(): FormField[]
getOutline(): OutlineItem[]
getPageAnnotations(pageIndex: number): AnnotationInfo[]
getEmbeddedFonts(pageIndex: number): FontInfo[]
getEmbeddedImages(pageIndex: number): ImageInfo[]
planSplitByBookmarks(options?: SplitByBookmarksOptions): SplitSegment[]
Search
searchPage(pageIndex: number, query: string, caseSensitive?: boolean): SearchResult[]
searchAll(query: string, caseSensitive?: boolean): SearchResult[]
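As a rough illustration of searchPage, the sketch below counts matches per page. It relies only on the length of the returned array and assumes each element represents one match; the fields of SearchResult are not described here, so none are read. PageSearcher and matchCountsByPage are hypothetical names, and 0-based page indices are assumed.

```typescript
// Structural view of the search surface; results are treated as opaque values.
interface PageSearcher {
  pageCount(): number;
  searchPage(pageIndex: number, query: string, caseSensitive?: boolean): unknown[];
}

// Hypothetical helper: maps page index -> number of matches, omitting pages with none.
// Assumption: one array element per match.
function matchCountsByPage(
  doc: PageSearcher,
  query: string,
  caseSensitive = false,
): Map<number, number> {
  const counts = new Map<number, number>();
  for (let i = 0; i < doc.pageCount(); i++) {
    const hits = doc.searchPage(i, query, caseSensitive).length;
    if (hits > 0) counts.set(i, hits);
  }
  return counts;
}
```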
Rendering
renderPageWithOptions(pageIndex: number, options?: RenderOptions): Uint8Array
renderPageWithOptionsAsync(pageIndex: number, options?: RenderOptions): Promise<Uint8Array>
renderPageFit(pageIndex: number, width: number, height: number, format?: "png" | "jpeg"): Uint8Array
renderPageFitAsync(pageIndex: number, width: number, height: number, format?: "png" | "jpeg"): Promise<Uint8Array>
renderToPixmap(pageIndex: number, dpi?: number): RgbaPixmap
renderToPixmapAsync(pageIndex: number, dpi?: number): Promise<RgbaPixmap>
estimateRenderTime(pageIndex: number, dpi?: number): number
RenderOptions (object literal): { dpi?, format?: "png" | "jpeg", jpegQuality?, background?: [r,g,b,a], renderAnnotations?, transparentBackground? }.
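The sketch below shows one way the object-literal options might be passed to renderPageWithOptions, followed by a sanity check of the result against the standard 8-byte PNG signature. The listing gives no types for dpi, jpegQuality, renderAnnotations, or transparentBackground, so the number and boolean types here are assumptions, as are the names RenderOptionsLiteral, PageRenderer, isPng, and renderPng and the default of 150 dpi.

```typescript
// Mirrors the documented option names; field types not given in the listing are assumed.
interface RenderOptionsLiteral {
  dpi?: number;                                  // assumed: number
  format?: "png" | "jpeg";
  jpegQuality?: number;                          // assumed: number
  background?: [number, number, number, number]; // [r, g, b, a]
  renderAnnotations?: boolean;                   // assumed: boolean
  transparentBackground?: boolean;               // assumed: boolean
}

interface PageRenderer {
  renderPageWithOptions(pageIndex: number, options?: RenderOptionsLiteral): Uint8Array;
}

// Every PNG file starts with this fixed 8-byte signature.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

function isPng(bytes: Uint8Array): boolean {
  return bytes.length >= PNG_SIGNATURE.length && PNG_SIGNATURE.every((b, i) => bytes[i] === b);
}

// Hypothetical helper: renders one page as PNG and rejects output that is not PNG data.
function renderPng(doc: PageRenderer, pageIndex: number, dpi = 150): Uint8Array {
  const image = doc.renderPageWithOptions(pageIndex, { dpi, format: "png", renderAnnotations: true });
  if (!isPng(image)) {
    throw new Error(`page ${pageIndex}: renderer did not return PNG data`);
  }
  return image;
}
```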
Compliance
validatePdfA(level?: "1a" | "1b" | "2a" | "2b" | "2u" | "3a" | "3b" | "3u"): { compliant: boolean; errors: string[]; warnings: string[] }
convertToPdfA(level?: "1a" | "1b" | "2a" | "2b" | "2u" | "3a" | "3b" | "3u"): boolean
toBuffer(): Buffer
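validatePdfA returns a plain object with compliant, errors, and warnings, which makes it easy to turn into a report. The sketch below formats that result; PdfAValidator and describePdfA are hypothetical names, and the default level "2b" is an arbitrary choice for the example rather than the library's default.

```typescript
type PdfALevel = "1a" | "1b" | "2a" | "2b" | "2u" | "3a" | "3b" | "3u";

// Result shape as listed for validatePdfA above.
interface PdfAResult {
  compliant: boolean;
  errors: string[];
  warnings: string[];
}

interface PdfAValidator {
  validatePdfA(level?: PdfALevel): PdfAResult;
}

// Hypothetical helper: one-line summary when compliant, otherwise a bulleted error list.
function describePdfA(doc: PdfAValidator, level: PdfALevel = "2b"): string {
  const result = doc.validatePdfA(level);
  if (result.compliant) {
    return `PDF/A-${level}: compliant (${result.warnings.length} warning(s))`;
  }
  const lines = [`PDF/A-${level}: ${result.errors.length} error(s)`];
  for (const err of result.errors) {
    lines.push(`  - ${err}`);
  }
  return lines.join("\n");
}
```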
Page
A lightweight, lazy page handle returned by doc.page(index). Every method forwards to the parent document.
class Page {
  readonly index: number;
  readonly width: number;
  readonly height: number;
  text(): string;
  markdown(): string;
  html(): string;
  plainText(): string;
  words(): Word[];
  lines(): TextLine[];
  tables(): Table[];
  images(): ImageInfo[];
  paths(): Path[];
  annotations(): AnnotationInfo[];
  fonts(): FontInfo[];
  search(query: string, caseSensitive?: boolean): SearchResult[];
  toString(): string;
}
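Because a Page carries its own index and dimensions, page-level summaries can be written without passing indices around. The sketch below collects orientation and text length per page; LazyPage, PagedDocument, PageSummary, and summarizePages are hypothetical names, and 0-based indices are assumed.

```typescript
// Subset of the Page surface listed above.
interface LazyPage {
  readonly index: number;
  readonly width: number;
  readonly height: number;
  text(): string;
}

interface PagedDocument {
  pageCount(): number;
  page(index: number): LazyPage;
}

interface PageSummary {
  index: number;
  landscape: boolean; // true when the page is wider than it is tall
  chars: number;      // length of the extracted text
}

// Hypothetical helper: walks every page handle and records a small summary.
function summarizePages(doc: PagedDocument): PageSummary[] {
  const summaries: PageSummary[] = [];
  for (let i = 0; i < doc.pageCount(); i++) {
    const page = doc.page(i);
    summaries.push({
      index: page.index,
      landscape: page.width > page.height,
      chars: page.text().length,
    });
  }
  return summaries;
}
```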
Pdf: creation
Create PDFs from a variety of source formats.
Pdf.fromMarkdown(markdown: string): Pdf
Pdf.fromHtml(html: string): Pdf
Pdf.fromText(text: string): Pdf
Pdf.fromImage(path: string): Pdf
Pdf.fromImageBytes(data: Buffer | Uint8Array): Pdf
Pdf.fromHtmlCss(html: string, css: string, fontBytes: Buffer | Uint8Array): Pdf
Pdf.fromHtmlCssWithFonts(html: string, css: string, fonts: [string, Buffer | Uint8Array][]): Pdf
save(path: string): void
saveToBytes(): Buffer
pageCount(): number
close(): void
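A typical creation flow is: build a Pdf from source text, serialize it, then close it. The sketch below follows that flow and checks that the output begins with the "%PDF-" header that every PDF file starts with. PdfLike, PdfFactory, and markdownToPdfBytes are hypothetical names; the factory interface is shaped so that the Pdf class itself, with its static fromMarkdown, should fit structurally.

```typescript
// Subset of the Pdf instance surface listed above.
interface PdfLike {
  saveToBytes(): Uint8Array;
  close(): void;
}

// Anything with a fromMarkdown factory, e.g. the Pdf class and its static method.
interface PdfFactory {
  fromMarkdown(markdown: string): PdfLike;
}

// "%PDF-" in ASCII: every PDF file begins with these bytes.
function startsWithPdfHeader(bytes: Uint8Array): boolean {
  const header = [0x25, 0x50, 0x44, 0x46, 0x2d];
  return bytes.length >= header.length && header.every((b, i) => bytes[i] === b);
}

// Hypothetical helper: Markdown in, PDF bytes out, with the native handle always closed.
function markdownToPdfBytes(factory: PdfFactory, markdown: string): Uint8Array {
  const pdf = factory.fromMarkdown(markdown);
  try {
    const bytes = pdf.saveToBytes();
    if (!startsWithPdfHeader(bytes)) {
      throw new Error("output does not look like a PDF");
    }
    return bytes;
  } finally {
    pdf.close();
  }
}
```

With the real package this might be called as markdownToPdfBytes(Pdf, "# Hello").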
DocumentBuilder / PageBuilder / EmbeddedFont
Low-level fluent document authoring with support for embedded fonts, annotations, AcroForm widgets, tables, and graphics.
EmbeddedFont
EmbeddedFont.fromFile(path: string): EmbeddedFont
EmbeddedFont.fromBytes(data: Buffer | Uint8Array, name?: string): EmbeddedFont
markConsumed(): void
close(): void
DocumentBuilder
DocumentBuilder.create(): DocumentBuilder
// Metadata & document setup
title(title: string): this
author(author: string): this
subject(subject: string): this
keywords(keywords: string): this
creator(creator: string): this
onOpen(script: string): this // document-open JavaScript
taggedPdfUa1(): this // emit Tagged PDF / PDF-UA-1
language(lang: string): this
roleMap(custom: string, standard: string): this
registerEmbeddedFont(name: string, font: EmbeddedFont): this
// Page factories — return a PageBuilder
a4Page(): PageBuilder
letterPage(): PageBuilder
page(width: number, height: number): PageBuilder
// Output
build(): Buffer
save(path: string): void
saveEncrypted(path: string, userPassword: string, ownerPassword: string): void
toBytesEncrypted(userPassword: string, ownerPassword: string): Buffer
clearOpenPage(): void
close(): void
PageBuilder
Returned by a4Page() / letterPage() / page(w, h). Call done() to return to the parent DocumentBuilder.
// Text & layout
font(name: string, size: number): this
at(x: number, y: number): this
text(text: string): this
heading(level: number, text: string): this
paragraph(text: string): this
space(points: number): this
horizontalRule(): this
columns(columnCount: number, gapPt: number, text: string): this
footnote(refMark: string, noteText: string): this
newline(): this
newPageSameSize(): this
// Inline runs
inline(text: string): this
inlineBold(text: string): this
inlineItalic(text: string): this
inlineColor(r: number, g: number, b: number, text: string): this
// Link annotations
linkUrl(url: string): this
linkPage(pageIndex: number): this
linkNamed(destination: string): this
linkJavascript(script: string): this
// Document/field JavaScript actions
onOpen(script: string): this
onClose(script: string): this
fieldKeystroke(script: string): this
fieldFormat(script: string): this
fieldValidate(script: string): this
fieldCalculate(script: string): this
// Markup annotations
highlight(r: number, g: number, b: number): this
underline(r: number, g: number, b: number): this
strikeout(r: number, g: number, b: number): this
squiggly(r: number, g: number, b: number): this
stickyNote(text: string): this
stickyNoteAt(x: number, y: number, text: string): this
freeText(x: number, y: number, w: number, h: number, text: string): this
stamp(typeName: string): this
watermark(text: string): this
watermarkConfidential(): this
watermarkDraft(): this
// AcroForm widgets
textField(name: string, x: number, y: number, w: number, h: number, defaultValue?: string): this
checkbox(name: string, x: number, y: number, w: number, h: number, checked?: boolean): this
comboBox(name: string, x: number, y: number, w: number, h: number, options: string[], selected?: string): this
radioGroup(name: string, buttons: { value: string; x: number; y: number; w: number; h: number }[], selected?: string): this
pushButton(name: string, x: number, y: number, w: number, h: number, caption: string): this
signatureField(name: string, x: number, y: number, w: number, h: number): this
// Barcodes & images
barcode1d(barcodeType: number, data: string, x: number, y: number, w: number, h: number): this
barcodeQr(data: string, x: number, y: number, size: number): this
image(bytes: Buffer | Uint8Array, x: number, y: number, w: number, h: number): this
imageWithAlt(bytes: Buffer | Uint8Array, x: number, y: number, w: number, h: number, altText: string): this
imageArtifact(bytes: Buffer | Uint8Array, x: number, y: number, w: number, h: number): this
// Graphics primitives
rect(x: number, y: number, w: number, h: number): this
filledRect(x: number, y: number, w: number, h: number, r: number, g: number, b: number): this
strokeRect(x: number, y: number, w: number, h: number, lineWidth: number, r: number, g: number, b: number): this
line(x1: number, y1: number, x2: number, y2: number): this
// Tables
table(spec: TableSpec): this
done(): DocumentBuilder
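The builder methods chain: document-level setters return the DocumentBuilder, a page factory switches to a PageBuilder, and done() switches back. The sketch below shows that shape using only signatures from the listings above. The font name "Helvetica", the coordinates (72, 720), and the idea that they are in points are assumptions made for illustration; buildOnePager and the two Like interfaces are hypothetical names.

```typescript
// Subsets of the PageBuilder / DocumentBuilder surfaces listed above.
interface PageBuilderLike {
  font(name: string, size: number): this;
  at(x: number, y: number): this;
  text(text: string): this;
  done(): DocumentBuilderLike;
}

interface DocumentBuilderLike {
  title(title: string): this;
  a4Page(): PageBuilderLike;
  build(): Uint8Array;
  close(): void;
}

// Hypothetical helper: one A4 page with a single run of text, returned as bytes.
function buildOnePager(builder: DocumentBuilderLike, title: string, body: string): Uint8Array {
  try {
    return builder
      .title(title)          // document-level metadata
      .a4Page()              // switch to the page builder
      .font("Helvetica", 12) // assumed font name and size
      .at(72, 720)           // assumed position
      .text(body)
      .done()                // back to the document builder
      .build();
  } finally {
    builder.close();
  }
}
```

With the real package the first argument would presumably be DocumentBuilder.create().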
PdfBuilder
A high-level, metadata-first document builder that produces a Pdf from a source format.
PdfBuilder.create(): PdfBuilder
title(title: string): this
author(author: string): this
subject(subject: string): this
keywords(keywords: string[]): this
addKeyword(keyword: string): this
pageSize(pageSize: string): this
margins(top: number, right: number, bottom: number, left: number): this
fromMarkdown(markdown: string): Pdf
fromHtml(html: string): Pdf
fromText(text: string): Pdf
fromHtmlCss(html: string, css: string, fontBytes: Buffer | Uint8Array): Pdf
fromMarkdownAsync(markdown: string): Promise<Pdf>
fromHtmlAsync(html: string): Promise<Pdf>
fromTextAsync(text: string): Promise<Pdf>
DocumentEditor
In-place modification: pages, metadata, forms, annotations, redaction, and encrypted saving.
Constructors
DocumentEditor.open(path: string): DocumentEditor
DocumentEditor.openFromBytes(data: Buffer | Uint8Array): DocumentEditor
Info and metadata
pageCount(): number
isModified(): boolean
setTitle(title: string): void
setAuthor(author: string): void
setSubject(subject: string): void
getKeywords(): string | null
setKeywords(keywords: string): void
getProducer(): string
setProducer(producer: string): void
getCreationDate(): string
setCreationDate(date: string): void
Page operations
deletePage(pageIndex: number): void
movePage(fromIndex: number, toIndex: number): void
setPageRotation(pageIndex: number, degrees: 0 | 90 | 180 | 270): void
rotateAllPages(degrees: number): void
rotatePageBy(pageIndex: number, degrees: number): void
getPageMediaBox(pageIndex: number): [number, number, number, number]
setPageMediaBox(pageIndex: number, x: number, y: number, width: number, height: number): void
getPageCropBox(pageIndex: number): [number, number, number, number]
setPageCropBox(pageIndex: number, x: number, y: number, width: number, height: number): void
eraseRegions(pageIndex: number, rects: [number, number, number, number][]): void
clearEraseRegions(pageIndex: number): void
Forms and annotations
flattenForms(): void
flattenFormsOnPage(pageIndex: number): void
flattenAnnotations(pageIndex?: number): void
flattenWarnings(): string[]
setFormFieldValue(fieldName: string, value: string): void
importFdfBytes(fdf: Buffer | Uint8Array): void
importXfdfBytes(xfdf: Buffer | Uint8Array): void
isPageMarkedForFlatten(pageIndex: number): boolean
unmarkPageForFlatten(pageIndex: number): void
Redaction and merging
applyPageRedactions(pageIndex: number): void
applyAllRedactions(): void
isPageMarkedForRedaction(pageIndex: number): boolean
unmarkPageForRedaction(pageIndex: number): void
mergeFrom(sourcePath: string): void
mergeFromBytes(data: Buffer | Uint8Array): number
embedFile(name: string, data: Buffer | Uint8Array): void
Saving
save(path: string): void
saveEncrypted(path: string, userPassword: string, ownerPassword: string): void
saveToBytes(): Buffer
saveToBytesWithOptions(compress: boolean, garbageCollect: boolean, linearize: boolean): Buffer
extractPagesToBytes(pageIndices: number[]): Buffer
close(): void
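When removing several pages with deletePage, the order matters if deleting a page shifts the indices of the pages after it. That shifting behaviour is an assumption here, since the listing does not spell it out. The sketch below deletes from the highest index down so the remaining indices stay valid; PageEditor and deletePages are hypothetical names, and 0-based indices are assumed.

```typescript
// Subset of the DocumentEditor surface listed above.
interface PageEditor {
  pageCount(): number;
  deletePage(pageIndex: number): void;
}

// Hypothetical helper: deletes the given pages and returns how many were removed.
// Duplicates are ignored and out-of-range indices are skipped.
function deletePages(editor: PageEditor, indices: number[]): number {
  // Highest index first, so earlier deletions cannot shift later targets.
  const descending = Array.from(new Set(indices)).sort((a, b) => b - a);
  let removed = 0;
  for (const index of descending) {
    if (index >= 0 && index < editor.pageCount()) {
      editor.deletePage(index);
      removed++;
    }
  }
  return removed;
}
```

After a call like this, the editor would still need save(), saveToBytes(), or a similar method to persist the change, followed by close().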
Manager classes
A manager class wraps a PdfDocument (new XManager(doc)) and groups related operations. Most accessor methods are async. Many managers also expose clearCache() / getCacheStats() and destroy().
ExtractionManager
new ExtractionManager(doc: PdfDocument)
extractText(pageIndex: number, options?: object): string
extractAllText(options?: object): string
extractTextBatch(pageIndices: number[], options?: object): string
extractMarkdown(pageIndex: number, options?: object): string
extractAllMarkdown(options?: object): string
getPageWordCount(pageIndex: number): number
getTotalWordCount(): number
getPageCharacterCount(pageIndex: number): number
getTotalCharacterCount(): number
getPageLineCount(pageIndex: number): number
getContentStatistics(): ContentStatistics
searchContent(searchText: string, contextLength?: number): SearchMatch[]
SearchManager
new SearchManager(doc: PdfDocument)
search(searchText: string, pageIndex: number, options?: object): SearchResult[]
searchAll(searchText: string, options?: object): SearchResult[]
countOccurrences(searchText: string, pageIndex: number, options?: object): number
countAllOccurrences(searchText: string, options?: object): number
contains(searchText: string, pageIndex: number, options?: object): boolean
containsAnywhere(searchText: string, options?: object): boolean
getPagesContaining(searchText: string, options?: object): number[]
getSearchStatistics(searchText: string, options?: object): SearchStatistics
findFirst(searchText: string, options?: object): SearchResult | null
findLast(searchText: string, options?: object): SearchResult | null
highlightMatches(searchText: string, options?: object): SearchResult[]
isSearchable(): boolean
getCapabilities(): SearchCapabilities
MetadataManager
new MetadataManager(doc: PdfDocument)
getTitle(): string | null
getAuthor(): string | null
getSubject(): string | null
getKeywords(): string[]
getCreator(): string | null
getProducer(): string | null
getCreationDate(): Date | null
getModificationDate(): Date | null
getAllMetadata(): Record<string, any>
hasMetadata(): boolean
getMetadataSummary(): string
hasKeyword(keyword: string): boolean
getKeywordCount(): number
compareWith(otherDocument: PdfDocument): MetadataComparison
validate(): ValidationResult
OutlineManager
new OutlineManager(doc: PdfDocument)
hasOutlines(): boolean
getOutlineCount(): number
getOutlines(): OutlineItem[]
findByTitle(titleFragment: string): OutlineItem | null
findAllByTitle(titleFragment: string): OutlineItem[]
getOutlinesForPage(pageIndex: number): OutlineItem[]
pageHasOutlines(pageIndex: number): boolean
getOutlineAt(index: number): OutlineItem | null
containsPageNumber(pageNumber: number): boolean
AnnotationManager
new AnnotationManager(doc: PdfDocument)
getAnnotations(): Annotation[]
getAnnotationsByType(type: string): Annotation[]
getAnnotationCount(): number
getAnnotationsByAuthor(author: string): Annotation[]
getAnnotationAuthors(): string[]
getAnnotationsAfter(date: Date): Annotation[]
getAnnotationsBefore(date: Date): Annotation[]
getAnnotationsWithContent(contentFragment: string): Annotation[]
getHighlights(): Annotation[]
getComments(): Annotation[]
getUnderlines(): Annotation[]
getStrikeouts(): Annotation[]
getSquigglies(): Annotation[]
getAnnotationStatistics(): AnnotationStatistics
getRecentAnnotations(days: number): Annotation[]
generateAnnotationSummary(): string
validateAnnotation(annotation: any): AnnotationValidation
EditingManager
new EditingManager(doc: PdfDocument)
addRedaction(page: number, rect: RedactionRect, color?: RgbColor): void
applyRedactions(options?: ApplyRedactionsOptions): number
getRedactionCount(page?: number): number
scrubMetadata(options?: ScrubMetadataOptions): void
flattenForms(): void
flattenFormsPage(page: number): void
flattenAnnotations(): void
flattenAnnotationsPage(page: number): void
importFormDataFromFile(filePath: string): number
importFdfBytes(data: Buffer): number
importXfdfBytes(data: Buffer): number
exportFormDataToBytes(format?: 0 | 1): Buffer
FormFieldManager
new FormFieldManager(doc: PdfDocument)
getAllFields(): Promise<FormField[]>
getField(fieldName: string): Promise<FormField | undefined>
getFieldsOfType(fieldType: FormFieldType): Promise<FormField[]>
getFieldValue(fieldName: string): Promise<string | undefined>
setFieldValue(fieldName: string, value: string): Promise<void>
getFieldCount(): Promise<number>
hasForm(): Promise<boolean>
createField(config: FormFieldConfig): Promise<void>
removeField(fieldName: string): Promise<void>
flattenForm(): Promise<void>
resetForm(): Promise<void>
getFormAcroform(): Promise<any>
exportFormData(filename: string, format?: number): Promise<number>
exportFormDataBytes(format?: number): Promise<Uint8Array>
importFormData(filename: string): Promise<number>
resetAllFields(): Promise<number>
getFieldDefaultValue(fieldName: string): Promise<string>
setFieldDefaultValue(fieldName: string, value: string): Promise<void>
getFieldFlags(fieldName: string): Promise<number>
setFieldFlags(fieldName: string, flags: number): Promise<void>
getFieldTooltip(fieldName: string): Promise<string>
setFieldTooltip(fieldName: string, tooltip: string): Promise<void>
getFieldAlternateName(fieldName: string): Promise<string>
setFieldAlternateName(fieldName: string, alternateName: string): Promise<void>
isFieldReadonly(fieldName: string): Promise<boolean>
setFieldReadonly(fieldName: string, readonly: boolean): Promise<void>
isFieldRequired(fieldName: string): Promise<boolean>
setFieldRequired(fieldName: string, required: boolean): Promise<void>
getFieldBackgroundColor(fieldName: string): Promise<[number, number, number] | null>
getFieldTextColor(fieldName: string): Promise<[number, number, number] | null>
validateField(fieldName: string): Promise<boolean>
getFormStatistics(): Promise<Record<string, number>>
batchSetValues(values: Record<string, string>): Promise<number>
getBatchValues(fieldNames: string[]): Promise<Record<string, string>>
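Since the FormFieldManager methods return promises, a fill-then-verify flow reads naturally with async/await. The sketch below writes a batch of values and reads them back to find fields that did not take the new value. The meaning of the number returned by batchSetValues is not documented here, so the sketch does not rely on it; FormValueAccess and fillAndVerify are hypothetical names.

```typescript
// Subset of the FormFieldManager surface listed above.
interface FormValueAccess {
  hasForm(): Promise<boolean>;
  batchSetValues(values: Record<string, string>): Promise<number>;
  getBatchValues(fieldNames: string[]): Promise<Record<string, string>>;
}

// Hypothetical helper: returns the names of fields whose stored value differs
// from the requested one. If the document has no form, every name is reported.
async function fillAndVerify(
  forms: FormValueAccess,
  values: Record<string, string>,
): Promise<string[]> {
  const names = Object.keys(values);
  if (!(await forms.hasForm())) {
    return names;
  }
  await forms.batchSetValues(values);
  const stored = await forms.getBatchValues(names);
  return names.filter((fieldName) => stored[fieldName] !== values[fieldName]);
}
```

With the real package, forms would presumably be new FormFieldManager(doc).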
XfaManager
new XfaManager(doc: PdfDocument)
hasXfa(): boolean
parseXfaForm(): any
extractFieldData(): Record<string, string | undefined>
getDatasetXml(): string
convertToAcroForm(): boolean
getFieldCount(): Promise<number>
getFieldByIndex(index: number): Promise<XFAFormField | null>
getFieldValue(fieldName: string): Promise<string | null>
setFieldValue(fieldName: string, value: string): Promise<boolean>
getFieldType(fieldName: string): Promise<string | null>
isFieldReadOnly(fieldName: string): Promise<boolean>
getFieldBounds(fieldName: string): Promise<[number, number, number, number] | null>
getFormState(): Promise<Record<string, any> | null>
exportData(filePath: string): Promise<boolean>
importData(filePath: string): Promise<boolean>
flattenForm(): Promise<boolean>
createXfaForm(config: XfaTemplateConfig): Promise<XfaCreationResult>
createFromXdpTemplate(xdpContent: string): Promise<XfaCreationResult>
createFromXmlTemplate(xmlTemplate: string): Promise<XfaCreationResult>
addSubform(parentPath: string, config: XfaSubformConfig): Promise<boolean>
removeXfaForm(): Promise<boolean>
addTextField(pageIndex: number, config: XfaFieldConfig): Promise<XfaFieldHandle | null>
updateField(fieldId: string, updates: Partial<XfaFieldConfig>): Promise<boolean>
removeField(fieldId: string): Promise<boolean>
importXfaData(data: string, options: XfaDataOptions): Promise<boolean>
exportXfaData(options: XfaDataOptions): Promise<string | null>
exportAsXdp(): Promise<string | null>
addFieldScript(fieldName: string, script: XfaScriptConfig): Promise<boolean>
addFormScript(script: XfaScriptConfig): Promise<boolean>
RenderingManager and RenderOptions
new RenderOptions(config?: { dpi?: number; format?: "png" | "jpeg"; quality?: number; maxWidth?: number; maxHeight?: number })
RenderOptions.merge(options: RenderOptions | object | null): RenderOptions
RenderOptions.fromQuality(quality: "draft" | "normal" | "high"): RenderOptions
toJSON(): Record<string, any>
new RenderingManager(doc: PdfDocument)
getMaxResolution(): number
getSupportedColorSpaces(): string[]
getPageDimensions(pageIndex: number): PageDimensions
getDisplaySize(pageIndex: number, zoomLevel: number): PageDimensions
getPageRotation(pageIndex: number): number
getPageCropBox(pageIndex: number): PageBox
getPageMediaBox(pageIndex: number): PageBox
getPageBleedBox(pageIndex: number): PageBox
getPageTrimBox(pageIndex: number): PageBox
getPageArtBox(pageIndex: number): PageBox
calculateZoomForWidth(pageIndex: number, viewportWidth: number): number
calculateZoomForHeight(pageIndex: number, viewportHeight: number): number
calculateZoomToFit(pageIndex: number, viewportWidth: number, viewportHeight: number): number
getEmbeddedFonts(pageIndex: number): any[]
getEmbeddedImages(pageIndex: number): any[]
getPageResources(pageIndex: number): PageResources
getRecommendedResolution(quality: "draft" | "normal" | "high"): number
getRenderingStatistics(): RenderingStatistics
canRenderPage(pageIndex: number): boolean
validateRenderingState(): object
ThumbnailManager
new ThumbnailManager(doc: PdfDocument)
generateThumbnail(pageIndex: number, config?: ThumbnailConfig): Promise<Buffer>
generateAllThumbnails(config?: ThumbnailConfig): Promise<Map<number, Buffer>>
getThumbnailInfo(pageIndex: number): Promise<ThumbnailInfo>
preloadThumbnails(config?: ThumbnailConfig): Promise<void>
getStatistics(): ThumbnailStatistics
SecurityManager
new SecurityManager(doc: PdfDocument)
isEncrypted(): boolean
requiresPassword(): boolean
getEncryptionAlgorithm(): string | null
canPrint(): boolean
canCopy(): boolean
canModify(): boolean
canAnnotate(): boolean
canFillForms(): boolean
isViewOnly(): boolean
getPermissionsSummary(): PermissionsSummary
getSecurityLevel(): SecurityLevel
validateAccessibility(): AccessibilityValidation
generateSecurityReport(): string
SecurityManager.setCryptoPolicy(spec: string): void // static
SecurityManager.getCryptoPolicy(): string // static
SecurityManager.cryptoInventory(): string[] // static
SecurityManager.cryptoCbom(): string // static
ComplianceManager
new ComplianceManager(doc: PdfDocument)
getAllIssues(): Promise<ComplianceIssue[]>
getIssuesOfType(type: ComplianceIssueType): Promise<ComplianceIssue[]>
getIssueCount(): Promise<number>
getErrorCount(): Promise<number>
getWarningCount(): Promise<number>
convertToPdfA(level?: string): Promise<boolean>
convertToPdfUA(): Promise<boolean>
getComplianceReport(complianceType?: string): Promise<string>
checkFontEmbedding(): Promise<boolean>
hasFontsEmbedded(): Promise<boolean>
checkColorSpace(): Promise<boolean>
hasValidColorSpace(): Promise<boolean>
checkTaggedContent(): Promise<boolean>
addMissingTags(): Promise<boolean>
fixFontIssues(): Promise<number>
fixColorIssues(): Promise<number>
removeUnsupportedFeatures(): Promise<number>
getComplianceIssues(): Promise<string[]>
getIssueSeverity(issue: string): string
createComplianceReportFile(filePath: string): Promise<boolean>
getComplianceSummary(): Promise<object>
AccessibilityManager
new AccessibilityManager(doc: PdfDocument)
isTagged(): Promise<boolean>
getStructureTree(): Promise<StructureTree>
autoTag(language?: string): Promise<AutoTagResult>
setAltText(page: number, mcid: number, text: string): Promise<void>
setLanguage(lang: string): Promise<void>
setTitle(title: string): Promise<void>
OptimizationManager
new OptimizationManager(doc: PdfDocument)
subsetFonts(): Promise<OptimizationResult>
downsampleImages(dpi?: number, quality?: number): Promise<OptimizationResult>
deduplicate(): Promise<OptimizationResult>
optimizeFull(dpi?: number, quality?: number): Promise<OptimizationResult>
LayerManager
new LayerManager(doc: PdfDocument)
hasLayers(): boolean
getLayerCount(): number
getLayers(): Layer[]
getLayerByName(name: string): Layer | null
getLayerById(id: string): Layer | null
getRootLayers(): Layer[]
getLayerHierarchy(): LayerHierarchy
getChildLayers(parentId: string): Layer[]
getParentLayer(layerId: string): Layer | null
isLayerVisible(layerId: string): boolean
getVisibilityChain(layerId: string): Layer[]
getLayerUsages(): Record<string, number>
getLayerStatistics(): LayerStatistics
findLayersByPattern(pattern: RegExp | string): Layer[]
validateLayerState(): LayerValidation
BarcodeManager
new BarcodeManager(doc: PdfDocument)
detectBarcodes(pageIndex: number): Promise<DetectedBarcode[]>
detectAllBarcodes(): Promise<Map<number, DetectedBarcode[]>>
getBarcodesOfFormat(format: BarcodeFormat, pageIndex?: number): Promise<DetectedBarcode[]>
getBarcodeCount(): Promise<number>
getCountByFormat(format: BarcodeFormat): Promise<number>
hasBarcode(pageIndex: number): Promise<boolean>
generateBarcode(data: string, config?: BarcodeGenerationConfig): Promise<Buffer>
generateSvg(data: string, config?: BarcodeGenerationConfig): Promise<string>
barcodeToPng(barcodeData: Buffer, sizePx?: number): Promise<Buffer>
barcodeToSvg(barcodeData: Buffer, sizePx?: number): Promise<string>
detectBarcodeFormat(barcodeData: Buffer): BarcodeFormat
decodeBarcodeData(barcodeData: Buffer): string
getDetectionConfidence(barcodeData: Buffer): number
BatchManager
new BatchManager(documents: BatchDocument[])
extractTextBatch(options?: BatchOptions): Promise<BatchResult>
extractMarkdownBatch(options?: BatchOptions): Promise<BatchResult>
extractHtmlBatch(options?: BatchOptions): Promise<BatchResult>
searchBatch(searchText: string, options?: BatchOptions): Promise<BatchResult>
getStatistics(): BatchStatistics
CacheManager
new CacheManager()
set<T>(key: string, value: T, scope?: CacheScope, ttl?: number): void
get<T>(key: string): T | undefined
has(key: string): boolean
delete(key: string): boolean
clear(): void
clearScope(scope: CacheScope): number
getStatistics(): CacheStatistics
setTtl(key: string, ttl: number): boolean
getKeys(scope?: CacheScope): string[]
prune(): number
destroy(): void
EnterpriseManager
new EnterpriseManager(doc: PdfDocument)
applyBates(...): Promise<boolean> // Bates numbering
applyBatesAdvanced(...): Promise<boolean>
comparePages(...): Promise<PageComparisonResult>
compareDocuments(docA: any, docB: any): Promise<DocumentComparisonResult>
stampHeader(...): Promise<boolean>
stampFooter(...): Promise<boolean>
stampHeaderFooter(...): Promise<boolean>
HybridMLManager
new HybridMLManager(doc: PdfDocument)
analyzePage(pageIndex: number): Promise<PageAnalysisResult>
analyzeDocument(): Promise<PageAnalysisResult[]>
getExtractionStrategy(pageIndex: number): Promise<ExtractionStrategy>
detectTables(pageIndex: number): Promise<TableRegion[]>
detectColumns(pageIndex: number): Promise<ColumnRegion[]>
getAverageComplexity(): Promise<number>
getMostCommonContentType(): Promise<ContentType>
estimateProcessingTime(): Promise<number>
ResultAccessorsManager
Lazy property accessors for native search / font / image / annotation result handles.
new ResultAccessorsManager(doc: PdfDocument)
// Search results
getSearchResultLineNumber(results, index): Promise<number>
getSearchResultParagraphNumber(results, index): Promise<number>
getSearchResultConfidence(results, index): Promise<number>
isSearchResultHighlighted(results, index): Promise<boolean>
getSearchResultFontInfo(results, index): Promise<string>
getSearchResultColor(results, index): Promise<[number, number, number]>
getSearchResultRotation(results, index): Promise<number>
getSearchResultObjectId(results, index): Promise<number>
getSearchResultStreamIndex(results, index): Promise<number>
getSearchResultAllProperties(results, index): Promise<SearchResultProperties>
// Fonts
getFontBaseFontName(fonts, index): Promise<string>
getFontDescriptor(fonts, index): Promise<string>
getFontDescendantFont(fonts, index): Promise<string>
getFontToUnicodeCmap(fonts, index): Promise<string>
isFontVertical(fonts, index): Promise<boolean>
getFontWidths(fonts, index): Promise<Float32Array>
getFontAscender(fonts, index): Promise<number>
getFontDescender(fonts, index): Promise<number>
getFontAllProperties(fonts, index): Promise<FontProperties>
// Images
hasImageAlphaChannel(images, index): Promise<boolean>
getImageIccProfile(images, index): Promise<Uint8Array>
getImageFilterChain(images, index): Promise<string>
getImageDecodedData(images, index): Promise<Uint8Array>
getImageWidth(images, index): Promise<number>
getImageHeight(images, index): Promise<number>
getImageColorSpace(images, index): Promise<string>
getImageAllProperties(images, index): Promise<ImageProperties>
// Annotations
getAnnotationModifiedDate(annotations, index): Promise<number>
getAnnotationSubject(annotations, index): Promise<string>
getAnnotationReplyToIndex(annotations, index): Promise<number>
getAnnotationPageNumber(annotations, index): Promise<number>
getAnnotationIconName(annotations, index): Promise<string>
getAnnotationAuthor(annotations, index): Promise<string>
getAnnotationAllProperties(annotations, index): Promise<AnnotationProperties>
OcrManager (ocr feature)
new OcrManager(doc: PdfDocument, options?: ManagerOptions)
destroyOcrEngine(): Promise<void>
pageNeedsOcr(pageIndex: number): Promise<boolean>
recognizePage(pageIndex: number): Promise<string>
extractText(pageIndex: number, config?: OcrConfig): Promise<string>
extractSpans(pageIndex: number, config?: OcrConfig): Promise<OcrSpan[]>
getOcrConfidence(pageIndex: number): Promise<number>
detectTextRegions(pageIndex: number): Promise<TextRegion[]>
setOcrLanguage(language: OcrLanguage | string): Promise<boolean>
setLanguage(language: OcrLanguage): void
getLanguage(): OcrLanguage
getAvailableLanguages(): Promise<OcrLanguage[]>
preprocessPage(pageIndex: number, preprocessingType?: string): Promise<boolean>
getOcrStatistics(pageIndex: number): Promise<OcrResult>
batchRecognizePages(startPage: number, endPage: number): Promise<Map<number, string>>
analyzePage(pageIndex: number, config?: OcrConfig): Promise<OcrPageAnalysis>
analyzeDocument(config?: OcrConfig): Promise<OcrPageAnalysis[]>
getEngineStatus(): Promise<string>
getConfiguration(): object
isAvailable(): Promise<boolean>
getVersion(): Promise<string>
destroy(): Promise<void>
Build with --features ocr (see the OCR guide).
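A minimal sketch of how the methods above might be combined so that OCR only runs on pages that report needing it. The `OcrLike` interface is a local structural subset copied from the signatures listed above. The helper `ocrPagesThatNeedIt` and its `pageCount` parameter are hypothetical and not part of pdf-oxide, and zero-based page indices are assumed.

```typescript
// Structural subset of OcrManager, copied from the signatures above.
interface OcrLike {
  pageNeedsOcr(pageIndex: number): Promise<boolean>;
  extractText(pageIndex: number): Promise<string>;
}

// Hypothetical helper (not a pdf-oxide API): OCR only the pages that
// report needing it. Assumes page indices run from 0 to pageCount - 1.
async function ocrPagesThatNeedIt(
  ocr: OcrLike,
  pageCount: number,
): Promise<Map<number, string>> {
  const results = new Map<number, string>();
  for (let i = 0; i < pageCount; i++) {
    if (await ocr.pageNeedsOcr(i)) {
      results.set(i, await ocr.extractText(i));
    }
  }
  return results;
}
```

An `OcrManager` instance should satisfy `OcrLike` structurally. Pages are processed one at a time here to keep memory use predictable.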
Signatures
SignatureManager
new SignatureManager(doc: PdfDocument)
// Inspect
getSignatures(): Promise<DigitalSignature[]>
getSignatureFields(): Promise<SignatureField[]>
getSignatureCount(): Promise<number>
isSigned(): Promise<boolean>
isCertified(): Promise<boolean>
verifySignatures(): Promise<SignatureValidationResult>
verifySignature(signatureName: string): Promise<SignatureValidationResult>
getSignerName(index: number): Promise<string>
getSigningTime(index: number): Promise<number>
getSigningReason(index: number): Promise<string | null>
getSigningLocation(index: number): Promise<string | null>
getCertificateSubject(index: number): Promise<string>
getCertificateIssuer(index: number): Promise<string>
getCertificateSerial(index: number): Promise<string>
getCertificateValidity(index: number): Promise<[number, number]>
getSignatureDetails(index: number): Promise<Signature | null>
isCertificateValidByIndex(index: number): Promise<boolean>
getCertificateInfo(certificateId: string): Promise<CertificateInfo | null>
getCertificateChain(certificateId: string): Promise<CertificateChain | null>
// Certificates / credentials
loadCertificateFromFile(filePath: string, password?: string, format?: CertificateFormat): Promise<LoadedCertificate | null>
loadCertificateFromBytes(certData: Buffer, password?: string, format?: CertificateFormat): Promise<LoadedCertificate | null>
loadCertificateFromPem(certificatePem: string, ...): Promise<LoadedCertificate | null>
loadCertificateFromDerBytes(certData: Buffer): Promise<any>
loadCredentialsPkcs12(filePath: string, password: string): Promise<SigningCredentials>
loadCredentialsFromDer(certData: Buffer, keyData?: Buffer): Promise<SigningCredentials>
addChainCert(credentials: SigningCredentials, certData: Buffer): Promise<void>
getCertificate(credentials: SigningCredentials): Promise<any>
getCertificateCn(certHandle: any): Promise<string>
getCertificateIssuerFromHandle(certHandle: any): Promise<string>
getCertificateSize(certHandle: any): Promise<number>
getLoadedCertificates(): readonly LoadedCertificate[]
unloadCertificate(certificateId: string): boolean
freeCredentials(credentials: SigningCredentials): void
freeCertificate(certHandle: any): void
// Sign
signDocument(fieldName: string, certificate: LoadedCertificate | string, options?: SigningOptions): Promise<SigningResult>
certifyDocument(fieldName: string, certificate: LoadedCertificate | string, permission: CertificationPermission, options?: SigningOptions): Promise<SigningResult>
signInvisibly(certificate: LoadedCertificate | string, options?: SigningOptions): Promise<SigningResult>
signMultipleFields(signings: { fieldName: string; certificate: LoadedCertificate | string; options?: SigningOptions }[]): Promise<SigningResult[]>
saveSigned(pdfData: Buffer, outputPath: string): Promise<void>
// Signature fields
addSignatureField(config: SignatureFieldConfig): Promise<boolean>
removeSignatureField(fieldName: string): Promise<boolean>
getSignatureFieldNames(): Promise<string[]>
hasSignatureField(fieldName: string): Promise<boolean>
// Timestamps / LTV
addTimestamp(pdfData: Buffer, signatureIndex: number, tsaUrl: string): Promise<Buffer>
addDocumentTimestamp(config: TimestampConfig): Promise<TimestampResult>
embedTimestamp(fieldName: string, config: TimestampConfig): Promise<TimestampResult>
getTimestampInfo(fieldName: string): Promise<object>
hasLtvEnabled(fieldName: string): Promise<boolean>
embedLtv(pdfData: Buffer, ocspData?: Buffer, crlData?: Buffer): Promise<Buffer>
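The index-based inspection methods above can be combined into a per-signature summary. The sketch below is written against a local structural interface, `SignatureInspector`, copied from three of the signatures listed. The helper `summarizeSigners` and the `SignerSummary` shape are hypothetical, and it assumes valid indices run from 0 to `getSignatureCount() - 1`.

```typescript
// Structural subset of SignatureManager, copied from the signatures above.
interface SignatureInspector {
  getSignatureCount(): Promise<number>;
  getSignerName(index: number): Promise<string>;
  getSigningReason(index: number): Promise<string | null>;
}

// Hypothetical result shape for illustration.
interface SignerSummary {
  index: number;
  signer: string;
  reason: string | null;
}

// Hypothetical helper: collect signer name and reason for every signature.
// Assumes indices are zero-based and contiguous.
async function summarizeSigners(
  sigs: SignatureInspector,
): Promise<SignerSummary[]> {
  const count = await sigs.getSignatureCount();
  const out: SignerSummary[] = [];
  for (let index = 0; index < count; index++) {
    out.push({
      index,
      signer: await sigs.getSignerName(index),
      reason: await sigs.getSigningReason(index),
    });
  }
  return out;
}
```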
Signature objects (as returned by doc.signatures())
for (const sig of doc.signatures()) {
sig.signerName; sig.reason; sig.location; sig.signingTime;
sig.verify(); // "Valid" | "Invalid" | "Unknown"
sig.verifyDetached(pdfBytes); // boolean
const cert = sig.getCertificate();
cert.subject; cert.issuer; cert.serial;
cert.notBefore; cert.notAfter; cert.isValid;
}
Timestamp (RFC 3161)
Timestamp.parse(data: Buffer | Uint8Array): Timestamp
get time(): number
get serial(): string
get policyOid(): string
get tsaName(): string
get hashAlgorithm(): TimestampHashAlgorithm
get messageImprint(): Uint8Array
get token(): Uint8Array
verify(): boolean
close(): void
TsaClient
new TsaClient(options: { url: string; timeoutSeconds?: number; hashAlgorithm?: number; useNonce?: boolean; certReq?: boolean; username?: string; password?: string })
requestTimestamp(data: Buffer | Uint8Array): Timestamp
requestTimestampHash(digest: Buffer | Uint8Array, algorithm?: number): Timestamp
close(): void
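Because `Timestamp` exposes `close()`, a request is naturally wrapped in `try`/`finally` so the native handle is released even when verification fails. The sketch below uses local structural interfaces (`TimestampLike`, `TsaClientLike`) copied from the signatures above. `requestVerifiedTimestamp` is a hypothetical helper, and the unit of `time` is passed through unchanged because it is not stated here.

```typescript
// Structural subsets of Timestamp and TsaClient, copied from the signatures above.
interface TimestampLike {
  readonly time: number;
  readonly serial: string;
  verify(): boolean;
  close(): void;
}

interface TsaClientLike {
  requestTimestamp(data: Uint8Array): TimestampLike;
}

// Hypothetical helper: request a timestamp, verify it, and always release
// the token handle. Throws if verification fails.
function requestVerifiedTimestamp(
  client: TsaClientLike,
  data: Uint8Array,
): { time: number; serial: string } {
  const ts = client.requestTimestamp(data);
  try {
    if (!ts.verify()) {
      throw new Error("timestamp token failed verification");
    }
    // Copy the plain values out before the handle is closed.
    return { time: ts.time, serial: ts.serial };
  } finally {
    ts.close();
  }
}
```

The caller remains responsible for calling `close()` on the client itself once no more requests are needed.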
Builders (options and annotations)
AnnotationBuilder
AnnotationBuilder.create(): AnnotationBuilder
asText(): this; asHighlight(): this; asUnderline(): this; asStrikeout(): this; asSquiggly(): this
type(type: string): this
content(content: string): this
author(author: string): this
subject(subject: string): this
color(rgb: number[]): this
colorName(colorName: string): this
opacity(opacity: number): this
bounds(bounds: AnnotationBounds): this
creationDate(date: Date): this
modificationDate(date: Date): this
printable(): this; notPrintable(): this
locked(locked: boolean): this
reply(replyContent: string): this
build(): Annotation
MetadataBuilder
MetadataBuilder.create(): MetadataBuilder
title(title: string): this
author(author: string): this
subject(subject: string): this
keywords(keywords: string[]): this
addKeyword(keyword: string): this
creator(creator: string): this
producer(producer: string): this
creationDate(date: Date): this
modificationDate(date: Date): this
withCurrentDate(): this
customProperty(key: string, value: string): this
customProperties(properties: Record<string, string>): this
build(): Metadata
SearchOptionsBuilder
SearchOptionsBuilder.create(): SearchOptionsBuilder
SearchOptionsBuilder.default(): SearchOptions
SearchOptionsBuilder.strict(): SearchOptions
SearchOptionsBuilder.regex(): SearchOptions
caseSensitive(sensitive: boolean): this
wholeWords(wholeOnly: boolean): this
useRegex(regex: boolean): this
ignoreAccents(ignore: boolean): this
maxResults(max: number): this
searchAnnotations(search: boolean): this
build(): SearchOptions
ConversionOptionsBuilder
ConversionOptionsBuilder.create(): ConversionOptionsBuilder
ConversionOptionsBuilder.default(): ConversionOptions
ConversionOptionsBuilder.textOnly(): ConversionOptions
ConversionOptionsBuilder.highQuality(): ConversionOptions
ConversionOptionsBuilder.fast(): ConversionOptions
preserveFormatting(preserve: boolean): this
detectHeadings(detect: boolean): this
detectTables(detect: boolean): this
detectLists(detect: boolean): this
includeImages(include: boolean): this
imageFormat(format: string): this
imageQuality(quality: number): this
maxImageDimension(maxDimension: number): this
outputEncoding(encoding: string): this
normalizeWhitespace(normalize: boolean): this
extractAnnotations(extract: boolean): this
useStructureTree(use: boolean): this
pageRange(start: number, end: number): this
build(): ConversionOptions
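Every setter in the builders above returns `this`, which is what makes calls chainable. The stand-in below illustrates that pattern with three of the `SearchOptionsBuilder` method names. It is not pdf-oxide's implementation: `MiniSearchOptionsBuilder`, `MiniSearchOptions`, and the default values are assumptions made for the example.

```typescript
// Hypothetical options shape; the library's SearchOptions may differ.
interface MiniSearchOptions {
  caseSensitive: boolean;
  wholeWords: boolean;
  maxResults: number;
}

// Stand-in builder for illustration only. Defaults are assumed.
class MiniSearchOptionsBuilder {
  private opts: MiniSearchOptions = {
    caseSensitive: false,
    wholeWords: false,
    maxResults: 100,
  };

  static create(): MiniSearchOptionsBuilder {
    return new MiniSearchOptionsBuilder();
  }

  // Each setter mutates the pending options and returns `this` for chaining.
  caseSensitive(sensitive: boolean): this {
    this.opts.caseSensitive = sensitive;
    return this;
  }

  wholeWords(wholeOnly: boolean): this {
    this.opts.wholeWords = wholeOnly;
    return this;
  }

  maxResults(max: number): this {
    this.opts.maxResults = max;
    return this;
  }

  // Return a copy so later builder calls cannot alter a built object.
  build(): MiniSearchOptions {
    return { ...this.opts };
  }
}

const strictOptions = MiniSearchOptionsBuilder.create()
  .caseSensitive(true)
  .wholeWords(true)
  .maxResults(10)
  .build();
```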
Streams
All stream classes are standard Node.js Readable streams in object mode.
new SearchStream(manager: SearchManager, query: string, options?: object)
new ExtractionStream(manager: ExtractionManager, startPage: number, endPage: number, type?: "text" | "markdown" | "html", options?: object)
new MetadataStream(manager: RenderingManager, startPage: number, endPage: number)
// Factories
createSearchStream(manager, term, options?): SearchStream
createExtractionStream(manager, startPage, endPage, type?, options?): ExtractionStream
createMetadataStream(manager, startPage, endPage): MetadataStream
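Node.js Readable streams are async iterable, so an object-mode stream can be consumed with `for await` instead of `data` and `end` listeners. The sketch below is typed against `AsyncIterable<T>` rather than a concrete stream class. `takeFromStream` is a hypothetical helper, and `fakeResults` is a stand-in source whose item shape is an assumption, since the shape of the emitted objects is not documented in this section.

```typescript
// Hypothetical helper: collect at most `limit` items from an async-iterable
// source, such as an object-mode Readable.
async function takeFromStream<T>(
  source: AsyncIterable<T>,
  limit: number,
): Promise<T[]> {
  const items: T[] = [];
  if (limit <= 0) return items;
  for await (const item of source) {
    items.push(item);
    // Breaking out of for-await on a Readable destroys the stream,
    // so no further work is scheduled for it.
    if (items.length >= limit) break;
  }
  return items;
}

// Stand-in source for demonstration; a stream from createSearchStream(...)
// would be passed in its place. The item shape here is assumed.
async function* fakeResults(): AsyncGenerator<{ pageIndex: number; text: string }> {
  yield { pageIndex: 0, text: "alpha" };
  yield { pageIndex: 2, text: "beta" };
  yield { pageIndex: 5, text: "gamma" };
}
```

Stopping early this way avoids buffering every result of a broad query in memory.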
StreamingTable
Builds large tables incrementally into a PageBuilder.
new StreamingTable(config: StreamingTableConfig)
pushRow(cells: (string | null | undefined)[]): this
pushRowSpan(cells: (SpanCell | string | null | undefined)[]): this
flush(): this
finish(): Promise<PageBuilder>
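One way to use the row methods above is to push rows in fixed-size batches and call `flush()` after each batch. The sketch below targets a local structural interface, `RowSink`, copied from the `pushRow` and `flush` signatures. `pushRowsInBatches` and its `flushEvery` parameter are hypothetical, and it assumes `flush()` commits the rows pushed so far, which this section does not spell out.

```typescript
type Cells = (string | null | undefined)[];

// Structural subset of StreamingTable, copied from the signatures above.
interface RowSink {
  pushRow(cells: Cells): unknown;
  flush(): unknown;
}

// Hypothetical helper: push every row and flush after each `flushEvery`
// rows, plus once more for any remainder. Returns the number of rows pushed.
function pushRowsInBatches(
  table: RowSink,
  rows: Cells[],
  flushEvery: number,
): number {
  if (!Number.isInteger(flushEvery) || flushEvery < 1) {
    throw new RangeError("flushEvery must be a positive integer");
  }
  let pushed = 0;
  for (const row of rows) {
    table.pushRow(row);
    pushed++;
    if (pushed % flushEvery === 0) table.flush();
  }
  // Flush the final partial batch, if any.
  if (pushed % flushEvery !== 0) table.flush();
  return pushed;
}
```

After the helper returns, the caller would still await `finish()` on the real `StreamingTable` to obtain the `PageBuilder`.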
WorkerPool
new WorkerPool(poolSize?: number) // default 4
workerPool // shared default pool instance
See the Node.js streams guide and the concurrency guide.
Value types and enums
new Rect(x: number, y: number, width: number, height: number)
rect.getRight(): number
rect.getBottom(): number
rect.contains(x: number, y: number): boolean
rect.intersects(other: Rect): boolean
new Point(x: number, y: number)
point.distanceTo(other: Point): number
new Color(red: number, green: number, blue: number)
Color.fromHex(hex: string): Color
color.toHex(): string
enum PageSize { Letter, Legal, A0, A1, A2, A3, A4, A5, A6, B4, B5, B6, Tabloid, Ledger, Custom }
Other exported enums: Align, BarcodeFormat, BarcodeErrorCorrection, ContentType, DigestAlgorithm, FieldVisibility, FormFieldType, ImageFormat, IssueSeverity, ComplianceIssueType, OCRDetectionMode, OCRLanguage, PageComplexity, PdfALevel, PdfUALevel, PdfXLevel, SignatureAlgorithm, ThumbnailSize, XfaFieldType, XfaFormType, ErrorCategory, ErrorSeverity.
Data types
interface Word {
text: string;
x: number; y: number;
width: number; height: number;
}
interface TextLine {
text: string;
y: number;
spans: Span[];
}
interface Span {
text: string;
fontName: string;
fontSize: number;
bbox: [number, number, number, number];
}
interface ImageInfo {
width: number;
height: number;
format: "png" | "jpeg" | "tiff";
colorspace: "rgb" | "gray" | "cmyk" | "indexed";
bitsPerComponent: number;
data: Buffer;
}
interface FontInfo {
name: string;
type: string;
encoding: string;
isEmbedded: boolean;
isSubset: boolean;
size: number;
}
interface AnnotationInfo {
type: string;
subtype: string;
content: string;
x: number; y: number;
width: number; height: number;
author: string;
linkUri?: string;
}
interface FormField {
name: string;
fieldType: string;
value: string;
pageIndex: number;
}
interface SearchResult {
pageIndex: number;
text: string;
position: number;
bounds?: Rect;
}
interface ConversionOptions {
preserveFormatting?: boolean;
includeTables?: boolean;
includeImages?: boolean;
extractHeadings?: boolean;
extractLists?: boolean;
}
interface RgbaPixmap {
width: number;
height: number;
data: Uint8Array; // RGBA, 4 bytes/pixel
}
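The `RgbaPixmap` comment above fixes the layout at 4 bytes per pixel in RGBA order, which is enough to compute a byte offset for any pixel. The sketch below additionally assumes the rows are stored top to bottom and tightly packed with no padding. `pixelAt` is a hypothetical helper, not a pdf-oxide function.

```typescript
interface RgbaPixmap {
  width: number;
  height: number;
  data: Uint8Array; // RGBA, 4 bytes/pixel
}

// Hypothetical helper: read one pixel as [r, g, b, a].
// Assumes row-major storage with no per-row padding.
function pixelAt(
  pixmap: RgbaPixmap,
  x: number,
  y: number,
): [number, number, number, number] {
  const inBounds =
    Number.isInteger(x) && Number.isInteger(y) &&
    x >= 0 && y >= 0 && x < pixmap.width && y < pixmap.height;
  if (!inBounds) {
    throw new RangeError(`pixel (${x}, ${y}) is outside the pixmap`);
  }
  // Skip y full rows, then x pixels, at 4 bytes each.
  const offset = (y * pixmap.width + x) * 4;
  return [
    pixmap.data[offset],
    pixmap.data[offset + 1],
    pixmap.data[offset + 2],
    pixmap.data[offset + 3],
  ];
}
```

Under the same assumption, `data.length` should equal `width * height * 4`, which makes a cheap sanity check before indexing.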
Error handling
Failures throw typed errors that all extend PdfError. Inspect err.message, or branch on the subclass:
import { PdfDocument, PdfError, PdfIoError, PdfParseError, PdfEncryptionError } from "pdf-oxide";
try {
const doc = PdfDocument.open("file.pdf");
const text = doc.extractText(0);
} catch (err) {
if (err instanceof PdfEncryptionError) console.error("wrong password");
else if (err instanceof PdfParseError) console.error("malformed PDF");
else console.error(`Extraction failed: ${err.message}`);
}
Exported error classes: PdfError / PdfException, PdfIoError, PdfParseError, PdfEncryptionError, PdfUnsupportedError, PdfInvalidStateError, PdfDecodeError, PdfEncodeError, PdfFontError, PdfImageError, PdfCircularReferenceError, PdfRecursionLimitError, PdfOcrError, PdfMlError, PdfBarcodeError, plus the manager-specific exceptions (AccessibilityException, ComplianceException, EncryptionException, OptimizationException, RedactionException, RenderingException, SearchException, SignatureException, ValidationException, IoException, ParseException, InvalidStateException, UnsupportedFeatureException, CertificateLoadFailed, SigningFailed, UnknownError).
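Async method suffix
Many methods have a Promise-returning *Async variant (for example, extractText → extractTextAsync, toMarkdown → toMarkdownAsync, renderPageWithOptions → renderPageWithOptionsAsync). The async variants are dispatched to the libuv thread pool and do not block the event loop. Most manager accessor methods are already async. See the async guide.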
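Since the *Async variants return Promises, several pages can be requested at once and awaited together. The sketch below assumes `extractTextAsync` takes the same page-index argument as `extractText` and resolves to a string, because the exact signature is not listed in this section. `AsyncTextSource` and `extractPagesConcurrently` are hypothetical names.

```typescript
// Assumed shape: mirrors extractText(pageIndex) but returns a Promise.
interface AsyncTextSource {
  extractTextAsync(pageIndex: number): Promise<string>;
}

// Hypothetical helper: start all extractions, then wait for every result.
// Results come back in the same order as `pageIndices`.
async function extractPagesConcurrently(
  doc: AsyncTextSource,
  pageIndices: number[],
): Promise<string[]> {
  return Promise.all(pageIndices.map((i) => doc.extractTextAsync(i)));
}
```

`Promise.all` rejects as soon as any page fails. Where partial results are acceptable, `Promise.allSettled` is the usual alternative.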
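Thread safety
PdfDocument is Send + Sync on the Rust side, so it can be shared safely across Node.js Worker threads. Use WorkerPool for CPU-intensive batch work. See the concurrency guide.
Generated types
TypeScript type definitions ship with the package at node_modules/pdf-oxide/lib/index.d.ts. They are the authoritative source for types and include any fields added since this page was last updated.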
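HTML + CSS pipeline
// Single font
const pdf = Pdf.fromHtmlCss(html, css, fontBytes);
// Multiple named fonts (separate variable name so both calls can share a scope)
const pdfWithFonts = Pdf.fromHtmlCssWithFonts(html, css, [
["DejaVu Sans", font1],
["Noto Sans CJK", font2],
]);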
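Multi-target WASM bundling
If you use the WASM build (pdf-oxide-wasm) in a Node.js application, import from pdf-oxide-wasm/nodejs. See JavaScript (WASM) API reference → Multi-target bundling.
Other Language Bindings
PDF Oxide provides native bindings for all major ecosystems: Rust, Python, WASM, C#, Golang, Java, PHP, Ruby, C++, Swift, Kotlin, Dart, R, Julia, Zig, Scala, Clojure, Objective-C, Elixir.
Next steps
- Types and enums: all shared types and enums
- Page API reference: page-by-page iteration that is consistent across bindings
- Node.js quick start: tutorial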