
C# / .NET API Reference

The PdfOxide NuGet package wraps the Rust core via P/Invoke; all 881 declarations are generated by LibraryImport. The package is ready for NativeAOT publishing and is trim-safe. Target frameworks: net8.0, net10.0.

dotnet add package PdfOxide
using PdfOxide.Core;

For other languages see Python, Node.js, Go, or Rust.


Namespaces

using PdfOxide.Core;        // PdfDocument, Pdf, DocumentEditor
using PdfOxide.Extensions;  // LINQ-style extensions
using PdfOxide.Plugins;     // extension points

All types implement IDisposable where appropriate; wrap them in using statements or using declarations.


PdfDocument

Read-only access.

Factory methods

static PdfDocument Open(string path)
static PdfDocument Open(Stream stream)
static PdfDocument OpenFromBytes(ReadOnlySpan<byte> data)
static PdfDocument OpenWithPassword(string path, string password)

Properties

int PageCount { get; }
PdfVersion Version { get; }  // struct { Major, Minor }
bool HasStructureTree { get; }
IReadOnlyList<PdfPage> Pages { get; }  // v0.3.34
PdfPage this[int pageIndex] { get; }   // v0.3.34

PdfPage (v0.3.34)

Lightweight per-page handle with synchronous and asynchronous methods. Each call dispatches to the parent document.

public sealed class PdfPage
{
    public int Index { get; }

    public string ExtractText();
    public Task<string> ExtractTextAsync(CancellationToken ct = default);
    public string ToMarkdown();
    public Task<string> ToMarkdownAsync(CancellationToken ct = default);
    public string ToHtml();
    public string ToPlainText();

    public (string Text, float X, float Y, float W, float H)[] ExtractWords();
    public IReadOnlyList<TextLine> ExtractTextLines();
    public IReadOnlyList<Table> ExtractTables();
    public IReadOnlyList<Char> ExtractChars();
    public IReadOnlyList<ImageInfo> ExtractImages();
    public IReadOnlyList<SearchResult> Search(string query, bool caseSensitive = false);
}
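
As a minimal sketch of the per-page surface, the following walks doc.Pages and prints a short preview of each page's text. It uses only members listed in this reference, but it assumes that PdfDocument implements IDisposable (the Namespaces section implies this without listing it per type). The temp file name is hypothetical; the input is generated with Pdf.FromText so the snippet does not depend on an existing PDF.

```csharp
using System;
using System.IO;
using PdfOxide.Core;

// Hypothetical sample file, generated here so the sketch has something to read.
string path = Path.Combine(Path.GetTempPath(), "pdfoxide-pages.pdf");
Pdf.FromText("Hello from PdfOxide").Save(path);

// Assumption: PdfDocument implements IDisposable.
using var doc = PdfDocument.Open(path);

int pagesSeen = 0;
foreach (PdfPage page in doc.Pages)
{
    string text = page.ExtractText();

    // Print at most the first 60 characters of each page.
    string preview = text.Length > 60 ? text[..60] : text;
    Console.WriteLine($"page {page.Index}: {preview}");
    pagesSeen++;
}
```

The indexer (doc[0]) should give the same kind of handle for a single page when iterating the whole document is unnecessary.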

Text extraction

string ExtractText(int pageIndex)
Task<string> ExtractTextAsync(int pageIndex, CancellationToken ct = default)
string ExtractAllText()
Task<string> ExtractAllTextAsync(CancellationToken ct = default)

string ToMarkdown(int pageIndex)
string ToMarkdownAll()
string ToHtml(int pageIndex)
string ToHtmlAll()
string ToPlainText(int pageIndex)

Structured

IReadOnlyList<Word> ExtractWords(int pageIndex)
IReadOnlyList<TextLine> ExtractTextLines(int pageIndex)
IReadOnlyList<Char> ExtractChars(int pageIndex)
IReadOnlyList<Span> ExtractSpans(int pageIndex)
IReadOnlyList<Table> ExtractTables(int pageIndex)
IReadOnlyList<Path> ExtractPaths(int pageIndex)

Region-based

string ExtractTextInRect(int pageIndex, float x, float y, float width, float height)
IReadOnlyList<Word> ExtractWordsInRect(int pageIndex, float x, float y, float width, float height)
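
A rough sketch of region-based extraction, assuming the rectangle is expressed in the same units as PageInfo.Width and PageInfo.Height. This reference does not state where the coordinate origin lies, so whether the rectangle below covers the top or the bottom half of the page is left open. The temp file name is hypothetical, and PdfDocument is assumed to be IDisposable.

```csharp
using System;
using System.IO;
using PdfOxide.Core;

// Hypothetical sample file so the sketch is self-contained.
string path = Path.Combine(Path.GetTempPath(), "pdfoxide-region.pdf");
Pdf.FromText("Header line\nBody text").Save(path);

// Assumption: PdfDocument implements IDisposable.
using var doc = PdfDocument.Open(path);

// Use the page dimensions to build a rectangle covering half the page,
// starting at the coordinate origin (top or bottom half depends on the origin).
PageInfo info = doc.GetPageInfo(0);
float halfHeight = info.Height / 2f;

string regionText = doc.ExtractTextInRect(0, 0f, 0f, info.Width, halfHeight);
var regionWords = doc.ExtractWordsInRect(0, 0f, 0f, info.Width, halfHeight);

Console.WriteLine($"{regionWords.Count} words in region");
Console.WriteLine(regionText);
```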

Images & resources

IReadOnlyList<ImageInfo> ExtractImages(int pageIndex)
IReadOnlyList<FontInfo> GetFonts(int pageIndex)
IReadOnlyList<AnnotationInfo> GetAnnotations(int pageIndex)
IReadOnlyList<FormField> GetFormFields()
PageInfo GetPageInfo(int pageIndex)
IReadOnlyList<SearchResult> SearchPage(int pageIndex, string query, bool caseSensitive = false)
IReadOnlyList<SearchResult> SearchAll(string query, bool caseSensitive = false)

Pdf — creation

static Pdf FromMarkdown(string markdown)
static Pdf FromHtml(string html)
static Pdf FromText(string text)
static Pdf FromImage(string path)
static Pdf FromImageBytes(ReadOnlySpan<byte> data)

void Save(string path)
Task SaveAsync(string path, CancellationToken ct = default)
byte[] ToBytes()
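
For illustration, a short sketch that builds a PDF from Markdown and then both saves it and serializes it to memory. It assumes Save and ToBytes can be called on the same instance; the output file name is hypothetical.

```csharp
using System;
using System.IO;
using PdfOxide.Core;

string markdown = "# Report\n\nGenerated with **PdfOxide**.";
Pdf pdf = Pdf.FromMarkdown(markdown);

// Write the document to disk (hypothetical output path)...
string outPath = Path.Combine(Path.GetTempPath(), "pdfoxide-report.pdf");
pdf.Save(outPath);

// ...or keep it in memory, e.g. to return it as an HTTP response body.
byte[] bytes = pdf.ToBytes();
Console.WriteLine($"{bytes.Length} bytes; also saved to {outPath}");
```

FromHtml, FromText, and FromImage follow the same shape; only the factory call changes.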

DocumentEditor

static DocumentEditor Open(string path)
static DocumentEditor OpenFromBytes(ReadOnlySpan<byte> data)

// Metadata — properties are get/set
string? Title { get; set; }
string? Author { get; set; }
string? Subject { get; set; }
string? Keywords { get; set; }
int PageCount { get; }

void ApplyMetadata(Metadata metadata)

// Forms
void SetFormFieldValue(string name, string value)
void FlattenForms()

// Save
void Save(string path)
Task SaveAsync(string path, CancellationToken ct = default)
void SaveEncrypted(string path, string userPassword, string ownerPassword)
byte[] ToBytes()
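
A minimal sketch of a metadata edit, using only the members listed above. It assumes DocumentEditor implements IDisposable; both file names are hypothetical, and the input is generated with Pdf.FromText so the snippet runs without an existing PDF.

```csharp
using System;
using System.IO;
using PdfOxide.Core;

// Hypothetical input and output paths.
string inPath = Path.Combine(Path.GetTempPath(), "pdfoxide-draft.pdf");
string outPath = Path.Combine(Path.GetTempPath(), "pdfoxide-final.pdf");
Pdf.FromText("Draft contents").Save(inPath);

// Assumption: DocumentEditor implements IDisposable.
using var editor = DocumentEditor.Open(inPath);

// Metadata properties are get/set, so they can be assigned directly.
editor.Title = "Quarterly report";
editor.Author = "Docs team";

Console.WriteLine($"Editing {editor.PageCount} page(s), title: {editor.Title}");
editor.Save(outPath);
```

ApplyMetadata(new Metadata(Title: "...", Author: "...")) is the one-call alternative; this reference does not say how it treats fields left as null, so the sketch sticks to the properties.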

Coverage note: the .NET binding currently exposes document open / read / convert / create, image extraction, form field read/fill/flatten, and metadata editing. Page operations, annotations, rendering, and signatures are available through the Rust core and other bindings; equivalent .NET surface will be added in a future release.


Extensions (LINQ support)

Exported from PdfOxide.Extensions:

IEnumerable<SearchResult> WhereOnPage(this IEnumerable<SearchResult> src, int page)
IEnumerable<IGrouping<int, SearchResult>> GroupByPage(this IEnumerable<SearchResult> src)
IEnumerable<Word> WithinRect(this IEnumerable<Word> src, float x, float y, float w, float h)

Standard LINQ operators also work directly on the IReadOnlyList<T> results:

// Requires using System.Linq; doc is an open PdfDocument.
var hitsByPage = doc.SearchAll("keyword")
    .GroupBy(r => r.Page)
    .OrderBy(g => g.Key);

See extensions guide for the full list.


Plugins

Exposed under PdfOxide.Plugins — inject classifiers, post-processors, or validators into the extraction pipeline. See plugin guide.


Data types

public readonly record struct PdfVersion(int Major, int Minor);

public readonly record struct Char(
    string Text, float X, float Y,
    float FontSize, string FontName, Rect BBox);

public readonly record struct Span(
    string Text, string FontName, float FontSize, Rect BBox);

public readonly record struct Word(
    string Text, float X, float Y, float Width, float Height);

public readonly record struct TextLine(
    string Text, float Y, IReadOnlyList<Span> Spans);

public readonly record struct SearchResult(
    int Page, string Text, float X, float Y, float Width, float Height);

public readonly record struct ImageInfo(
    int Width, int Height, string Format,
    string Colorspace, int BitsPerComponent, byte[] Data);

public readonly record struct FontInfo(
    string Name, string Type, string Encoding,
    bool IsEmbedded, bool IsSubset, float Size);

public readonly record struct AnnotationInfo(
    string Type, string Subtype, string Content,
    float X, float Y, float Width, float Height,
    string? Author, string? LinkUri);

public readonly record struct FormField(
    string Name, string FieldType, string Value, int PageIndex);

public readonly record struct Rect(
    float X, float Y, float Width, float Height);

public readonly record struct PageInfo(
    float Width, float Height, int Rotation,
    Rect MediaBox, Rect CropBox);

public sealed record Metadata(
    string? Title = null,
    string? Author = null,
    string? Subject = null,
    string? Keywords = null);

Exceptions

public class PdfOxideException : Exception
{
    public int Code { get; }
    public string NativeMessage { get; }
}

Thrown on any Rust-side failure. Catch it at system boundaries (for example, a request handler or job runner); interior code should let it propagate.

Standard .NET exceptions are raised for I/O (FileNotFoundException, UnauthorizedAccessException, etc.) and argument validation (ArgumentOutOfRangeException).
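
One way to apply this at a boundary is sketched below. The helper name TryReadFirstPage is hypothetical, and the sketch assumes PdfOxideException lives in the PdfOxide.Core namespace, which this reference does not state; adjust the using directive if it is declared elsewhere.

```csharp
using System;
using System.IO;
using PdfOxide.Core; // Assumption: PdfOxideException is declared in this namespace.

// Hypothetical boundary helper: returns null instead of throwing.
static string? TryReadFirstPage(string path)
{
    try
    {
        using var doc = PdfDocument.Open(path);
        return doc.PageCount > 0 ? doc.ExtractText(0) : string.Empty;
    }
    catch (Exception ex) when (ex is IOException or UnauthorizedAccessException)
    {
        // Standard .NET I/O failures (FileNotFoundException derives from IOException).
        Console.Error.WriteLine($"Cannot read '{path}': {ex.Message}");
        return null;
    }
    catch (PdfOxideException ex)
    {
        // Rust-side failure: log the native code and message.
        Console.Error.WriteLine($"PdfOxide error {ex.Code}: {ex.NativeMessage}");
        return null;
    }
}

string missing = Path.Combine(Path.GetTempPath(), "pdfoxide-does-not-exist.pdf");
string? result = TryReadFirstPage(missing);
Console.WriteLine(result is null ? "open failed" : "open succeeded");
```

Code below the boundary (parsers, transformers) would call PdfDocument directly and let either exception type bubble up to a helper like this one.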


Thread safety

  • PdfDocument read-only methods are thread-safe, so a single document can be shared across threads concurrently.
  • DocumentEditor is not thread-safe for writes. Use ReaderWriterLockSlim or serialize to one thread.
  • Pdf creation instances are not intended to be shared across threads.

See the concurrency guide for patterns.
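
As a sketch of the first point, the following shares one PdfDocument across Parallel.For iterations. It is one possible pattern, not necessarily the one the concurrency guide recommends. It assumes PdfDocument implements IDisposable; the temp file name is hypothetical.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using PdfOxide.Core;

// Hypothetical sample file so the sketch is self-contained.
string path = Path.Combine(Path.GetTempPath(), "pdfoxide-parallel.pdf");
Pdf.FromText("Shared across threads").Save(path);

// Assumption: PdfDocument implements IDisposable.
using var doc = PdfDocument.Open(path);

// One slot per page; each iteration writes only its own slot,
// so the results array needs no lock.
var lengths = new int[doc.PageCount];
Parallel.For(0, doc.PageCount, i =>
{
    lengths[i] = doc.ExtractText(i).Length;
});

for (int i = 0; i < lengths.Length; i++)
{
    Console.WriteLine($"page {i}: {lengths[i]} characters");
}
```

The same approach does not carry over to DocumentEditor writes, which need external synchronization as described in the list above.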

Async pattern

I/O-bound and CPU-heavy methods have an *Async variant that accepts a CancellationToken, for example ExtractTextAsync, ExtractAllTextAsync, and SaveAsync. See the async guide.
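
A minimal sketch of the pattern with a timeout. It assumes cancellation surfaces as OperationCanceledException, which is the usual .NET convention but is not confirmed by this reference. The 10-second limit and the temp file name are arbitrary choices for illustration, and PdfDocument is assumed to be IDisposable.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using PdfOxide.Core;

// Hypothetical sample file so the sketch is self-contained.
string path = Path.Combine(Path.GetTempPath(), "pdfoxide-async.pdf");
Pdf.FromText("Async extraction").Save(path);

// Assumption: PdfDocument implements IDisposable.
using var doc = PdfDocument.Open(path);

// Request cancellation automatically if extraction runs longer than 10 seconds.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));

string allText;
try
{
    allText = await doc.ExtractAllTextAsync(cts.Token);
}
catch (OperationCanceledException)
{
    // Assumption: a cancelled call throws OperationCanceledException.
    Console.Error.WriteLine("Extraction timed out");
    allText = string.Empty;
}

Console.WriteLine($"{allText.Length} characters extracted");
```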

NativeAOT

Publish with -p:PublishAot=true. No extra configuration is needed: all P/Invoke is source-generated, with no reflection and no dynamic code.