C# / .NET API Reference
The PdfOxide NuGet package wraps the Rust core via LibraryImport-generated P/Invoke (all 881 declarations). NativeAOT-publish-ready and trim-safe. Target frameworks: net8.0, net10.0.
dotnet add package PdfOxide
using PdfOxide.Core;
For other languages see Python, Node.js, Go, or Rust.
Namespaces
using PdfOxide.Core; // PdfDocument, Pdf, DocumentEditor
using PdfOxide.Extensions; // LINQ-style extensions
using PdfOxide.Plugins; // extension points
All types implement IDisposable where appropriate. Dispose them with using blocks or using declarations.
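As a minimal sketch of the disposal pattern, the helper below opens a document with a using declaration and reads its first page. It assumes that PdfDocument is one of the types that implement IDisposable and that page indices are zero-based; neither is stated explicitly on this page. QuickStart and Summarize are illustrative names, not part of the package.

```csharp
using System;
using PdfOxide.Core;

public static class QuickStart
{
    // Returns the page count and the text of the first page (empty if there are no pages).
    public static (int PageCount, string FirstPageText) Summarize(string path)
    {
        // Assumption: PdfDocument is IDisposable, so the native handle is
        // released when this using declaration goes out of scope.
        using var doc = PdfDocument.Open(path);

        // Assumption: page indices start at 0.
        string firstPage = doc.PageCount > 0 ? doc.ExtractText(0) : string.Empty;
        return (doc.PageCount, firstPage);
    }
}
```

The values are copied out before the method returns, so nothing that depends on the native handle escapes the using scope.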
PdfDocument
Read-only access.
Factory methods
static PdfDocument Open(string path)
static PdfDocument Open(Stream stream)
static PdfDocument OpenFromBytes(ReadOnlySpan<byte> data)
static PdfDocument OpenWithPassword(string path, string password)
Properties
int PageCount { get; }
PdfVersion Version { get; } // struct { Major, Minor }
bool HasStructureTree { get; }
IReadOnlyList<PdfPage> Pages { get; } // v0.3.34
PdfPage this[int pageIndex] { get; } // v0.3.34
PdfPage (v0.3.34)
Lightweight per-page handle exposing both sync and async methods. Calls dispatch to the parent document.
public sealed class PdfPage
{
public int Index { get; }
public string ExtractText();
public Task<string> ExtractTextAsync(CancellationToken ct = default);
public string ToMarkdown();
public Task<string> ToMarkdownAsync(CancellationToken ct = default);
public string ToHtml();
public string ToPlainText();
public (string Text, float X, float Y, float W, float H)[] ExtractWords();
public IReadOnlyList<TextLine> ExtractTextLines();
public IReadOnlyList<Table> ExtractTables();
public IReadOnlyList<Char> ExtractChars();
public IReadOnlyList<ImageInfo> ExtractImages();
public IReadOnlyList<SearchResult> Search(string query, bool caseSensitive = false);
}
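The per-page handle can be combined with the Pages property to walk a document. The sketch below converts each page to Markdown. PageWalk and MarkdownPerPage are hypothetical names, and var is used for the page variable so the example does not depend on which namespace PdfPage lives in.

```csharp
using System.Collections.Generic;
using PdfOxide.Core;

public static class PageWalk
{
    // Converts every page of an already-open document to Markdown, in page order.
    public static List<string> MarkdownPerPage(PdfDocument doc)
    {
        var result = new List<string>(doc.PageCount);
        foreach (var page in doc.Pages)
        {
            // Each PdfPage call dispatches to the parent document.
            result.Add(page.ToMarkdown());
        }
        return result;
    }
}
```

The caller keeps ownership of the document, so it must stay open (not disposed) while the pages are enumerated.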
Text extraction
string ExtractText(int pageIndex)
Task<string> ExtractTextAsync(int pageIndex, CancellationToken ct = default)
string ExtractAllText()
Task<string> ExtractAllTextAsync(CancellationToken ct = default)
string ToMarkdown(int pageIndex)
string ToMarkdownAll()
string ToHtml(int pageIndex)
string ToHtmlAll()
string ToPlainText(int pageIndex)
Structured
IReadOnlyList<Word> ExtractWords(int pageIndex)
IReadOnlyList<TextLine> ExtractTextLines(int pageIndex)
IReadOnlyList<Char> ExtractChars(int pageIndex)
IReadOnlyList<Span> ExtractSpans(int pageIndex)
IReadOnlyList<Table> ExtractTables(int pageIndex)
IReadOnlyList<Path> ExtractPaths(int pageIndex)
Region-based
string ExtractTextInRect(int pageIndex, float x, float y, float width, float height)
IReadOnlyList<Word> ExtractWordsInRect(int pageIndex, float x, float y, float width, float height)
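A possible use of the region-based calls is reading one column of a page. The sketch below extracts the left half of a page, using GetPageInfo for the page size. It assumes the rectangle uses the same units as PageInfo.Width and PageInfo.Height and that x = 0 is the left edge; the coordinate origin is not specified on this page, which is why the rectangle spans the full height. RegionExtract and LeftHalfText are illustrative names.

```csharp
using PdfOxide.Core;

public static class RegionExtract
{
    // Returns the text found in the left half of the given page.
    public static string LeftHalfText(PdfDocument doc, int pageIndex)
    {
        var info = doc.GetPageInfo(pageIndex);

        // Assumption: rect coordinates share units with PageInfo.Width/Height.
        // Covering the full height avoids depending on where the y origin sits.
        return doc.ExtractTextInRect(pageIndex, 0f, 0f, info.Width / 2f, info.Height);
    }
}
```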
Images & resources
IReadOnlyList<ImageInfo> ExtractImages(int pageIndex)
IReadOnlyList<FontInfo> GetFonts(int pageIndex)
IReadOnlyList<AnnotationInfo> GetAnnotations(int pageIndex)
IReadOnlyList<FormField> GetFormFields()
PageInfo GetPageInfo(int pageIndex)
Search
IReadOnlyList<SearchResult> SearchPage(int pageIndex, string query, bool caseSensitive = false)
IReadOnlyList<SearchResult> SearchAll(string query, bool caseSensitive = false)
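As an illustration of consuming search results, the sketch below counts hits per page from SearchAll. It relies only on the Page field of SearchResult as listed under Data types. SearchSummary and HitsPerPage are hypothetical names.

```csharp
using System.Collections.Generic;
using PdfOxide.Core;

public static class SearchSummary
{
    // Maps each page value reported by SearchResult.Page to its number of hits.
    public static SortedDictionary<int, int> HitsPerPage(PdfDocument doc, string query)
    {
        var counts = new SortedDictionary<int, int>();
        foreach (var hit in doc.SearchAll(query, caseSensitive: false))
        {
            // TryGetValue leaves n at 0 when the page has not been seen yet.
            counts.TryGetValue(hit.Page, out int n);
            counts[hit.Page] = n + 1;
        }
        return counts;
    }
}
```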
Pdf — creation
static Pdf FromMarkdown(string markdown)
static Pdf FromHtml(string html)
static Pdf FromText(string text)
static Pdf FromImage(string path)
static Pdf FromImageBytes(ReadOnlySpan<byte> data)
void Save(string path)
Task SaveAsync(string path, CancellationToken ct = default)
byte[] ToBytes()
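A short sketch of in-memory creation: build a PDF from Markdown and return the serialized bytes. It assumes Pdf implements IDisposable (the page says only "where appropriate"). CreateExample, MarkdownToPdfBytes, and LooksLikePdf are illustrative names; the header check relies on the PDF file format, which begins with the marker %PDF-.

```csharp
using System;
using PdfOxide.Core;

public static class CreateExample
{
    // Renders Markdown to a PDF and returns the serialized bytes.
    public static byte[] MarkdownToPdfBytes(string markdown)
    {
        // Assumption: Pdf is IDisposable; drop the using if it is not.
        using var pdf = Pdf.FromMarkdown(markdown);
        return pdf.ToBytes();
    }

    // Cheap sanity check: PDF files start with the "%PDF-" header.
    public static bool LooksLikePdf(ReadOnlySpan<byte> data) =>
        data.StartsWith("%PDF-"u8);
}
```

ToBytes suits web responses or blob storage where no file path exists; Save and SaveAsync cover the file-based case.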
DocumentEditor
static DocumentEditor Open(string path)
static DocumentEditor OpenFromBytes(ReadOnlySpan<byte> data)
// Metadata — properties are get/set
string? Title { get; set; }
string? Author { get; set; }
string? Subject { get; set; }
string? Keywords { get; set; }
int PageCount { get; }
void ApplyMetadata(Metadata metadata)
// Forms
void SetFormFieldValue(string name, string value)
void FlattenForms()
// Save
void Save(string path)
Task SaveAsync(string path, CancellationToken ct = default)
void SaveEncrypted(string path, string userPassword, string ownerPassword)
byte[] ToBytes()
Coverage note: the .NET binding currently exposes document open / read / convert / create, image extraction, form field read/fill/flatten, and metadata editing. Page operations, annotations, rendering, and signatures are available through the Rust core and other bindings; equivalent .NET surface will be added in a future release.
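The metadata properties listed above can be used as in the following sketch, which edits a document held in memory and returns the updated bytes. It assumes DocumentEditor implements IDisposable. MetadataEdit and Retitle are hypothetical names.

```csharp
using PdfOxide.Core;

public static class MetadataEdit
{
    // Sets the title and author of a PDF held in memory and returns the edited bytes.
    public static byte[] Retitle(byte[] source, string title, string author)
    {
        // Assumption: DocumentEditor is IDisposable.
        using var editor = DocumentEditor.OpenFromBytes(source);

        // Metadata properties are get/set, so plain assignment is enough.
        editor.Title = title;
        editor.Author = author;

        return editor.ToBytes();
    }
}
```

ApplyMetadata with a Metadata record is the alternative when several fields are supplied together.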
Extensions (LINQ support)
Exported from PdfOxide.Extensions:
IEnumerable<SearchResult> WhereOnPage(this IEnumerable<SearchResult> src, int page)
IEnumerable<IGrouping<int, SearchResult>> GroupByPage(this IEnumerable<SearchResult> src)
IEnumerable<Word> WithinRect(this IEnumerable<Word> src, float x, float y, float w, float h)
Use the existing IReadOnlyList<T> results with LINQ directly:
var hitsByPage = doc.SearchAll("keyword")
.GroupBy(r => r.Page)
.OrderBy(g => g.Key);
See extensions guide for the full list.
Plugins
Exposed under PdfOxide.Plugins — inject classifiers, post-processors, or validators into the extraction pipeline. See plugin guide.
Data types
public readonly record struct PdfVersion(int Major, int Minor);
public readonly record struct Char(
string Text, float X, float Y,
float FontSize, string FontName, Rect BBox);
public readonly record struct Span(
string Text, string FontName, float FontSize, Rect BBox);
public readonly record struct Word(
string Text, float X, float Y, float Width, float Height);
public readonly record struct TextLine(
string Text, float Y, IReadOnlyList<Span> Spans);
public readonly record struct SearchResult(
int Page, string Text, float X, float Y, float Width, float Height);
public readonly record struct ImageInfo(
int Width, int Height, string Format,
string Colorspace, int BitsPerComponent, byte[] Data);
public readonly record struct FontInfo(
string Name, string Type, string Encoding,
bool IsEmbedded, bool IsSubset, float Size);
public readonly record struct AnnotationInfo(
string Type, string Subtype, string Content,
float X, float Y, float Width, float Height,
string? Author, string? LinkUri);
public readonly record struct FormField(
string Name, string FieldType, string Value, int PageIndex);
public readonly record struct Rect(
float X, float Y, float Width, float Height);
public readonly record struct PageInfo(
float Width, float Height, int Rotation,
Rect MediaBox, Rect CropBox);
public sealed record Metadata(
string? Title = null,
string? Author = null,
string? Subject = null,
string? Keywords = null);
Exceptions
public class PdfOxideException : Exception
{
public int Code { get; }
public string NativeMessage { get; }
}
Thrown on any Rust-side failure. Catch it at system boundaries; interior code should let it propagate.
Standard .NET exceptions are raised for I/O (FileNotFoundException, UnauthorizedAccessException, etc.) and argument validation (ArgumentOutOfRangeException).
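A boundary-level handler might look like the sketch below, which separates native failures from ordinary I/O errors. It assumes PdfOxideException is reachable through the PdfOxide.Core namespace; adjust the using directive if it lives elsewhere. SafeOpen and TryExtractAllText are illustrative names.

```csharp
using System;
using System.IO;
using PdfOxide.Core;

public static class SafeOpen
{
    // Returns true and the document text on success; false and an empty string on failure.
    public static bool TryExtractAllText(string path, out string text)
    {
        try
        {
            using var doc = PdfDocument.Open(path);
            text = doc.ExtractAllText();
            return true;
        }
        catch (PdfOxideException ex)
        {
            // Rust-side failure: Code and NativeMessage carry the native details.
            Console.Error.WriteLine($"PdfOxide error {ex.Code}: {ex.NativeMessage}");
        }
        catch (IOException ex)
        {
            // Covers FileNotFoundException, which derives from IOException.
            Console.Error.WriteLine($"I/O error: {ex.Message}");
        }
        catch (UnauthorizedAccessException ex)
        {
            Console.Error.WriteLine($"Access denied: {ex.Message}");
        }

        text = string.Empty;
        return false;
    }
}
```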
Thread safety
PdfDocument read-only methods are thread-safe; use a single document across threads concurrently.
DocumentEditor is not thread-safe for writes. Use ReaderWriterLockSlim or serialize to one thread.
Pdf creation instances are not intended to be shared across threads.
See the concurrency guide for patterns.
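Given the read-only guarantee for PdfDocument, one way to use a single document from several threads is Parallel.For over the page indices. This is a sketch, not the pattern from the concurrency guide. ParallelExtract and ExtractAllPages are hypothetical names, and zero-based page indices are assumed.

```csharp
using System.Threading.Tasks;
using PdfOxide.Core;

public static class ParallelExtract
{
    // Extracts the text of every page, sharing one PdfDocument across worker threads.
    public static string[] ExtractAllPages(PdfDocument doc)
    {
        var pages = new string[doc.PageCount];

        // Each iteration writes to its own array slot, so no locking is needed
        // on the result; only read-only PdfDocument methods are called.
        Parallel.For(0, pages.Length, i => pages[i] = doc.ExtractText(i));

        return pages;
    }
}
```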
Async pattern
Every I/O-bound or CPU-heavy method has an *Async variant that accepts a CancellationToken. See the async guide.
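A sketch of bounding an extraction with a timeout, using ExtractAllTextAsync from the PdfDocument section. It assumes cancellation surfaces as an OperationCanceledException, the usual .NET convention, which this page does not confirm. AsyncExtract and ExtractWithTimeoutAsync are illustrative names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using PdfOxide.Core;

public static class AsyncExtract
{
    // Extracts all text, requesting cancellation if the work exceeds the timeout.
    public static async Task<string> ExtractWithTimeoutAsync(PdfDocument doc, TimeSpan timeout)
    {
        // The token is cancelled automatically once the timeout elapses.
        using var cts = new CancellationTokenSource(timeout);

        // Assumption: a cancelled call throws OperationCanceledException.
        return await doc.ExtractAllTextAsync(cts.Token);
    }
}
```

Callers that already hold a token, such as a request-abort token in a web handler, can combine it with the timeout through CancellationTokenSource.CreateLinkedTokenSource.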
NativeAOT
Publish with -p:PublishAot=true. No extra configuration — all P/Invoke is source-generated, no reflection, no dynamic code.