R API Reference

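PDF Oxide provides idiomatic R bindings through the pdfoxide package. The package wraps the pdf_oxide C ABI through R's native .Call interface, so it needs only the compiled shared library and has no external runtime dependency such as Java or Python.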

# install from source (requires a C toolchain)
R CMD INSTALL pdfoxide

library(pdfoxide)

For the Rust API, see the Rust API Reference; for Python, the Python API Reference; for JavaScript, the Node.js API Reference or the WASM API Reference.

The R API is a flat, functional API built around opaque handle objects rather than R6/S4 classes. The main handle types are:

| Handle | Created by | Purpose |
| --- | --- | --- |
| pdfoxide_pdf | pdf_from_markdown(), pdf_from_html(), … | A newly created PDF, ready to be saved or converted |
| pdfoxide_document | pdf_open(), pdf_open_from_bytes() | A PDF loaded for read-only extraction and rendering |
| pdfoxide_editor | pdf_editor_open() | A mutable PDF that can be edited, merged, and saved |
| pdfoxide_builder | pdf_builder_create() | A DocumentBuilder for programmatic page construction |
| pdfoxide_page (builder) | pdf_builder_page(), pdf_builder_a4_page(), … | A page that is laid out fluently |
| pdfoxide_page (lazy) | pdf_page() | A lazy read handle for a single page |
| pdfoxide_renderer, pdfoxide_rendered_image | Rendering functions | A reusable renderer and rendered raster output |
| pdfoxide_certificate, pdfoxide_signature, pdfoxide_timestamp, pdfoxide_tsa_client, pdfoxide_dss | Signing/verification functions | Digital-signature primitives |

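All page indices are zero-based. Functions that modify a builder, page, or editor return the handle invisibly, so calls can be chained with the pipe (|>). Handles are closed automatically when they are garbage-collected, but you can also release them immediately with pdf_close() / *_close().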
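
The two conventions above (zero-based page indices and explicit handle release) can be sketched as follows. The helpers `zero_based_pages()` and `page_lengths()` are hypothetical names used only for illustration and are not part of pdfoxide; the sketch also assumes that pdf_extract_text() returns a single character string, as the complete example at the end of this page suggests.

```r
# Hypothetical helper (not part of pdfoxide): zero-based indices for n pages.
zero_based_pages <- function(n) seq_len(n) - 1L

# Hypothetical helper: character count of every page in a file.
# Assumes pdf_extract_text() returns a length-one character vector.
page_lengths <- function(path) {
  doc <- pdfoxide::pdf_open(path)
  # Release the handle when the function exits instead of waiting for GC.
  on.exit(pdfoxide::pdf_close(doc), add = TRUE)
  pages <- zero_based_pages(pdfoxide::pdf_page_count(doc))
  vapply(pages, function(i) nchar(pdfoxide::pdf_extract_text(doc, i)), integer(1))
}
```

Registering the close call with on.exit() means the handle is released even if extraction fails partway through.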


Creating PDFs

These functions create a PDF directly from a source format. Each returns a pdfoxide_pdf handle.

pdf_from_markdown(markdown)                              # build a PDF from a Markdown string
pdf_from_html(html)                                      # build a PDF from an HTML string
pdf_from_text(text)                                      # build a PDF from plain text
pdf_from_image(path)                                     # build a single-page PDF from an image file
pdf_from_image_bytes(bytes)                              # build a single-page PDF from raw image bytes
pdf_from_html_css(html, css, font_bytes = NULL)          # build a PDF from HTML + CSS (optional embedded font)
pdf_from_html_css_with_fonts(html, css, families, font_bytes)  # HTML + CSS with multiple named font families
pdf_merge(paths)                                         # merge several PDF files into one new PDF

Saving and Serializing a Created PDF

pdf_save(pdf, path)            # write the PDF to a file path
pdf_to_bytes(pdf)             # serialize the PDF to a raw vector
pdf_get_page_count(pdf)       # number of pages in a built pdfoxide_pdf
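
As a minimal sketch of serializing to memory instead of calling pdf_save(), the raw vector from pdf_to_bytes() can be written with base R's writeBin(). The helper `write_pdf_bytes()` is a hypothetical name, not part of pdfoxide, and the pdfoxide calls only run when the package is installed.

```r
# Hypothetical helper (not part of pdfoxide): write a raw vector to disk.
write_pdf_bytes <- function(bytes, path) {
  stopifnot(is.raw(bytes))
  writeBin(bytes, path)
  invisible(path)
}

if (requireNamespace("pdfoxide", quietly = TRUE)) {
  pdf <- pdfoxide::pdf_from_markdown("# Hello\n\nBody text.\n")
  out <- file.path(tempdir(), "hello.pdf")
  write_pdf_bytes(pdfoxide::pdf_to_bytes(pdf), out)
  pdfoxide::pdf_close(pdf)
}
```

Keeping the bytes in memory can be convenient when the PDF is handed to another function, for example pdf_open_from_bytes(), rather than stored on disk.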

Opening Documents

Opens an existing PDF for extraction and rendering. Returns a pdfoxide_document.

pdf_open(path)                          # open a PDF file from disk
pdf_open_with_password(path, password)  # open an encrypted PDF with a password
pdf_open_from_bytes(bytes)              # open a PDF from an in-memory raw vector
pdf_close(x)                            # close any pdfoxide handle and free it

Opening from Office Formats

Converts a Word, PowerPoint, or Excel document and opens it directly as a pdfoxide_document.

pdf_open_from_docx_bytes(bytes)   # convert DOCX bytes and open as a document
pdf_open_from_pptx_bytes(bytes)   # convert PPTX bytes and open as a document
pdf_open_from_xlsx_bytes(bytes)   # convert XLSX bytes and open as a document

Document Inspection

pdf_page_count(doc)            # number of pages
pdf_version(doc)               # PDF version as a list(major, minor)
pdf_is_encrypted(doc)          # TRUE if the document is encrypted
pdf_has_structure_tree(doc)    # TRUE if the document is a Tagged PDF
pdf_authenticate(doc, password)  # authenticate an encrypted document after opening
pdf_has_xfa(doc)               # TRUE if the document contains XFA forms
pdf_has_timestamp(doc)         # TRUE if the document carries a document timestamp

Text and Content Extraction

Single-page extraction (page indices are zero-based).

pdf_extract_text(doc, page)              # reading-order plain text for one page
pdf_to_plain_text(doc, page)             # layout-aware plain text for one page
pdf_to_markdown(doc, page)               # Markdown for one page
pdf_to_html(doc, page)                   # HTML for one page
pdf_extract_structured_json(doc, page)   # structured layout JSON for one page

Whole-document extraction.

pdf_to_markdown_all(doc)      # Markdown for the entire document
pdf_to_html_all(doc)          # HTML for the entire document
pdf_to_plain_text_all(doc)    # plain text for the entire document
pdf_extract_all_text(doc)     # concatenated reading-order text for all pages

Structured, element-level extraction. Returns a data frame or a list of records.

pdf_extract_chars(doc, page)        # per-character records (glyph, bbox, font, size, color)
pdf_extract_words(doc, page)        # word records with bounding boxes
pdf_extract_text_lines(doc, page)   # text-line records with bounding boxes
pdf_extract_tables(doc, page)       # detected tables with rows and cells
pdf_extract_paths(doc, page)        # vector path (line/curve/shape) records
pdf_embedded_fonts(doc, page)       # embedded font records used on a page
pdf_embedded_images(doc, page)      # embedded image records on a page
pdf_page_annotations(doc, page)     # annotation records on a page

Auto-detected extraction (heuristically chooses between native and OCR extraction, whichever fits the page).

pdf_extract_text_auto(doc, page)                  # best-effort text for one page
pdf_extract_page_auto(doc, page, options_json = NULL)  # best-effort structured page extraction

Region (Clip Rectangle) Extraction

Restricts extraction to a rectangle given in PDF points (origin at the bottom left).

pdf_extract_text_in_rect(doc, page, x, y, width, height)    # text inside a rectangle
pdf_extract_words_in_rect(doc, page, x, y, width, height)   # words inside a rectangle
pdf_extract_lines_in_rect(doc, page, x, y, width, height)   # lines inside a rectangle
pdf_extract_tables_in_rect(doc, page, x, y, width, height)  # tables inside a rectangle
pdf_extract_images_in_rect(doc, page, x, y, width, height)  # images inside a rectangle
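
Because the origin is at the bottom left, a rectangle measured from the top of the page has to be flipped before it is passed in. The sketch below shows the arithmetic; `top_left_to_pdf_rect()` is a hypothetical helper, not part of pdfoxide, and it assumes that (x, y) in the *_in_rect functions denotes the lower-left corner of the rectangle, which this page does not state explicitly. One inch is 72 PDF points.

```r
# Hypothetical helper (not part of pdfoxide): convert a rectangle measured
# from the page's top-left corner (in points) to bottom-left-origin coordinates.
# Assumption: (x, y) is the rectangle's lower-left corner.
top_left_to_pdf_rect <- function(left, top, width, height, page_height) {
  c(x = left, y = page_height - top - height, width = width, height = height)
}

# A 200 x 100 pt box, one inch (72 pt) from the top and left of a
# US Letter page (612 x 792 pt): y = 792 - 72 - 100 = 620.
r <- top_left_to_pdf_rect(left = 72, top = 72, width = 200, height = 100,
                          page_height = 792)

if (requireNamespace("pdfoxide", quietly = TRUE) && file.exists("report.pdf")) {
  doc <- pdfoxide::pdf_open("report.pdf")
  h <- pdfoxide::pdf_page_get_height(doc, 0)
  rp <- top_left_to_pdf_rect(72, 72, 200, 100, page_height = h)
  txt <- pdfoxide::pdf_extract_text_in_rect(doc, 0, rp[["x"]], rp[["y"]],
                                            rp[["width"]], rp[["height"]])
  pdfoxide::pdf_close(doc)
}
```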

Lazy Page Handles

pdf_page() returns a lightweight pdfoxide_page bound to a single page; its text getters perform the extraction at call time.

pdf_page(doc, index)        # lazy handle for one page
pdf_page_text(page)         # plain text of the page
pdf_page_markdown(page)     # Markdown of the page
pdf_page_html(page)         # HTML of the page
pdf_page_plain_text(page)   # layout-aware plain text of the page

Page Geometry and Raw Elements

pdf_page_get_width(doc, page)      # page width in PDF points
pdf_page_get_height(doc, page)     # page height in PDF points
pdf_page_get_rotation(doc, page)   # page rotation in degrees (0/90/180/270)
pdf_page_get_elements(doc, page)   # raw element records for the page

Search

pdf_search(doc, page, term, case_sensitive = FALSE)        # search one page
pdf_search_all(doc, term, case_sensitive = FALSE)          # search the whole document
pdf_search_results_to_json(doc, page, term, case_sensitive = FALSE)  # page search results as JSON

Page Classification and Cleanup

Detects and removes repeating headers, footers, and artifacts.

pdf_classify_page(doc, page)              # classify the layout/content of one page
pdf_classify_document(doc)                # classify the whole document
pdf_remove_headers(doc, threshold = 0.5)  # detect and remove repeating headers
pdf_remove_footers(doc, threshold = 0.5)  # detect and remove repeating footers
pdf_remove_artifacts(doc, threshold = 0.5)  # detect and remove page artifacts
pdf_erase_header(doc, page)               # erase the header region on a page
pdf_erase_footer(doc, page)               # erase the footer region on a page
pdf_erase_artifacts(doc, page)            # erase artifact regions on a page

Office Conversion (Export)

Converts a loaded PDF back to an Office format. Returns a raw vector.

pdf_to_docx(doc)   # convert the document to DOCX bytes
pdf_to_pptx(doc)   # convert the document to PPTX bytes
pdf_to_xlsx(doc)   # convert the document to XLSX bytes

Forms

pdf_get_form_fields(doc)                          # list of form-field records
pdf_export_form_data_to_bytes(doc, format_type = 0L)  # export form data (0 = FDF, 1 = XFDF) to bytes
pdf_import_form_data(doc, data_path)              # import form data from a file path
pdf_form_import_from_file(doc, filename)          # import form data from a named file

Editor-side form helpers are listed under PDF Editing.


Document Structure and Metadata

pdf_get_outline(doc)        # document outline / bookmarks tree
pdf_get_page_labels(doc)    # page-label ranges
pdf_get_xmp_metadata(doc)   # XMP metadata as a list
pdf_get_source_bytes(doc)   # the original source bytes of the document
pdf_plan_split_by_bookmarks(doc, options_json = NULL)  # plan a split of the document by top-level bookmarks

Annotation Details

Looks up an individual annotation by page and index.

pdf_annotation_get_color(doc, page, index)              # annotation RGB color
pdf_annotation_get_creation_date(doc, page, index)      # creation date string
pdf_annotation_get_modification_date(doc, page, index)  # modification date string
pdf_annotation_is_hidden(doc, page, index)              # TRUE if the annotation is hidden
pdf_annotation_is_marked_deleted(doc, page, index)      # TRUE if marked deleted
pdf_annotation_is_printable(doc, page, index)           # TRUE if the annotation prints
pdf_annotation_is_read_only(doc, page, index)           # TRUE if read-only
pdf_link_annotation_get_uri(doc, page, index)           # URI of a link annotation
pdf_text_annotation_get_icon_name(doc, page, index)     # icon name of a text annotation
pdf_highlight_annotation_quad_points_count(doc, page, index)        # number of highlight quad points
pdf_highlight_annotation_quad_point(doc, page, index, quad_index)   # one highlight quad point
pdf_annotations_to_json(doc, page)                      # all annotations on a page as JSON

Font and Element JSON Helpers

pdf_font_get_size(doc, page, index)   # size of a font record on a page
pdf_fonts_to_json(doc, page)          # page fonts as JSON
pdf_elements_to_json(doc, page)       # page elements as JSON

Rendering

Renders a page to a raster image. format: 0 = PNG, 1 = JPEG. Coordinates and DPI are documented per function.

pdf_render_page(doc, page, format = 0L)                 # render a page at default DPI
pdf_render_page_zoom(doc, page, zoom, format = 0L)      # render a page at a zoom factor
pdf_render_page_thumbnail(doc, page, size, format = 0L) # render a fitted thumbnail
pdf_render_page_fit(doc, page, w, h, format = 0L)       # render fitted into w x h pixels
pdf_render_page_raw(doc, page, dpi = 150L)              # render to a raw RGBA buffer
pdf_render_page_region(doc, page, crop_x, crop_y, crop_width, crop_height, format = 0L)  # render a sub-region
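
The integer format codes are easy to mix up, so a small named lookup can help. The sketch below is illustrative only: `RENDER_FORMAT` and `render_page_to_file()` are hypothetical names, and the code assumes that pdf_render_page() returns a pdfoxide_rendered_image handle accepted by pdf_rendered_image_save(), which the handle table implies but does not spell out.

```r
# Hypothetical lookup (not part of pdfoxide) for the documented format codes.
RENDER_FORMAT <- c(png = 0L, jpeg = 1L)

# Hypothetical wrapper: render one page and write it to a file.
# Assumes pdf_render_page() returns a pdfoxide_rendered_image handle.
render_page_to_file <- function(doc, page, path, format = c("png", "jpeg")) {
  format <- match.arg(format)
  img <- pdfoxide::pdf_render_page(doc, page, format = RENDER_FORMAT[[format]])
  on.exit(pdfoxide::pdf_rendered_image_close(img), add = TRUE)
  pdfoxide::pdf_rendered_image_save(img, path)
  invisible(path)
}
```

A call such as `render_page_to_file(doc, 0, "page1.png")` would then render the first page as PNG.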

The full set of RenderOptions (background RGBA, transparency, annotation visibility, JPEG quality, layer exclusion).

pdf_render_page_with_options(doc, page, dpi = 150L, format = 0L,
                             bg_r = 1, bg_g = 1, bg_b = 1, bg_a = 1,
                             transparent_background = FALSE,
                             render_annotations = TRUE, jpeg_quality = 85L)

pdf_render_page_with_options_ex(doc, page, dpi = 150L, format = 0L,
                                bg_r = 1, bg_g = 1, bg_b = 1, bg_a = 1,
                                transparent_background = FALSE,
                                render_annotations = TRUE, jpeg_quality = 85L,
                                excluded_layers = NULL)

Reusable renderer and render-time estimation.

pdf_create_renderer(dpi = 150L, format = 0L, quality = 85L, anti_alias = TRUE)  # build a reusable renderer
pdf_renderer_close(renderer)            # free a renderer
pdf_estimate_render_time(doc, page)     # estimate render time for a page

Helpers for rendered-image handles.

pdf_rendered_image_save(image, path)    # write a rendered image to a file
pdf_rendered_image_close(image)         # free a rendered image

PDF Editing

Opens a PDF for modification. Returns a pdfoxide_editor.

pdf_editor_open(path)               # open a PDF for editing
pdf_editor_open_from_bytes(bytes)   # open an editor from a raw vector
pdf_editor_close(editor)            # close the editor and free it

Editor Inspection and Metadata

pdf_editor_page_count(editor)               # page count
pdf_editor_version(editor)                  # PDF version as list(major, minor)
pdf_editor_is_modified(editor)              # TRUE if the editor has unsaved changes
pdf_editor_source_path(editor)              # original source path, if any
pdf_editor_get_producer(editor)             # Producer metadata string
pdf_editor_set_producer(editor, value)      # set the Producer metadata
pdf_editor_get_creation_date(editor)        # CreationDate string
pdf_editor_set_creation_date(editor, value) # set the CreationDate

Page Operations

pdf_editor_delete_page(editor, page)               # delete a page
pdf_editor_move_page(editor, from, to)             # move a page to a new index
pdf_editor_rotate_page_by(editor, page, degrees)   # rotate a page by a relative angle
pdf_editor_rotate_all_pages(editor, degrees)       # rotate every page
pdf_editor_get_page_rotation(editor, page)         # current page rotation
pdf_editor_set_page_rotation(editor, page, degrees)  # set absolute page rotation
pdf_editor_crop_margins(editor, left, right, top, bottom)  # crop margins on all pages
pdf_editor_get_page_crop_box(editor, page)         # get CropBox as c(x, y, w, h)
pdf_editor_set_page_crop_box(editor, page, x, y, w, h)  # set CropBox
pdf_editor_get_page_media_box(editor, page)        # get MediaBox as c(x, y, w, h)
pdf_editor_set_page_media_box(editor, page, x, y, w, h) # set MediaBox
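
The box getters return c(x, y, w, h), so a CropBox can be derived from the MediaBox with plain vector arithmetic. The following is a minimal sketch: `inset_box()` and `trim_page()` are hypothetical helpers, not part of pdfoxide, and the code assumes the getter result can be coerced with as.numeric().

```r
# Hypothetical helper (not part of pdfoxide): shrink a c(x, y, w, h) box
# by the same margin on all four sides.
inset_box <- function(box, margin) {
  stopifnot(length(box) == 4, margin >= 0, 2 * margin < min(box[3:4]))
  c(box[1] + margin, box[2] + margin, box[3] - 2 * margin, box[4] - 2 * margin)
}

# Hypothetical wrapper: set the CropBox to the MediaBox minus a margin.
# Assumes the getter returns something as.numeric() can turn into c(x, y, w, h).
trim_page <- function(editor, page, margin) {
  box <- as.numeric(pdfoxide::pdf_editor_get_page_media_box(editor, page))
  cb <- inset_box(box, margin)
  pdfoxide::pdf_editor_set_page_crop_box(editor, page, cb[1], cb[2], cb[3], cb[4])
}

# US Letter with a half-inch (36 pt) margin: 36, 36, 540, 720
inset_box(c(0, 0, 612, 792), 36)
```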

Redaction (Editor)

pdf_editor_apply_all_redactions(editor)                  # apply all pending redactions
pdf_editor_apply_page_redactions(editor, page)           # apply redactions on one page
pdf_editor_is_page_marked_for_redaction(editor, page)    # TRUE if page has pending redactions
pdf_editor_unmark_page_for_redaction(editor, page)       # clear pending redactions on a page
pdf_editor_erase_region(editor, page, x, y, w, h)        # erase a rectangle on a page
pdf_editor_erase_regions(editor, page, rects)            # erase several rectangles on a page
pdf_editor_clear_erase_regions(editor, page)             # clear pending erase regions

Standalone Redaction Workflow

pdf_redaction_add(editor, page, x1, y1, x2, y2, r = 0, g = 0, b = 0)  # add a redaction box with a fill color
pdf_redaction_count(editor, page)                                    # pending redaction count on a page
pdf_redaction_apply(editor, scrub_metadata = FALSE, r = 0, g = 0, b = 0)  # burn in all redactions
pdf_redaction_scrub_metadata(editor)                                 # scrub metadata for redaction hygiene
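
Note that pdf_redaction_add() takes corner coordinates (x1, y1, x2, y2), whereas most other functions on this page take x, y, width, and height. A minimal sketch of the conversion and of one possible add/apply/save sequence follows. `rect_to_corners()` and `redact_rect()` are hypothetical helpers, not part of pdfoxide, and the sketch assumes (x1, y1) and (x2, y2) are opposite corners of the box.

```r
# Hypothetical helper (not part of pdfoxide): x/y/width/height -> corner form.
# Assumption: (x1, y1) and (x2, y2) are opposite corners of the redaction box.
rect_to_corners <- function(x, y, w, h) {
  c(x1 = x, y1 = y, x2 = x + w, y2 = y + h)
}

# Hypothetical wrapper: black out one rectangle on a page and save a copy.
redact_rect <- function(in_path, out_path, page, x, y, w, h) {
  k <- rect_to_corners(x, y, w, h)
  ed <- pdfoxide::pdf_editor_open(in_path)
  on.exit(pdfoxide::pdf_editor_close(ed), add = TRUE)
  pdfoxide::pdf_redaction_add(ed, page, k[["x1"]], k[["y1"]], k[["x2"]], k[["y2"]])
  pdfoxide::pdf_redaction_apply(ed, scrub_metadata = TRUE)
  pdfoxide::pdf_editor_save(ed, out_path)
  invisible(out_path)
}
```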

Forms and Annotations (Editor)

pdf_editor_flatten_forms(editor)                       # flatten all form fields into content
pdf_editor_flatten_forms_on_page(editor, page)         # flatten forms on one page
pdf_editor_set_form_field_value(editor, name, value)   # set a form-field value by name
pdf_editor_flatten_annotations(editor, page)           # flatten annotations on a page
pdf_editor_flatten_all_annotations(editor)             # flatten all annotations
pdf_editor_flatten_warnings_count(editor)              # number of flatten warnings
pdf_editor_flatten_warning(editor, index)              # one flatten warning message
pdf_editor_is_page_marked_for_flatten(editor, page)    # TRUE if page is marked for flatten
pdf_editor_unmark_page_for_flatten(editor, page)       # clear flatten mark on a page
pdf_editor_import_fdf_bytes(editor, bytes)             # import FDF form data
pdf_editor_import_xfdf_bytes(editor, bytes)            # import XFDF form data

Document Operations (Editor)

pdf_editor_merge_from(editor, source_path)             # append pages from another PDF file
pdf_editor_merge_from_bytes(editor, bytes)             # append pages from PDF bytes
pdf_editor_convert_to_pdf_a(editor, level)             # convert in place to PDF/A
pdf_editor_embed_file(editor, name, bytes)             # attach an embedded file
pdf_editor_extract_pages_to_bytes(editor, pages)       # extract selected pages to a new PDF (bytes)

Saving (Editor)

pdf_editor_save(editor, path)                          # save to a file
pdf_editor_save_to_bytes(editor)                       # save to a raw vector
pdf_editor_save_to_bytes_with_options(editor, compress = TRUE,
                                      garbage_collect = TRUE, linearize = FALSE)  # save with options
pdf_editor_save_encrypted(editor, path, user_password, owner_password)            # save AES-encrypted to a file
pdf_editor_save_encrypted_to_bytes(editor, user_password, owner_password)         # save AES-encrypted to bytes

DocumentBuilder (Programmatic Creation)

Builds a PDF page by page. pdf_builder_create() returns a pdfoxide_builder, and the page constructors return a fluent pdfoxide_page.

pdf_builder_create()         # start a new DocumentBuilder
pdf_builder_close(builder)   # free a builder

Builder Document Metadata

pdf_builder_set_title(builder, value)      # set document title
pdf_builder_set_author(builder, value)     # set document author
pdf_builder_set_subject(builder, value)    # set document subject
pdf_builder_set_keywords(builder, value)   # set document keywords
pdf_builder_set_creator(builder, value)    # set document creator
pdf_builder_on_open(builder, script)       # set a document-open JavaScript action
pdf_builder_language(builder, lang)        # set the document language (e.g. "en-US")
pdf_builder_tagged_pdf_ua1(builder)        # enable Tagged PDF / PDF-UA-1 output
pdf_builder_role_map(builder, custom, standard)        # map a custom structure tag to a standard role
pdf_builder_register_embedded_font(builder, name, font)  # register an embedded font for use on pages

Builder Pages and Output

pdf_builder_page(builder, width, height)   # start a custom-size page
pdf_builder_a4_page(builder)               # start an A4 page
pdf_builder_letter_page(builder)           # start a US Letter page
pdf_builder_build(builder)                 # finish and return the PDF as bytes
pdf_builder_save(builder, path)            # finish and write to a file
pdf_builder_save_encrypted(builder, path, user_password, owner_password)     # finish and write AES-encrypted
pdf_builder_to_bytes_encrypted(builder, user_password, owner_password)       # finish and return encrypted bytes

Embedded Fonts

pdf_embedded_font_from_file(path)                 # load an embedded font from a TTF/OTF file
pdf_embedded_font_from_bytes(bytes, name = NULL)  # load an embedded font from bytes
pdf_embedded_font_close(font)                     # free an embedded font handle

Page Builder (Fluent Layout)

All of the functions below operate on the pdfoxide_page returned by pdf_builder_page() and return the page invisibly so that calls can be chained. Finish a page with pdf_page_done().

Text Flow and Typography

pdf_page_font(page, name, size)        # set the active font and size
pdf_page_at(page, x, y)                # move the text cursor to a coordinate
pdf_page_builder_text(page, text)      # draw text at the cursor
pdf_page_heading(page, level, text)    # add a heading (level 1-6)
pdf_page_paragraph(page, text)         # add a wrapped paragraph
pdf_page_space(page, points)           # add vertical space
pdf_page_horizontal_rule(page)         # draw a horizontal rule
pdf_page_newline(page)                 # advance to the next line
pdf_page_footnote(page, ref_mark, note_text)        # add a footnote with a reference mark
pdf_page_columns(page, column_count, gap_pt, text)  # flow text into multiple columns
pdf_page_text_in_rect(page, x, y, w, h, text, align = 0L)  # flow text inside a rectangle
pdf_page_new_page_same_size(page)      # start a new page of the same size
pdf_page_done(page)                    # finish the page and return to the builder
pdf_page_close(page)                   # free a page handle
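
Since the page functions return the page invisibly, they can be chained with the native pipe, which requires R 4.1 or later. The sketch below wraps one such chain in a hypothetical function, `build_invoice()`, which is not part of pdfoxide; the font name and text are taken from or modeled on the complete example at the end of this page.

```r
# Hypothetical wrapper (not part of pdfoxide): build a one-page PDF with a
# piped chain of page-builder calls. Requires R >= 4.1 for the |> operator.
build_invoice <- function(path) {
  b <- pdfoxide::pdf_builder_create()
  on.exit(pdfoxide::pdf_builder_close(b), add = TRUE)

  pdfoxide::pdf_builder_letter_page(b) |>
    pdfoxide::pdf_page_font("Helvetica", 24) |>
    pdfoxide::pdf_page_at(72, 720) |>
    pdfoxide::pdf_page_builder_text("Invoice #1001") |>
    pdfoxide::pdf_page_done()

  pdfoxide::pdf_builder_save(b, path)
  invisible(path)
}
```

The chain is equivalent to calling each function in turn on the same page variable; the pipe only removes the repeated first argument.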

Inline Styled Runs

pdf_page_inline(page, text)               # append an inline text run
pdf_page_inline_bold(page, text)          # append a bold inline run
pdf_page_inline_italic(page, text)        # append an italic inline run
pdf_page_inline_color(page, r, g, b, text)  # append a colored inline run

Links and JavaScript Actions

pdf_page_link_url(page, url)              # add a URL link
pdf_page_link_page(page, index)           # add an internal page link
pdf_page_link_named(page, destination)    # add a named-destination link
pdf_page_link_javascript(page, script)    # add a JavaScript-action link
pdf_page_on_open(page, script)            # page-open JavaScript action
pdf_page_on_close(page, script)           # page-close JavaScript action
pdf_page_field_keystroke(page, script)    # field keystroke JavaScript action
pdf_page_field_format(page, script)       # field format JavaScript action
pdf_page_field_validate(page, script)     # field validate JavaScript action
pdf_page_field_calculate(page, script)    # field calculate JavaScript action

Annotations and Markup

pdf_page_highlight(page, r, g, b)         # highlight markup at the current run
pdf_page_underline(page, r, g, b)         # underline markup
pdf_page_strikeout(page, r, g, b)         # strikeout markup
pdf_page_squiggly(page, r, g, b)          # squiggly underline markup
pdf_page_sticky_note(page, text)          # sticky note at the cursor
pdf_page_sticky_note_at(page, x, y, text) # sticky note at a coordinate
pdf_page_watermark(page, text)            # add a text watermark
pdf_page_watermark_confidential(page)     # add a CONFIDENTIAL watermark
pdf_page_watermark_draft(page)            # add a DRAFT watermark
pdf_page_stamp(page, type_name)           # add a rubber stamp (e.g. "Approved")
pdf_page_freetext(page, x, y, w, h, text) # add a free-text annotation

AcroForm Widgets

pdf_page_text_field(page, name, x, y, w, h, default_value = NULL)        # text field
pdf_page_checkbox(page, name, x, y, w, h, checked = FALSE)               # checkbox
pdf_page_combo_box(page, name, x, y, w, h, options, selected = NULL)     # combo box
pdf_page_radio_group(page, name, values, xs, ys, ws, hs, selected = NULL)  # radio-button group
pdf_page_push_button(page, name, x, y, w, h, caption)                    # push button
pdf_page_signature_field(page, name, x, y, w, h)                         # signature field

Barcodes (Page Builder)

pdf_page_barcode_1d(page, barcode_type, data, x, y, w, h)  # draw a 1D barcode
pdf_page_barcode_qr(page, data, x, y, size)                # draw a QR code

Images

pdf_page_image(page, bytes, x, y, w, h)                  # place an image
pdf_page_image_with_alt(page, bytes, x, y, w, h, alt_text)  # place an image with alt text
pdf_page_image_artifact(page, bytes, x, y, w, h)         # place an image tagged as an artifact

Vector Graphics

pdf_page_rect(page, x, y, w, h)                          # draw a rectangle outline
pdf_page_filled_rect(page, x, y, w, h, r, g, b)          # draw a filled rectangle
pdf_page_line(page, x1, y1, x2, y2)                      # draw a line
pdf_page_stroke_rect(page, x, y, w, h, width, r, g, b)   # stroke a rectangle with width and color
pdf_page_stroke_line(page, x1, y1, x2, y2, width, r, g, b)  # stroke a line with width and color
pdf_page_stroke_rect_dashed(page, x, y, w, h, width, r, g, b, dash = numeric(0), phase = 0)    # dashed rectangle
pdf_page_stroke_line_dashed(page, x1, y1, x2, y2, width, r, g, b, dash = numeric(0), phase = 0)  # dashed line

Tables

pdf_page_table(page, widths, aligns, cells, has_header = FALSE,
               n_columns = length(widths), n_rows = NULL)  # render a static table

Streaming tables for large or incrementally produced data.

pdf_page_streaming_table_begin(page, headers, widths, aligns,
                               repeat_header = FALSE, n_columns = length(headers))  # begin a streaming table
pdf_page_streaming_table_begin_v2(page, headers, widths, aligns,
                                  repeat_header = FALSE, mode = 0L, sample_rows = 0L,
                                  min_col_width_pt = 0, max_col_width_pt = 0,
                                  max_rowspan = 0L, n_columns = length(headers))  # streaming table with autosize/rowspan
pdf_page_streaming_table_set_batch_size(page, batch_size)      # set the flush batch size
pdf_page_streaming_table_pending_row_count(page)               # rows buffered but not yet flushed
pdf_page_streaming_table_batch_count(page)                     # number of flushed batches
pdf_page_streaming_table_flush(page)                           # flush buffered rows
pdf_page_streaming_table_push_row(page, cells)                 # push one row of cells
pdf_page_streaming_table_push_row_v2(page, cells, rowspans = NULL)  # push a row with per-cell rowspans
pdf_page_streaming_table_finish(page)                          # finish and lay out the streaming table
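
A typical use of the streaming functions is to push a data frame row by row. The following is a sketch under stated assumptions: `row_cells()` and `stream_data_frame()` are hypothetical helpers, not part of pdfoxide; `cells` is assumed to be a character vector with one entry per column; and the caller supplies `widths` and `aligns` because their exact encoding is not described on this page.

```r
# Hypothetical helper (not part of pdfoxide): one data-frame row as a
# plain character vector, one entry per column.
row_cells <- function(df, i) {
  unname(vapply(df[i, , drop = FALSE], as.character, character(1)))
}

# Hypothetical wrapper: stream every row of df into a table on a builder page.
# Assumes push_row() accepts a character vector of cells.
stream_data_frame <- function(page, df, widths, aligns, batch_size = 100L) {
  pdfoxide::pdf_page_streaming_table_begin(page, names(df), widths, aligns,
                                           repeat_header = TRUE)
  pdfoxide::pdf_page_streaming_table_set_batch_size(page, batch_size)
  for (i in seq_len(nrow(df))) {
    pdfoxide::pdf_page_streaming_table_push_row(page, row_cells(df, i))
  }
  pdfoxide::pdf_page_streaming_table_finish(page)
  invisible(page)
}
```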

Digital Signatures

Certificates

pdf_certificate_load_from_bytes(bytes, password = NULL)  # load a PKCS#12 / DER certificate from bytes
pdf_certificate_load_from_pem(cert_pem, key_pem)         # load a certificate + key from PEM
pdf_certificate_subject(cert)    # certificate subject DN
pdf_certificate_issuer(cert)     # certificate issuer DN
pdf_certificate_serial(cert)     # certificate serial number
pdf_certificate_validity(cert)   # validity window
pdf_certificate_is_valid(cert)   # TRUE if currently within the validity window
pdf_certificate_close(cert)      # free a certificate handle

Signing

pdf_sign_bytes(pdf, cert, reason = NULL, location = NULL)            # sign PDF bytes (basic CMS signature)
pdf_sign_bytes_pades(pdf, cert, level = 0L, tsa_url = NULL, ...)     # sign PDF bytes with a PAdES profile
pdf_sign_bytes_pades_opts(pdf, cert, level = 0L, tsa_url = NULL, ...)  # PAdES signing with extended options
pdf_sign(doc, certificate, reason = NULL, location = NULL)          # sign a loaded document
pdf_add_timestamp(pdf_data, sig_index, tsa_url)                     # add a TSA timestamp to a signature in bytes

Signature Inspection and Verification

pdf_signature_count(doc)                  # number of signatures
pdf_get_signature(doc, index)             # signature handle by index
pdf_signature_signer_name(sig)            # signer common name
pdf_signature_signing_reason(sig)         # signing reason
pdf_signature_signing_location(sig)       # signing location
pdf_signature_signing_time(sig)           # signing time
pdf_signature_certificate(sig)            # signer certificate handle
pdf_signature_pades_level(sig)            # PAdES level of the signature
pdf_signature_has_timestamp(sig)          # TRUE if the signature is timestamped
pdf_signature_timestamp(sig)              # embedded timestamp handle
pdf_signature_add_timestamp(sig, timestamp)  # attach a timestamp to a signature
pdf_signature_verify(sig)                 # verify the signature, returns a status
pdf_signature_verify_detached(sig, pdf)   # verify with a detached message digest check
pdf_signature_close(sig)                  # free a signature handle
pdf_verify_all_signatures(doc)            # verify every signature in the document

Timestamps

pdf_timestamp_parse(bytes)               # parse a timestamp token (TST)
pdf_timestamp_token(timestamp)           # raw timestamp token bytes
pdf_timestamp_message_imprint(timestamp) # message imprint of the timestamp
pdf_timestamp_time(timestamp)            # timestamp time
pdf_timestamp_serial(timestamp)          # timestamp serial number
pdf_timestamp_tsa_name(timestamp)        # TSA name
pdf_timestamp_policy_oid(timestamp)      # timestamp policy OID
pdf_timestamp_hash_algorithm(timestamp)  # hash algorithm used
pdf_timestamp_verify(timestamp)          # verify the timestamp token
pdf_timestamp_close(timestamp)           # free a timestamp handle

TSA Client

pdf_tsa_client_create(url, username = NULL, password = NULL, timeout = 30L,
                      hash_algo = 0L, use_nonce = TRUE, cert_req = TRUE)  # create a TSA client
pdf_tsa_request_timestamp(client, data)              # request a timestamp over data
pdf_tsa_request_timestamp_hash(client, hash, hash_algo = 0L)  # request a timestamp over a precomputed hash
pdf_tsa_client_close(client)                         # free a TSA client

Document Security Store (DSS)

pdf_get_dss(doc)              # get the document's DSS handle
pdf_dss_cert_count(dss)       # number of certificates in the DSS
pdf_dss_crl_count(dss)        # number of CRLs
pdf_dss_ocsp_count(dss)       # number of OCSP responses
pdf_dss_vri_count(dss)        # number of VRI entries
pdf_dss_get_cert(dss, index)  # one DSS certificate
pdf_dss_get_crl(dss, index)   # one DSS CRL
pdf_dss_get_ocsp(dss, index)  # one DSS OCSP response
pdf_dss_close(dss)            # free a DSS handle

Standards Compliance Validation

PDF/A

pdf_validate_pdf_a(doc, level = 0L)   # validate against a PDF/A level, returns a results handle
pdf_a_is_compliant(results)           # TRUE if compliant
pdf_a_errors(results)                 # list of validation errors
pdf_a_warning_count(results)          # number of warnings
pdf_a_results_close(results)          # free the results handle
pdf_convert_to_pdf_a(doc, level = 2L) # convert a document to PDF/A bytes
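
The validation functions follow a validate, query, close pattern around a results handle. A minimal sketch of that pattern is shown below; `check_pdf_a()` is a hypothetical wrapper, not part of pdfoxide, and the meaning of each `level` value is not described on this page, so the documented default is passed through unchanged.

```r
# Hypothetical wrapper (not part of pdfoxide): validate a file against PDF/A
# and return a plain list, releasing both handles on exit.
check_pdf_a <- function(path, level = 0L) {
  doc <- pdfoxide::pdf_open(path)
  on.exit(pdfoxide::pdf_close(doc), add = TRUE)

  res <- pdfoxide::pdf_validate_pdf_a(doc, level = level)
  # after = FALSE: close the results handle before the document handle.
  on.exit(pdfoxide::pdf_a_results_close(res), add = TRUE, after = FALSE)

  list(
    compliant = pdfoxide::pdf_a_is_compliant(res),
    errors    = pdfoxide::pdf_a_errors(res),
    warnings  = pdfoxide::pdf_a_warning_count(res)
  )
}
```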

PDF/UA (Accessibility)

pdf_validate_pdf_ua(doc, level = 0L)  # validate against PDF/UA, returns a results handle
pdf_ua_is_accessible(results)         # TRUE if accessible
pdf_ua_errors(results)                # list of accessibility errors
pdf_ua_warnings(results)              # list of accessibility warnings
pdf_ua_stats(results)                 # accessibility statistics
pdf_ua_results_close(results)         # free the results handle

PDF/X (Print)

pdf_validate_pdf_x(doc, level = 0L)   # validate against PDF/X, returns a results handle
pdf_x_is_compliant(results)           # TRUE if compliant
pdf_x_errors(results)                 # list of validation errors
pdf_x_results_close(results)          # free the results handle

Barcodes

Standalone barcode generation and decoding.

pdf_generate_qr_code(data, error_correction = 1L, size_px = 256L)  # generate a QR code, returns a barcode handle
pdf_generate_barcode(data, format = 0L, size_px = 256L)            # generate a barcode in a given format
pdf_barcode_get_data(barcode)             # decoded data string
pdf_barcode_get_format(barcode)           # barcode format
pdf_barcode_get_confidence(barcode)       # decode confidence
pdf_barcode_get_image_png(barcode, size_px = 256L)  # rendered PNG bytes
pdf_barcode_get_svg(barcode, size_px = 256L)        # rendered SVG string
pdf_barcode_close(barcode)                # free a barcode handle
pdf_editor_add_barcode_to_page(editor, page, barcode, x, y, width, ...)  # stamp a barcode onto an editor page

OCR

Requires the underlying build to include the ocr feature.

pdf_ocr_engine_create(det_model_path, rec_model_path, dict_path)  # build an OCR engine from model paths
pdf_ocr_engine_close(engine)             # free an OCR engine
pdf_ocr_page_needs_ocr(doc, page)        # TRUE if a page has no extractable text layer
pdf_ocr_extract_text(doc, page, engine = NULL)  # OCR a page (uses the default engine when NULL)
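
One way to combine these functions is to run OCR only on pages that have no extractable text layer. The sketch below shows that fallback; `page_text_with_ocr()` is a hypothetical helper, not part of pdfoxide, and it only works on a build that includes the ocr feature.

```r
# Hypothetical helper (not part of pdfoxide): use the text layer when there
# is one, otherwise fall back to OCR (default engine when engine is NULL).
page_text_with_ocr <- function(doc, page, engine = NULL) {
  if (pdfoxide::pdf_ocr_page_needs_ocr(doc, page)) {
    pdfoxide::pdf_ocr_extract_text(doc, page, engine = engine)
  } else {
    pdfoxide::pdf_extract_text(doc, page)
  }
}
```

pdf_extract_text_auto() appears to offer a similar choice in a single call; the explicit version here just makes the decision visible.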

OCR Models and Runtime Settings

pdf_model_manifest()                       # available OCR model manifest
pdf_prefetch_available()                   # TRUE if model prefetching is available
pdf_prefetch_models(languages_csv = NULL)  # prefetch OCR models for given languages
pdf_set_max_ops_per_stream(limit)          # cap content-stream operations (DoS guard)
pdf_set_preserve_unmapped_glyphs(preserve) # keep glyphs with no Unicode mapping

Crypto Providers / FIPS

pdf_crypto_active_provider()   # name of the active crypto provider
pdf_crypto_fips_available()    # TRUE if a FIPS provider is available
pdf_crypto_use_fips()          # switch to the FIPS provider
pdf_crypto_set_policy(spec)    # set the crypto policy from a spec string
pdf_crypto_policy()            # current crypto policy
pdf_crypto_inventory()         # cryptographic algorithm inventory
pdf_crypto_cbom()              # Cryptographic Bill of Materials (CBOM)

Logging

pdf_set_log_level(level)   # set the library log level
pdf_get_log_level()        # get the current log level

Complete Example

library(pdfoxide)

# --- Create ---
pdf <- pdf_from_markdown("# Report\n\nGenerated by **PDF Oxide**.\n")
pdf_save(pdf, "report.pdf")
pdf_close(pdf)

# --- Extract ---
doc <- pdf_open("report.pdf")
cat("Pages:", pdf_page_count(doc), "\n")

for (i in seq_len(pdf_page_count(doc)) - 1L) {     # 0-based indices
  txt <- pdf_extract_text(doc, i)
  cat(sprintf("Page %d: %d characters\n", i + 1L, nchar(txt)))
}

chars <- pdf_extract_chars(doc, 0)                 # per-character data frame
results <- pdf_search_all(doc, "PDF Oxide", case_sensitive = FALSE)
pdf_close(doc)

# --- Edit ---
ed <- pdf_editor_open("report.pdf")
pdf_editor_set_producer(ed, "PDF Oxide")
pdf_editor_rotate_all_pages(ed, 90L)
pdf_editor_save(ed, "rotated.pdf")
pdf_editor_close(ed)

# --- Build programmatically ---
b <- pdf_builder_create()
pdf_builder_set_title(b, "Invoice")
page <- pdf_builder_letter_page(b)
pdf_page_font(page, "Helvetica", 24)
pdf_page_at(page, 72, 720)
pdf_page_builder_text(page, "Invoice #1001")
pdf_page_done(page)
pdf_builder_save(b, "invoice.pdf")
pdf_builder_close(b)

Other Language Bindings

PDF Oxide provides native bindings for every major ecosystem: Rust, Python, Node.js, WASM, C#, Golang, Java, PHP, Ruby, C++, Swift, Kotlin, Dart, Julia, Zig, Scala, Clojure, Objective-C, Elixir

Next Steps