Embedded Signing with the Foxit eSign API: From Envelope Creation to In-App iFrame in One Session

Diagram of embedded signing flow using the Foxit eSign API, from token request to iFrame rendering

This guide walks through the Foxit eSign API end to end: authenticate, create a folder, generate an embedded session URL, and render the signing experience in an iFrame your users never leave.

Most embedded signing tutorials hand you a three-step abstraction: get a token, create an envelope, open a recipient view. The implementation details live somewhere else, usually mapped to a different API’s object model that doesn’t quite match what you’re working with.

This tutorial covers the Foxit eSign API embedded signing mechanics from start to finish. Authenticate once, create a “folder” (Foxit’s term for what other platforms call an envelope), receive an embedded session URL in that same response, and render it in an iFrame your users never leave. No separate field-placement API call, no client-side SDK to install.

What You’ll Build

By the end of this guide, you’ll have a working embedded signing session: a PDF loaded into your app via iFrame, signature fields defined by Text Tags, a webhook handler that verifies completion events, and a post-signing redirect that keeps users inside your product. Every step runs against the Foxit eSign sandbox with credentials you can generate in under five minutes.

1. Prerequisites and Auth Setup

Before any code runs, make sure you have Python 3 and cURL installed, along with API credentials from the Foxit eSign portal. Then set up your workspace:

mkdir foxit-esign-tutorial && cd foxit-esign-tutorial
python3 -m venv .venv && source .venv/bin/activate
pip install requests

The Foxit eSign API Is a Separate Portal from PDF Services

Developers who already use Foxit tools hit this wall first: the Foxit eSign API runs at its own base host, with its own API Key and API Secret. These credentials are not interchangeable with the client_id and client_secret from the PDF Services developer portal at developer-api.foxit.com. The eSign documentation lives at developersguide.foxitesign.foxit.com, separate from docs.developer-api.foxit.com. The API Playground handles PDF Services sandbox testing; eSign sandbox calls go to the eSign portal directly.

The NA environment base host is https://na1.foxitesign.foxit.com. All subsequent code samples reference this as {HOST_NAME}.

Generating an OAuth 2.0 Access Token

Set your credentials as environment variables before running anything:

export FOXIT_ESIGN_CLIENT_ID="your_api_key"

export FOXIT_ESIGN_CLIENT_SECRET="your_api_secret"

Then generate a Bearer token via the OAuth 2.0 client credentials flow:

curl -X POST "https://na1.foxitesign.foxit.com/api/oauth2/access_token" \
  -d "client_id=$FOXIT_ESIGN_CLIENT_ID" \
  -d "client_secret=$FOXIT_ESIGN_CLIENT_SECRET" \
  -d "grant_type=client_credentials" \
  -d "scope=read-write"

The response carries an access_token field. Pass it as Authorization: Bearer {token} on every subsequent call, and store it server-side so your credentials never travel to the browser.
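The later samples read the token from a FOXIT_ESIGN_ACCESS_TOKEN variable, so it helps to fetch it in code as well. The sketch below mirrors the cURL call above. The function names and the injected post callable are illustrative choices rather than part of the Foxit API; in a real script you would pass requests.post.

```python
HOST = "https://na1.foxitesign.foxit.com"


def token_form(client_id, client_secret):
    """Form fields for the client-credentials request, same as the cURL call."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "grant_type": "client_credentials",
        "scope": "read-write",
    }


def fetch_access_token(post, client_id, client_secret, host=HOST):
    """POST the form and return the access_token string.

    `post` is any callable compatible with requests.post (an assumption made
    here so the function can be exercised without network access).
    """
    response = post(
        f"{host}/api/oauth2/access_token",
        data=token_form(client_id, client_secret),
        timeout=30,
    )
    response.raise_for_status()
    body = response.json()
    if "access_token" not in body:
        raise RuntimeError("token response did not contain access_token")
    return body["access_token"]
```

Called as fetch_access_token(requests.post, os.environ["FOXIT_ESIGN_CLIENT_ID"], os.environ["FOXIT_ESIGN_CLIENT_SECRET"]), this returns the string to send in the Authorization header. Keep it on the server.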

2. Creating an Envelope Programmatically with /api/folders/createfolder

Foxit eSign calls what other platforms call an “envelope” a folder. If you’re coming from DocuSign or PandaDoc, that naming difference will catch you on the first read of the docs. The endpoint is POST {HOST_NAME}/api/folders/createfolder. A template-based variant also exists at POST {HOST_NAME}/api/templates/createFolder (note the camelCase) for assembling envelopes from saved templates. This tutorial focuses on the direct document-upload flow.

Defining Signature Fields with Text Tags

Foxit eSign reads signature field definitions from the PDF itself at upload time. You embed them as Text Tags directly in the document, and the API parses the tags on ingest and converts them to interactive fields. The tag syntax follows this structure: ${fieldtype:party_number:required:field_name:width}. Here, y marks a field as required and n marks it as optional. The party number maps to the signing sequence for that recipient. Width is expressed as underscores.

A minimal set covering the four field types required for a real signing flow:

${s:1: } # signature field, party 1

${i:1:______} # initials field, party 1

${d:1:n::____} # optional date field, party 1

${t:1:y:Full_Name:__________} # required text field, party 1, named "Full_Name"

The full set of supported tag types includes signfield (or s), initialfield (or i), datefield (or d), textfield (or t), textboxfield (or tb), checkboxfield (or c), radiobuttonfield (or rb), securedfield (or sc), attachmentfield (or a), imagefield (or img), accept (or ab), decline (or db), payfield (or pf), and formulafield (or ff).
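If you generate tagged documents from code, a small helper can keep the five-part syntax consistent. This is a minimal sketch, assuming the full ${fieldtype:party_number:required:field_name:width} form described above; the helper name and its defaults are hypothetical, and it does not produce the shortened forms shown for the signature and initials fields.

```python
def build_text_tag(field_type, party, required=True, name="", width=10):
    """Compose a Text Tag in the five-part form described in this section.

    field_type: a documented short or long type name, e.g. "t" or "textfield"
    party:      signing sequence number of the recipient (1-based)
    required:   True emits "y", False emits "n"
    name:       optional field name (may be empty)
    width:      number of underscores used to express the field width
    """
    if party < 1:
        raise ValueError("party must be 1 or greater")
    if width < 1:
        raise ValueError("width must be at least one underscore")
    flag = "y" if required else "n"
    return "${" + f"{field_type}:{party}:{flag}:{name}:{'_' * width}" + "}"
```

For example, build_text_tag("t", 1, True, "Full_Name", 10) yields the required text field from the list above, and build_text_tag("d", 1, False, "", 4) yields the optional date field.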

To hide tags in production, set the tag font color to match the page background color. Foxit eSign converts the tags to fields but does not strip them from the rendered document.

Use this sample PDF with Text Tags pre-embedded to follow along. It includes a signature field, initials, a date field, and a text input, all mapped to party 1.

Submitting the createfolder Request

You can supply the PDF two ways against the same endpoint. In URL mode you pass a fileUrls array of publicly reachable PDF links alongside a matching fileNames array. In base64 mode you set "inputType": "base64" and pass a base64FileString array of base64-encoded PDF bytes, again with a matching fileNames array. The API rejects a request that supplies neither, returning the error “fileUrls or base64FileString cannot be empty”.

One parameter is easy to miss and breaks the whole flow when omitted. Set processTextTags to true so the API parses the Text Tags embedded in the PDF and converts them into interactive fields. Leave it out and the folder still gets created successfully, but the tags stay inert literal text on the page, the signing UI reports zero required fields, and a signer can reach Finish without ever signing. If your source PDF carries native AcroForm fields instead of Text Tags, the companion processAcroFields flag handles those.
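The two input modes can be sketched as a helper that returns only the file-related fields of the request body. The helper name is hypothetical, and refusing a mix of both modes is a client-side choice made for this sketch, not documented API behavior.

```python
import base64


def document_source(file_names, file_urls=None, pdf_blobs=None):
    """Return the file-related createfolder fields for URL or base64 mode."""
    if file_urls and pdf_blobs:
        # Assumption: pick one mode per request rather than mixing them.
        raise ValueError("supply file_urls or pdf_blobs, not both")
    sources = file_urls or pdf_blobs
    if not sources:
        # Mirrors the API's own rejection message for an empty request.
        raise ValueError("fileUrls or base64FileString cannot be empty")
    if len(sources) != len(file_names):
        raise ValueError("fileNames must match the number of documents")
    if file_urls:
        return {"fileUrls": list(file_urls), "fileNames": list(file_names)}
    return {
        "inputType": "base64",
        "base64FileString": [
            base64.b64encode(blob).decode("ascii") for blob in pdf_blobs
        ],
        "fileNames": list(file_names),
    }
```

Merge the returned dict into the rest of the payload (folderName, processTextTags, parties, and so on) before posting.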

URL-based submission via cURL, pointing at the hosted sample PDF:

curl -X POST "https://na1.foxitesign.foxit.com/api/folders/createfolder" \
  -H "Authorization: Bearer $FOXIT_ESIGN_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "folderName": "Service Agreement",
    "sendNow": false,
    "processTextTags": true,
    "createEmbeddedSigningSession": true,
    "embeddedSignersEmailIds": ["[email protected]"],
    "fileUrls": ["https://github.com/lucienchemaly/foxit-demo-templates/raw/main/esign/sample-text-tags.pdf"],
    "fileNames": ["sample-text-tags.pdf"],
    "parties": [
      {
        "firstName": "Alex",
        "lastName": "Rivera",
        "emailId": "[email protected]",
        "permission": "FILL_FIELDS_AND_SIGN",
        "sequence": 1
      }
    ]
  }'

In this request you ask Foxit eSign to fetch the tagged PDF from its public URL, hold the folder as a draft instead of emailing it by setting sendNow to false, and mint an embedded signing session for the party identified in embeddedSignersEmailIds. The single signer is defined in the parties array with a name, email, the FILL_FIELDS_AND_SIGN permission, and a signing sequence.

Base64 upload via Python, which avoids needing a public URL by sending the file bytes inline:

import os
import base64
import requests

HOST = "https://na1.foxitesign.foxit.com"
TOKEN = os.environ["FOXIT_ESIGN_ACCESS_TOKEN"]

# Read the local PDF and base64-encode its bytes
with open("sample-text-tags.pdf", "rb") as pdf:
    encoded = base64.b64encode(pdf.read()).decode()

payload = {
    "folderName": "Service Agreement",
    "sendNow": False,
    "processTextTags": True,
    "inputType": "base64",
    "base64FileString": [encoded],
    "fileNames": ["sample-text-tags.pdf"],
    "createEmbeddedSigningSession": True,
    "embeddedSignersEmailIds": ["[email protected]"],
    # Recipient defined in the body overrides any tag-defined party
    "parties": [
        {
            "firstName": "Alex",
            "lastName": "Rivera",
            "emailId": "[email protected]",
            "permission": "FILL_FIELDS_AND_SIGN",
            "sequence": 1,
        }
    ],
}

response = requests.post(
    f"{HOST}/api/folders/createfolder",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
)

print(response.json())

In this code, you read the local PDF, base64-encode its bytes, and place the result inside the base64FileString array with inputType set to base64 so the API knows to decode it rather than fetch a URL. The rest of the payload mirrors the cURL example, sending the folder as a draft and requesting an embedded session for the listed signer, after which you print the JSON response to read back the folder.folderId and the session URL. When both the PDF’s Text Tags and the API body define recipient parties, the body values take precedence, so you can reuse a tagged PDF template and swap in different signers at request time without touching the document.

3. Generating the Embedded Signing Session and Rendering the iFrame

Setting createEmbeddedSigningSession: true in the createfolder body, paired with an embeddedSignersEmailIds array naming which parties sign in your app, gives you a signed session URL in the same response. No second API call, no separate “recipient view” endpoint. The response carries an embeddedSigningSessions array, and each entry holds the signer email in emailIdOfSigner, the raw token in embeddedToken, and the ready-to-render link in embeddedSessionURL. That URL follows this format, where eetid is the URL-encoded embedded token:

{HOST_NAME}/embedded/embeddedsign?eetid={URL-ENCODED-EMBEDDED-TOKEN}

If you omit embeddedSignersEmailIds, the API returns the error “email id of embedded signer(s) not submitted”, so always list the embedded signers explicitly. For multi-party workflows you can set createEmbeddedSigningSessionForAllParties: true so every recipient signs in an embedded session rather than over email. When you need each signer’s live URL, request it per signer through the regenerate endpoint described below.
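Pulling the right URL out of the createfolder response is a short lookup over embeddedSigningSessions. The sketch below assumes the array sits at the top level of the response JSON, as described above; the function name and the case-insensitive email match are choices made for this example.

```python
def session_url_for(response_body, signer_email):
    """Return the embeddedSessionURL for one signer from a createfolder response."""
    wanted = signer_email.strip().lower()
    for session in response_body.get("embeddedSigningSessions", []):
        if session.get("emailIdOfSigner", "").strip().lower() == wanted:
            return session["embeddedSessionURL"]
    raise LookupError(f"no embedded signing session returned for {signer_email}")
```

Raising on a missing signer, rather than returning None, surfaces a mismatch between embeddedSignersEmailIds and parties before an empty iFrame reaches the user.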

The full lifecycle runs from token request through webhook delivery:

Sequence diagram of embedded signing with the Foxit eSign API, showing token request, folder creation, embedded session URL, iFrame rendering, and webhook completion handling

Injecting the Session URL into an iFrame

The signing UI renders entirely inside the iFrame with no additional JavaScript library required.

function launchSigningSession(embeddedSessionURL) {
  const iframe = document.createElement("iframe");

  // These five sandbox permissions are the minimum required for the signing UI
  iframe.setAttribute(
    "sandbox",
    "allow-scripts allow-same-origin allow-forms allow-popups allow-top-navigation",
  );

  iframe.src = embeddedSessionURL;
  iframe.style.width = "100%";
  iframe.style.height = "700px";
  iframe.style.border = "none";

  iframe.onload = function () {
    console.log("Signing session ready");
  };

  document.getElementById("signing-container").appendChild(iframe);
}

The sandbox attribute matters here. Remove allow-popups or allow-top-navigation and the signing UI breaks in ways that produce no obvious error. The five attributes above are the minimum viable set. Don’t strip them without testing the complete signing flow.

To verify the flow without wiring this into your app first, download the ready-to-run iFrame test page from the demo repo, open it in a browser, paste the embeddedSessionURL from your createfolder response into the input box, and click Load. It applies the same five sandbox permissions shown above. A correctly tagged document renders with the signing controls active, as shown below.

Foxit eSign embedded signing iFrame showing a sample service agreement with Text Tags rendered as interactive required fields for full name, initials, and date

The sample document loaded in the iFrame with processTextTags enabled. The header shows “Required Fields Left: 2” and a “Next Required Field” button, confirming the Text Tags became interactive fields. If that counter reads zero or the page shows the raw ${...} tag text with no input boxes, recheck that processTextTags was set to true on the createfolder request.

Session URLs are short-lived. Generate the URL at request time and pass it directly to the client. Don’t cache it. If a user returns to an incomplete workflow after the session expires, call POST {HOST_NAME}/api/embedded/regenerateEmbeddedSigningSession with the folder ID and the signer email to get a fresh URL. The response mirrors a single embedded session entry, returning emailIdOfSigner, embeddedToken, and the new embeddedSessionURL.

4. White-Labeling the Signing Experience

Foxit eSign exposes branding control at several levels. You can apply a custom logo to the signing UI and outgoing notification emails, set application colors to match your product’s visual design, and configure a personalized sender name so recipients see your company name rather than a generic Foxit sender identity.

For logo and color configuration, manage these settings through the eSign Portal’s branding section. The portal publishes canonical limits on file size and supported formats. Check the branding settings in your account for current specifications rather than relying on numbers printed here that may have changed.

Configuring Post-Signing Redirect URLs

Custom redirect URLs keep users inside your application after they sign, decline, defer, or hit an error. Pass them as parameters in the createfolder body:

payload = {
    "folderName": "Service Agreement",
    "sendNow": False,
    "processTextTags": True,
    "createEmbeddedSigningSession": True,
    "embeddedSignersEmailIds": ["[email protected]"],
    # Return the user to your confirmation page after a successful signature
    "signSuccessUrl": "https://app.example.com/contracts/signed",
    # Return to a dedicated page when the signer declines
    "signDeclineUrl": "https://app.example.com/contracts/declined",
    # Return here when the signer chooses to finish later
    "signLaterUrl": "https://app.example.com/contracts/later",
    # Return here if the signing session errors out
    "signErrorUrl": "https://app.example.com/contracts/error",
    "parties": [ ... ],
    "fileUrls": [ ... ],
    "fileNames": [ ... ],
}

Foxit eSign appends two query parameters to the redirect URL, namely folderId for the folder involved and event, whose value is signing_success on a completed signature or signing_declined when the signer declines. Without these URLs, signers land on Foxit’s default confirmation page. With them, your application controls the entire post-signing navigation experience.
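On the page that receives the redirect, reading those two parameters takes only the standard library. This is a minimal sketch; the function name is illustrative, and the sample URL in the usage note is made up for the example.

```python
from urllib.parse import parse_qs, urlparse


def parse_signing_redirect(url):
    """Return (folder_id, event) from a post-signing redirect URL.

    Either value is None when the parameter is absent. folderId comes back
    as a string because query parameters carry no type information.
    """
    query = parse_qs(urlparse(url).query)
    folder_id = query.get("folderId", [None])[0]
    event = query.get("event", [None])[0]
    return folder_id, event
```

Given https://app.example.com/contracts/signed?folderId=12345&event=signing_success, the function returns ("12345", "signing_success"). Because a user can edit a browser URL, it is safer to treat these values as a navigation hint and rely on the verified webhook in Section 5 before changing any records.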

Tailoring Signer Instructions for Regulated Industries

Regulated workflows often need specific disclosure language in front of signers, which matters for ESIGN Act or eIDAS compliance scenarios where your legal team controls the wording. The createfolder body accepts a signerInstructionId and a confirmationInstructionId that reference instruction templates configured in your account, and you can drop explicit accept and decline button fields into the document itself using the accept (or ab) and decline (or db) Text Tag types. The eSign Developers Guide at developersguide.foxitesign.foxit.com documents these parameters.

5. Handling Webhook Callbacks for Completion Events

Foxit eSign fires HTTP POST requests to your registered endpoint when these lifecycle events occur: folder_sent, folder_viewed, folder_signed, folder_cancelled, folder_completed, folder_executed, and folder_deleted. Register your callback URL in the eSign Portal under API Settings. Make sure the endpoint is publicly reachable over HTTPS before you start testing with the sandbox.

Verifying the Webhook Signature

Every webhook POST includes a signature query parameter. It’s a base64-encoded HMAC-SHA-256 digest of the raw request body, computed using your webhook secret. Recompute the same digest server-side and compare before doing any processing. An unverified webhook is an open door.

import os
import hmac
import hashlib
import base64
from flask import Flask, request, abort

app = Flask(__name__)
WEBHOOK_SECRET = os.environ["FOXIT_ESIGN_WEBHOOK_SECRET"].encode()


def handle_completion(folder_id):
    """Placeholder: fetch and store the signed document for this folder."""


def handle_cancellation(folder_id):
    """Placeholder: mark the folder as cancelled in your own records."""


@app.route("/webhook/foxit", methods=["POST"])
def foxit_webhook():
    # Step 1: Pull the signature from the query string
    received_sig = request.args.get("signature", "")

    # Step 2: Recompute HMAC-SHA-256 over the raw request body
    raw_body = request.get_data()
    computed_sig = base64.b64encode(
        hmac.new(WEBHOOK_SECRET, raw_body, hashlib.sha256).digest()
    ).decode()

    # Step 3: Constant-time comparison guards against timing attacks.
    # Comparing bytes avoids a TypeError if the query value is not ASCII.
    if not hmac.compare_digest(received_sig.encode(), computed_sig.encode()):
        abort(403)

    payload = request.json
    event_name = payload.get("event_name")
    folder_id = payload.get("data", {}).get("folder", {}).get("folderId")

    if event_name in ("folder_completed", "folder_executed"):
        handle_completion(folder_id)
    elif event_name == "folder_cancelled":
        handle_cancellation(folder_id)

    # A non-2xx response triggers Foxit's automatic retry logic.
    # Return 200 once verification and basic parsing succeed.
    return "", 200

The payload structure is consistent across all events, carrying a top-level event_name, an event_date timestamp, and a data object whose folder field holds the full folder record. The folder identifier lives at data.folder.folderId, alongside the rest of the envelope-level metadata such as folderName and folderStatus.
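To exercise a handler before pointing the sandbox at it, you can sign a sample body yourself with the same scheme. The sketch below assumes the signature is the base64-encoded HMAC-SHA-256 of the raw body, as described above. The secret, the event_date value, and the folder fields in the sample are placeholders, not real Foxit output.

```python
import base64
import hashlib
import hmac
import json


def sign_body(secret, raw_body):
    """Base64-encoded HMAC-SHA-256 digest of the raw request body."""
    digest = hmac.new(secret, raw_body, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_signature(secret, raw_body, received_sig):
    """Constant-time check of a received signature against the body."""
    expected = sign_body(secret, raw_body)
    return hmac.compare_digest(expected.encode(), received_sig.encode())


# Illustrative payload following the shape described above.
sample_body = json.dumps(
    {
        "event_name": "folder_completed",
        "event_date": "placeholder-timestamp",
        "data": {"folder": {"folderId": 12345, "folderName": "Service Agreement"}},
    }
).encode()

test_secret = b"local-test-secret"  # placeholder, not a real webhook secret
signature = sign_body(test_secret, sample_body)
```

When sending this to a local endpoint, URL-encode the signature in the query string (for example with urllib.parse.quote(signature, safe="")), since base64 output can contain +, /, and = characters.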

Downstream Actions Triggered from folder_completed

When folder_completed or folder_executed fires, two actions cover the majority of production workflows. Fetch the signed document via the documents endpoint using the folder ID and store the result in your document storage layer. For contracts that require sequential agreements (a master services agreement followed by a statement of work, for example), fire the next createfolder call as part of the completion handler.

Audit history is available programmatically via GET {HOST_NAME}/api/folders/viewActivityHistory?folderId={FOLDER_ID}, which returns the full signing log once a folder has been shared or sent. This is a GET-only endpoint, and a folder still in DRAFT returns the error “logs of a non-shared folder can not be viewed”.

6. Common Mistakes and Troubleshooting

Text Tag Syntax Breaks on Copy-Paste

Smart-quote autocorrect in Word, Google Docs, and many other editors replaces straight ASCII brackets and quote characters with typographic equivalents. Tag parsing fails silently when this happens. Always paste tags into a plain-text editor first and verify the bracket characters are straight ASCII. The eSign Developers Guide writes every field-type notation in lowercase, such as signfield or s and textfield or t, so author your tags in lowercase to match the documented syntax rather than experimenting with capitalized variants.
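A quick lint pass can catch typographic characters before the document is exported to PDF. This is a minimal sketch, assuming you can get the document text as a plain string; the function name is hypothetical. It only inspects tags whose ${ and } delimiters survived, so a tag whose delimiters were themselves altered will not be reported.

```python
import re

# Matches ${...} spans that contain no closing brace inside the tag body.
TAG_PATTERN = re.compile(r"\$\{[^}]*\}")


def find_suspect_tags(text):
    """Return Text Tags that contain any non-ASCII character."""
    return [tag for tag in TAG_PATTERN.findall(text) if not tag.isascii()]
```

Running it on text that contains one clean tag and one tag with a curly apostrophe or a non-breaking space returns only the damaged tag, which you can then retype in a plain-text editor.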

Visible Text Tags Reaching Production

Foxit eSign converts embedded tags to fields but does not remove them from the document. If you ship a PDF without setting the tag text color to match the page background, signers see the raw ${...} strings on the page. Build the color-hide step into your PDF preparation pipeline before it becomes a support ticket.

Body-Level parties Overriding Tag-Defined Recipients

Tags define field layout and recipient assignment and must be embedded in the PDF itself, while recipient definitions in the API request body override tag-defined recipient metadata. If you’re seeing the wrong signer name or email appear, check whether a body-level parties definition is overriding the tag.

Expired Session URLs

Embedded session URLs are short-lived. Caching one and reusing it on the next page load will fail. Call POST {HOST_NAME}/api/embedded/regenerateEmbeddedSigningSession with the folder ID and signer email each time a returning user needs access to an incomplete session.

Over-Restrictive sandbox on the iFrame

The five required sandbox permissions are allow-scripts, allow-same-origin, allow-forms, allow-popups, and allow-top-navigation. If the UI loads but behaves unexpectedly, check your sandbox attributes first.

Credential Confusion Between eSign and PDF Services

The API Key and API Secret from the eSign Portal are specific to the eSign API, and the OAuth 2.0 flows also differ between the two. Using PDF Services credentials against the eSign /api/oauth2/access_token endpoint returns an authentication error. Keep the two credential sets separate and named clearly in your environment configuration.

Skipping Webhook Signature Verification

Always verify the signature query parameter before processing any payload. Return a 200-class status code once verification and basic parsing succeed, because a non-2xx response causes Foxit to retry delivery, which can create duplicate processing if your handler is not idempotent.
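One way to make the handler idempotent is to record each folder ID the first time its completion is processed and skip repeats. The sketch below keeps that record in memory purely for illustration; the class name is hypothetical, and a production system would more likely use a database table with a unique constraint on the folder ID so the record survives restarts.

```python
import threading


class CompletionTracker:
    """Remembers which folders have already been processed."""

    def __init__(self):
        self._seen = set()
        self._lock = threading.Lock()

    def should_process(self, folder_id):
        """Return True the first time a folder ID is seen, False afterwards."""
        with self._lock:
            if folder_id in self._seen:
                return False
            self._seen.add(folder_id)
            return True
```

In the webhook handler, call should_process(folder_id) before running the completion logic. A retry, or a folder_executed event that follows folder_completed for the same folder, then returns 200 without repeating the work.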

FAQ

Can I regenerate an expired embedded signing session?

Yes. Call POST {HOST_NAME}/api/embedded/regenerateEmbeddedSigningSession with the folder ID and signer email. Foxit eSign returns a fresh embeddedSessionURL for the same envelope without resetting the signing state.

Do I need a separate Foxit account if I’m already using Foxit PDF Services?

Yes. The Foxit eSign API operates from a separate portal at na1.foxitesign.foxit.com with its own credentials. The two portals don’t share API keys, secrets, or authentication tokens.

Can I use an existing template instead of uploading a PDF?

Yes. Use POST {HOST_NAME}/api/templates/createFolder to assemble the envelope from a saved template rather than a raw document upload.

Does the embedded signing session work on mobile browsers?

Yes. The iFrame renders responsively on modern mobile browsers without additional configuration.

Is the audit trail accessible programmatically?

Yes. GET {HOST_NAME}/api/folders/viewActivityHistory?folderId={FOLDER_ID} returns the full signing activity log for any shared or sent envelope, including timestamps for each event. A folder still in DRAFT has no shared history to return.

What is the difference between folder_completed and folder_executed?

Both events signal that the folder has been completed with all required parties’ signatures, and the eSign Developers Guide describes them in the same terms, each delivering the folder record in data.folder. Listen for either to trigger downstream retrieval of the signed document, and make your handler idempotent so receiving both for the same folder does not double-process it.

Next Step

Activate your free Foxit eSign developer account at account.foxit.com/site/sign-up, no credit card required. Generate your OAuth token, fire a POST /api/folders/createfolder request with createEmbeddedSigningSession: true against the sandbox, and verify the returned URL loads in a local iFrame. From account creation to a working embedded signing session takes under 30 minutes.

Automating Financial Document Workflows with Foxit APIs: Generate Statements, Embed eSign, and Extract Audit-Ready Data

Document automation financial services pipeline using Foxit APIs to generate statements, embed eSign, and extract audit-ready data.

Learn how to automate financial services document workflows using Foxit APIs, covering quarterly statement generation, embedded eSign for account onboarding, and audit-ready PDF/A archiving with PII redaction.

The standard financial services document pipeline looks fine until a compliance audit exposes it. A templating tool generates quarterly statements. A standalone eSign vendor handles account onboarding. A manual export process (or a fragile ETL job nobody fully owns) produces the data package your audit team needs. Three vendor contracts, three auth systems, and three event logs that stop at their own API boundaries.

The more durable architecture treats document generation, e-signature orchestration, and audit-ready data extraction as a single API-backed pipeline. This article wires that pipeline together end to end, using the Foxit Document Generation API for quarterly statements, the Foxit eSign API for embedded onboarding signatures, and the Foxit PDF SDK plus Smart Redact Server for downstream extraction, PDF/A archiving, and PII scrubbing. All examples run against a free Foxit developer account.

Prerequisites

Before you run any code, get two accounts and one workspace set up.

Accounts. Create your Foxit developer account at https://account.foxit.com/site/sign-up, where activation is instant and includes free credits. Retrieve your client_id and client_secret from the Foxit Developer Portal. The eSign API runs on a different platform and requires a separate account in the Foxit eSign Portal. Activate API access under the API tab in the eSign settings menu, then fill out the form to receive your API Key and API Secret. These are not the same credentials as your DocGen account, and that distinction matters for every auth call in Section 3.

SDK and license files. If you plan to run Section 4’s PDF/A archive step, also request a Foxit PDF SDK trial through the Foxit Developer Hub. The PyPI wheel covered in Section 4 is the runtime binary your code imports, and the trial download provides gsdk_sn.txt and gsdk_key.txt, which are the credentials Library.Initialize requires. Confirm the license you receive lists Compliance on its Modules= line, since the archive step needs that module. Evaluation licenses also carry a fixed expiry window from the issue date (the distribution validated for this article was issued for a 36-day window), so request a fresh trial through the portal if yours has lapsed.

Runtime. You’ll need Python 3.8+ and cURL. Install jq if you want to inspect JSON responses inline. VS Code with the Python extension works well as a default; PyCharm and Sublime Text both work fine too.

Workspace bootstrap. Run this block to get a clean isolated environment:

mkdir foxit-financial && cd foxit-financial
python3 -m venv .venv && source .venv/bin/activate
pip install requests

Credentials. Never hardcode API keys. Set these five environment variables before running any snippet in this article:

export BASE_URL="https://na1.fusion.foxit.com"
export DOCGEN_CLIENT_ID="your_docgen_client_id"
export DOCGEN_CLIENT_SECRET="your_docgen_client_secret"
export ESIGN_API_KEY="your_esign_api_key"
export ESIGN_API_SECRET="your_esign_api_secret"

1. Why Three-Vendor Document Pipelines Break Under Compliance Pressure

The fragmentation pattern is consistent across mid-market brokerages and fintechs, where one vendor generates documents, another handles signatures, and a third (or an internal script) handles data extraction for audit packages. Each seam creates a specific compliance problem.

When a client signs their account agreement in the eSign portal, that event lives in the eSign vendor’s audit log. The generated quarterly statement lives in your templating tool’s system. The trade confirmation data lives in your data warehouse. None of these systems talk to each other by default. If an examiner asks for a unified event trail (who generated the document, who signed it, when, and where the underlying data went), you’re assembling that answer manually from three separate exports. That’s a control gap, not just an inconvenience.

There’s also a maintenance cost. Every vendor boundary means a separate OAuth2 registration, separate webhook configuration, separate error-handling logic, and separate retry strategies. When the eSign vendor rotates their API endpoint or introduces a breaking schema change, your generation pipeline doesn’t know about it. The two systems are coupled only through your code, which means you absorb every upstream change.

The architectural alternative collapses those seams. A single vendor’s APIs cover generation (na1.fusion.foxit.com), signing (na1.foxitesign.foxit.com), archiving, and redaction. Each stage authenticates with its own credentials, as Section 3 explains. Webhooks propagate state across systems so a signature event on an account agreement can trigger downstream archival automatically, with no polling job or cron script required.

Foxit document automation financial services pipeline: Word template to DocGen, eSign, PDF/A, and audit-ready archive.

The rest of this article walks each stage of that pipeline with working code.

2. Auto-Generating Quarterly Statements with the DocGen API

Step 1: Validate Your Template with the AnalyzeDocumentBase64 Endpoint

Download the ready-to-use quarterly statement template here: quarterly_statement.docx. The template uses double-bracket text tags for flat client metadata ({{client_name}}, {{account_number}}, {{statement_period}}, {{portfolio_value}}) plus a {{TableStart:holdings}} / {{TableEnd:holdings}} loop for portfolio positions.

Before wiring up your data pipeline, call the AnalyzeDocumentBase64 endpoint to confirm the API can parse every tag. This catches naming mismatches before they produce silent blank fields in production.

# base64 -i is the macOS form; on Linux use: base64 -w0 quarterly_statement.docx
curl -X POST "${BASE_URL}/document-generation/api/AnalyzeDocumentBase64" \
  -H "client_id: ${DOCGEN_CLIENT_ID}" \
  -H "client_secret: ${DOCGEN_CLIENT_SECRET}" \
  -H "Content-Type: application/json" \
  -d '{
    "base64FileString": "'$(base64 -i quarterly_statement.docx)'",
    "fileType": "docx"
  }'

The response returns a singleTagsString (comma-separated list of scalar tags and loop column names) and a doubleTagsString (comma-separated list of loop names). For quarterly_statement.docx, the response should be {"singleTagsString":"client_name,account_number,statement_period,portfolio_value,ROW_NUMBER,symbol,quantity,marketValue","doubleTagsString":"holdings"}. Verify that every tag your data payload will populate appears in one of those two strings before you proceed. If a tag is missing, check whether Word split it across multiple text runs (see the Common Mistakes appendix).
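That verification is easy to automate. The sketch below compares a data payload against the two strings in the analysis response, assuming the response shape shown above; the function name is illustrative. The check runs in one direction only, so template tags the payload does not populate (such as ROW_NUMBER) are not flagged.

```python
def missing_tags(analysis, document_values):
    """List payload keys that the analyzed template does not expose as tags.

    Scalars are checked against singleTagsString, list-valued keys against
    doubleTagsString, and the columns of each list row against
    singleTagsString.
    """
    def split(value):
        return {part.strip() for part in value.split(",") if part.strip()}

    single = split(analysis.get("singleTagsString", ""))
    double = split(analysis.get("doubleTagsString", ""))

    missing = set()
    for key, value in document_values.items():
        if isinstance(value, list):
            if key not in double:
                missing.add(key)
            for row in value:
                for column in row:
                    if column not in single:
                        missing.add(f"{key}.{column}")
        elif key not in single:
            missing.add(key)
    return sorted(missing)
```

An empty list means every key in the payload has a matching tag. A non-empty list names the keys to fix, in the form key for scalars and loops and loop.column for loop columns.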

Step 2: Generate the PDF with GenerateDocumentBase64

The GenerateDocumentBase64 endpoint accepts the Word template as a base64-encoded string plus a JSON data payload and returns the rendered document (also base64-encoded) in the same synchronous HTTP response. No polling required.

import os, base64, requests

DOCGEN_URL = "https://na1.fusion.foxit.com/document-generation/api/GenerateDocumentBase64"
CLIENT_ID = os.environ["DOCGEN_CLIENT_ID"]
CLIENT_SECRET = os.environ["DOCGEN_CLIENT_SECRET"]

# Load and encode the Word template
with open("quarterly_statement.docx", "rb") as f:
    encoded_template = base64.b64encode(f.read()).decode()

# Build the data payload
payload = {
    "base64FileString": encoded_template,
    "fileType": "docx",
    "outputFormat": "pdf",
    "documentValues": {
        "client_name": "Alex Rivera",
        "account_number": "ACC-20241231-0042",
        "statement_period": "Q4 2024",
        "portfolio_value": "$248,750.00",
        "holdings": [
            {"symbol": "AAPL", "quantity": "50", "marketValue": "$9,100.00"},
            {"symbol": "MSFT", "quantity": "30", "marketValue": "$12,360.00"},
            {"symbol": "VTSAX", "quantity": "400", "marketValue": "$45,200.00"},
        ],
    },
}

resp = requests.post(
    DOCGEN_URL,
    json=payload,
    headers={
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "Content-Type": "application/json",
    },
)
resp.raise_for_status()

# Decode and save the rendered PDF
pdf_bytes = base64.b64decode(resp.json()["base64FileString"])
with open("statement_q4_2024_rivera.pdf", "wb") as f:
    f.write(pdf_bytes)

print(f"Generated: {len(pdf_bytes):,} bytes")

In this code, you load the Word template off disk, base64-encode it, build a documentValues dict that mirrors the template tags exactly (flat scalars for client metadata, an array of objects for the holdings loop), POST the payload to GenerateDocumentBase64, decode the base64FileString field from the JSON response back into raw PDF bytes, and persist the result. The data contract is straightforward, since flat scalar keys map to single-value tags, the holdings array of objects drives the {{TableStart:holdings}} loop, and each object in holdings needs the same keys as the column tags inside the loop (symbol, quantity, marketValue).

Step 3: Batch Generation and the 4 MB Limit

The DocGen API enforces a 4 MB cap on the base64FileString payload, measured as the base64-encoded size and not the raw .docx on disk. A 2.5 MB Word file encodes to roughly 3.3 MB in base64, which leaves little room for embedded images or fonts.
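The arithmetic behind that estimate is simple: base64 emits 4 output characters for every 3 input bytes, rounded up to a full group. The sketch below checks a template size against the cap before you send it. It assumes the limit means 4 × 1024 × 1024 characters, which the source does not confirm; if the API counts 4,000,000 instead, the usable headroom is slightly smaller.

```python
import base64
import math

# Assumption: "4 MB" is interpreted as 4 * 1024 * 1024 encoded characters.
ASSUMED_LIMIT = 4 * 1024 * 1024


def base64_size(raw_bytes):
    """Length of the base64 encoding (with padding) of raw_bytes input bytes."""
    return 4 * math.ceil(raw_bytes / 3)


def fits_under_cap(raw_bytes, limit=ASSUMED_LIMIT):
    """True when the encoded template would not exceed the assumed cap."""
    return base64_size(raw_bytes) <= limit


# Sanity check of the formula against the real encoder.
assert base64_size(1000) == len(base64.b64encode(b"x" * 1000))
```

For a 2,500,000-byte file, base64_size returns 3,333,336, which matches the roughly 3.3 MB figure above. Under the assumed limit, the largest raw template that still fits is 3,145,728 bytes. In practice, call fits_under_cap(os.path.getsize("quarterly_statement.docx")) before encoding.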

When the cap is exceeded, the API returns HTTP 500 with the plain-text body An error occurred while analyzing the template: Document file contents cannot be larger than 4 MB. To recover, slim the template. Compress images via Word’s Picture Format → Compress Pictures pane (targeting screen resolution is usually enough for PDF output), remove any embedded OLE objects, and drop embedded fonts if they aren’t required for rendering. If the template genuinely needs to stay large, split it into multiple templates and merge the rendered PDFs downstream.
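Because the cap applies to the encoded size, it can be checked locally before any request is made. The following is a minimal sketch: the helper names are made up for illustration, and the byte value of "4 MB" is an assumption (the source does not say whether the cap is binary or decimal).

```python
import base64

# Assumption: "4 MB" is treated as 4 * 1024 * 1024 bytes. Adjust if the API
# turns out to enforce a decimal limit instead.
DOCGEN_BASE64_CAP_BYTES = 4 * 1024 * 1024


def base64_encoded_size(raw_size: int) -> int:
    """Length of the padded base64 encoding of raw_size bytes (4 chars per 3 bytes)."""
    return 4 * ((raw_size + 2) // 3)


def fits_docgen_cap(raw_size: int, cap: int = DOCGEN_BASE64_CAP_BYTES) -> bool:
    """True if a file of raw_size bytes stays within the cap once encoded."""
    return base64_encoded_size(raw_size) <= cap


# Sanity check of the formula against the real encoder.
sample = b"x" * 1000
predicted = base64_encoded_size(len(sample))
actual = len(base64.b64encode(sample))
```

In practice you would pass os.path.getsize() of the template to fits_docgen_cap() and fail the build of an oversized template before it reaches the API.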

For quarterly runs across thousands of accounts, parallelise POST requests against GenerateDocumentBase64 using a thread pool. The API is stateless and synchronous, so scaling means concurrent requests against your credit budget, not job-queue management.

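from concurrent.futures import ThreadPoolExecutor

headers = {
    "client_id": CLIENT_ID,
    "client_secret": CLIENT_SECRET,
    "Content-Type": "application/json",
}

def generate_statement(client_record):
    # Build a per-request payload; mutating the shared dict would race across threads
    request_payload = {**payload, "documentValues": client_record}
    r = requests.post(DOCGEN_URL, json=request_payload, headers=headers)
    r.raise_for_status()
    return base64.b64decode(r.json()["base64FileString"])

with ThreadPoolExecutor(max_workers=10) as pool:
    pdfs = list(pool.map(generate_statement, client_records))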

If your batch jobs need explicit async task tracking with status polling, the Foxit PDF Services API exposes that pattern. For statement generation at quarterly cadence, the synchronous thread-pool approach above is simpler and just as reliable.

3. Embedding eSign Flows for Account Onboarding

Authentication: A Separate Account on a Separate Host

The eSign API runs on na1.foxitesign.foxit.com, not the na1.fusion.foxit.com host used for DocGen. It also uses a different developer account and a different auth model. DocGen takes client_id and client_secret directly as request headers on every call. eSign requires a proper OAuth2 client_credentials exchange first, then a bearer token on subsequent calls.

Exchange your API Key and API Secret for an access token:

curl -X POST "https://na1.foxitesign.foxit.com/api/oauth2/access_token" \
  -d "grant_type=client_credentials" \
  -d "client_id=${ESIGN_API_KEY}" \
  -d "client_secret=${ESIGN_API_SECRET}" \
  -d "scope=read-write"

The response includes access_token. Pass it as Authorization: Bearer <token> on every subsequent eSign API call. Tokens expire, so cache them with their expires_in value and refresh before expiry rather than on every request.
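One way to cache the token is a small wrapper that refetches only when the token is close to expiry. This is a sketch under stated assumptions: TokenCache is a hypothetical helper, the fetch function is injected so the example runs without network access, and expires_in is assumed to be in seconds, as is conventional for OAuth2.

```python
import time
from typing import Callable, Dict, Optional


class TokenCache:
    """Caches an access token and refreshes it shortly before expiry."""

    def __init__(
        self,
        fetch_token: Callable[[], Dict],
        refresh_margin: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        # fetch_token returns a dict with "access_token" and "expires_in" (seconds, assumed)
        self._fetch_token = fetch_token
        self._refresh_margin = refresh_margin
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._refresh_margin:
            response = self._fetch_token()
            self._token = response["access_token"]
            self._expires_at = now + float(response["expires_in"])
        return self._token


# Demonstration with a fake fetcher and a controllable clock.
calls = []

def fake_fetch():
    calls.append(1)
    return {"access_token": f"token-{len(calls)}", "expires_in": 3600}

fake_now = [0.0]
cache = TokenCache(fake_fetch, refresh_margin=60, clock=lambda: fake_now[0])
first = cache.get()
fake_now[0] = 1000.0   # well inside the token lifetime: served from cache
second = cache.get()
fake_now[0] = 3550.0   # inside the 60 s refresh margin: refetched
third = cache.get()
```

In a real integration, fetch_token would wrap the POST to /api/oauth2/access_token shown above and return the parsed JSON response.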

Document Setup with Text Tag Tokens

Download the ready-to-use account agreement here: account_agreement.pdf. The document embeds signature, date, and initials fields using dollar-brace Text Tag tokens. The token format uses colon-delimited segments for field type, party number, mandatory flag, and a placeholder string.

Three tokens cover the common onboarding case:

  • ${signfield:1:y:____} is a mandatory signature for party 1

  • ${datefield:2:n::____} is an optional date for party 2

  • ${i:2:n} is an optional initials field for party 2

The party number in each token drives multi-party routing automatically. Party 1 sees their signature field; party 2 sees the date and initials fields. No additional routing configuration is needed in the API call, since the document itself encodes the routing.
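Since the party numbers in the document drive routing, it can help to list them before building the parties array. The sketch below is a hypothetical audit helper, not a Foxit API: it reads only the first three segments (field type, party number, mandatory flag) as described above and leaves any remaining segments uninterpreted, and it assumes you have the document text available as a string.

```python
import re
from typing import List, NamedTuple

TEXT_TAG_PATTERN = re.compile(r"\$\{([^{}]+)\}")


class TextTag(NamedTuple):
    field_type: str
    party: int
    mandatory: bool
    extra: tuple  # remaining segments (placeholder and so on), left uninterpreted


def find_text_tags(text: str) -> List[TextTag]:
    """Find dollar-brace tokens whose first three segments look like a Text Tag."""
    tags = []
    for match in TEXT_TAG_PATTERN.finditer(text):
        segments = match.group(1).split(":")
        if len(segments) < 3:
            continue  # too short to be one of the token shapes shown above
        field_type, party, mandatory = segments[:3]
        if not party.isdigit() or mandatory not in ("y", "n"):
            continue
        tags.append(TextTag(field_type, int(party), mandatory == "y", tuple(segments[3:])))
    return tags


sample = "Sign: ${signfield:1:y:____} Date: ${datefield:2:n::____} Init: ${i:2:n}"
tags = find_text_tags(sample)
parties = sorted({tag.party for tag in tags})
```

Comparing parties against the sequence values in the request body catches a document that references a party the folder does not define.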

To create the folder and send it for signature in one call, POST to /folders/createfolder with the document URL, file name, parties array, and sendNow: true. Setting processTextTags: true instructs Foxit eSign to parse the dollar-brace tokens out of the PDF text layer and convert them into the appropriate form fields. The party fields are permission (signing role, typically FILL_FIELDS_AND_SIGN) and sequence (party order within the folder); the request schema does not use partyRole or partyNumber.

curl -X POST "https://na1.foxitesign.foxit.com/api/folders/createfolder" \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "folderName": "Account Agreement - Alex Rivera",
    "fileUrls": ["https://github.com/lucienchemaly/foxit-demo-templates/raw/main/account_agreement.pdf"],
    "fileNames": ["account_agreement.pdf"],
    "processTextTags": true,
    "signInSequence": false,
    "sendNow": true,
    "parties": [
      {
        "firstName": "Alex",
        "lastName": "Rivera",
        "emailId": "[email protected]",
        "permission": "FILL_FIELDS_AND_SIGN",
        "sequence": 1,
        "workflowSequence": 1
      }
    ]
  }'

The request above creates a draft folder for Alex Rivera, attaches the published account_agreement.pdf by URL, asks Foxit to parse the embedded Text Tag tokens, and dispatches the folder for signature in a single round trip. The folder status moves through DRAFT → SHARED → WAITING_FOR_SIGNATURE → EXECUTED as the signing process advances. If you prefer a two-step flow (create the draft, then dispatch later), POST the same body with sendNow: false to /folders/createfolder and follow up with a POST /folders/sendDraftFolder carrying the returned folderId. The webhook response uses different field names than the request for these party properties (contractPermissions, partySequence, workflowSignSequence), so map them accordingly when you persist signing events.
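The field-name difference between request and webhook payloads can be absorbed in one small normalization step. This is a sketch, not Foxit code: the pairing of names is an assumption inferred from the order in which they are listed above, and the sample party is made up for illustration.

```python
# Assumed pairing, inferred from the order in which the names are listed above.
WEBHOOK_TO_REQUEST_PARTY_FIELDS = {
    "contractPermissions": "permission",
    "partySequence": "sequence",
    "workflowSignSequence": "workflowSequence",
}


def normalize_party(webhook_party: dict) -> dict:
    """Rename webhook-style party keys to request-style keys; other keys pass through."""
    return {
        WEBHOOK_TO_REQUEST_PARTY_FIELDS.get(key, key): value
        for key, value in webhook_party.items()
    }


# Illustrative input; real webhook parties carry more fields than shown here.
example = normalize_party({
    "contractPermissions": "FILL_FIELDS_AND_SIGN",
    "partySequence": 1,
    "workflowSignSequence": 1,
    "emailId": "signer@example.com",
})
```

Normalizing at the webhook boundary means the rest of your persistence code only ever sees one set of field names.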

Webhook Integration: Seven Events, One That Matters Most for Archival

Register your callback URL in the eSign developer portal under Settings → Webhooks. The eSign API exposes seven webhook events covering the full folder lifecycle:

  • folder_sent, when the folder is dispatched to all parties

  • folder_viewed, when any party opens the folder (payload adds viewing_party)

  • folder_signed, when any party signs (payload adds signing_party)

  • folder_cancelled, when any party declines (payload adds cancelling_party and reason_for_cancelling)

  • folder_completed, once all required signatures are collected

  • folder_executed, which fires 5 to 10 seconds after folder_completed once the digital signature is applied to the completed PDF

  • folder_deleted, when the folder is removed (payload adds deleting_party)

Every callback POSTs a JSON body with three top-level keys (event_name, event_date in Unix milliseconds, and data). The data.folder object carries the full folder context including folderId, folderName, folderStatus, folderDocumentIds, documentsList, folderRecipientParties, and bulkId.

Hook folder_signed for per-party CRM and onboarding status updates. Hook folder_executed (not folder_completed) for archival triggers. The distinction matters, because folder_completed fires when all signatures are in but before the digital signature has been applied to the PDF, whereas folder_executed guarantees the downloaded PDF is the final, digitally signed document.
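The routing rule above can be expressed as a small pure function. This is a minimal sketch: route_event and the action labels are hypothetical names, and the sample payload contains only the keys the function reads, with a made-up folderId.

```python
from datetime import datetime, timezone


def route_event(payload: dict) -> dict:
    """Decide what to do with a webhook payload and return a plain description."""
    event_name = payload["event_name"]
    # event_date is Unix time in milliseconds
    occurred_at = datetime.fromtimestamp(payload["event_date"] / 1000, tz=timezone.utc)
    folder_id = payload["data"]["folder"]["folderId"]

    if event_name == "folder_signed":
        action = "update_onboarding_status"
    elif event_name == "folder_executed":
        action = "archive_signed_pdf"
    else:
        action = "ignore"  # includes folder_completed: the PDF is not final yet
    return {"action": action, "folder_id": folder_id, "occurred_at": occurred_at}


sample = {
    "event_name": "folder_executed",
    "event_date": 1735689600000,  # 2025-01-01T00:00:00Z in milliseconds
    "data": {"folder": {"folderId": 12345}},
}
decision = route_event(sample)
```

Keeping the decision separate from the side effects (CRM update, archival download) makes the routing testable with plain dictionaries.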

Webhook Security: HMAC on the Raw Body

Configure a Webhook Secret in the eSign API settings page. Foxit eSign computes an HMAC-SHA-256 digest of the raw HTTP request body using that secret, base64 encodes it, and appends it to your callback URL as a signature query parameter (for example, https://your-app.example.com/webhook?signature=XXXXXXXXXXXX).

Verify against the raw body bytes, not against the JSON-decoded and re-serialised payload. Any whitespace or key-ordering difference between parse and re-serialize will break the comparison.

import base64
import hashlib
import hmac
import os

WEBHOOK_SECRET = os.environ["WEBHOOK_SECRET"]

def verify_webhook_signature(raw_body: bytes, signature_param: str) -> bool:
    computed = base64.b64encode(
        hmac.new(WEBHOOK_SECRET.encode(), raw_body, hashlib.sha256).digest()
    ).decode()
    return hmac.compare_digest(computed, signature_param)

# In a FastAPI handler (Flask exposes the same data as request.get_data() and request.args):
# raw_body = await request.body()  # capture before any parsing
# sig = request.query_params["signature"]
# if not verify_webhook_signature(raw_body, sig):
#     return Response(status_code=403)

Rotate the Webhook Secret through the API settings page immediately if it leaks.
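To see why the raw bytes matter, the following self-contained sketch signs a body, then checks both the original bytes and a parsed-and-re-serialised copy. The secret and body are placeholder values for illustration; sign is a local helper that mirrors the digest construction described above.

```python
import base64
import hashlib
import hmac
import json


def sign(secret: bytes, body: bytes) -> str:
    """Base64-encoded HMAC-SHA-256 of body, as described for the signature parameter."""
    return base64.b64encode(hmac.new(secret, body, hashlib.sha256).digest()).decode()


secret = b"demo-secret"  # placeholder value for illustration only
# Note the double space after the first comma: valid JSON, unusual whitespace.
raw_body = b'{"event_name": "folder_signed",  "event_date": 1735689600000}'
signature = sign(secret, raw_body)

# Parsing and re-serialising normalises whitespace, so the bytes differ.
reserialized = json.dumps(json.loads(raw_body)).encode()

raw_ok = hmac.compare_digest(sign(secret, raw_body), signature)
reserialized_ok = hmac.compare_digest(sign(secret, reserialized), signature)
```

The raw body verifies and the re-serialised copy does not, even though both decode to the same JSON object.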

Bulk Dispatch for High-Volume Onboarding Campaigns

For campaigns where you’re sending the same agreement to a large group, set partyIsEmailGroup: true on the relevant party in your createfolder body and provide an emailGroupId (the ID of the email group configured in your eSign account). When partyIsEmailGroup is true, the firstName, lastName, and emailId fields on that party are ignored, and the group definition drives the recipients. Set allowSingleSignerInBulk: true if only one member of the group needs to sign to complete the folder. The folder response includes a bulkId field that identifies the bulk run (0 for non-bulk sends).

For onboarding flows that require one personalised folder per recipient (different client data, different document content), parallelise calls to /folders/createfolder from a worker pool sized to your credit budget. No single-call endpoint generates thousands of personalised folders in one shot.

4. Extracting Audit-Ready Data for Financial Document Workflows

Server-Side Text and Table Extraction with the PDF SDK

The Foxit PDF SDK handles programmatic text and table extraction server-side, which eliminates the manual export step that breaks most audit pipelines. Install the Python wrapper from PyPI for the compiled runtime binary (arm64 macOS, x64 Linux, and x64 Windows wheels are published):

pip install FoxitPDFSDKPython3

The PyPI wheel ships the runtime binary only. To obtain the serial number and license key that Library.Initialize requires, request a trial through the Foxit Developer Hub and read gsdk_sn.txt and gsdk_key.txt from the unpacked archive. Foxit evaluation licenses are bound to the SDK distribution they ship alongside, so request the trial through the route you actually plan to run. The PyPI wheel needs a license issued for it, not one extracted from the legacy x86_64 macOS download (that bundle is x86_64-only and was built against Python 2, which is incompatible with both arm64 Apple Silicon hosts and modern Python 3 runtimes).

The SDK supports text extraction at the page level and table detection from structured content, which is useful for pulling portfolio positions, transaction histories, and account summaries into JSON before writing to an audit package or data warehouse.

PDF/A Conversion: Convert, Then Verify, Then Reject on Failure

Generated and signed documents need to pass through the Foxit PDF SDK PDF/A Compliance add-on before archiving. The Python binding flattens the underlying C++ namespaces, so the PDFACompliance class lives directly on the top-level FoxitPDFSDKPython3 module (not under a nested addon submodule). It exposes two main methods, namely ConvertPDFFile() to produce a compliant output and Verify() to check an existing PDF against a target version. Initialize the SDK once at process start via foxit.Library.Initialize(sn, key) (the call returns an int error code, not an exception), then boot the ComplianceEngine before instantiating PDFACompliance.

Pull the values for FOXIT_SDK_SN and FOXIT_SDK_KEY from the SDK trial download referenced in the prerequisites. gsdk_sn.txt ships as a single line of the form SN=<value>, and the literal SN= prefix must be stripped before the value is exported (passing the whole line yields e_ErrInvalidLicense from a key that would otherwise work). gsdk_key.txt is an INI-style file starting with [Foxit SDK License], and the value passed to Library.Initialize is only the contents of the Sign= line (the long base64 blob), not the full INI file. Passing the full file yields e_ErrInvalidLicense even with the correct SN. ComplianceEngine.Initialize takes the path to the compliance resource folder as its first argument, and the second argument is the engine unlock code (an empty string for trial keys). The resource folder is the res/ directory inside the SDK trial download; point FOXIT_COMPLIANCE_RESOURCE_FOLDER at that path.
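The two extraction rules above are easy to get wrong by hand, so a small parser can help. This is a sketch based only on the file shapes described in this section: parse_sn and parse_key are hypothetical helpers, and the sample file contents are made-up placeholders, not real license data.

```python
def parse_sn(sn_file_text: str) -> str:
    """Extract the value from the 'SN=<value>' line of gsdk_sn.txt."""
    for line in sn_file_text.splitlines():
        line = line.strip()
        if line.startswith("SN="):
            return line[len("SN="):]
    raise ValueError("no SN= line found")


def parse_key(key_file_text: str) -> str:
    """Extract the value of the 'Sign=' line from the INI-style gsdk_key.txt."""
    for line in key_file_text.splitlines():
        line = line.strip()
        if line.startswith("Sign="):
            # Slice by prefix rather than split("="): base64 padding contains "="
            return line[len("Sign="):]
    raise ValueError("no Sign= line found")


# Placeholder contents for illustration only.
sn_value = parse_sn("SN=EXAMPLE-SERIAL-0000\n")
key_value = parse_key("[Foxit SDK License]\nSign=QUJDREVGRw==\n")
```

The resulting values are what you would export as FOXIT_SDK_SN and FOXIT_SDK_KEY.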

The version enums are direct class attributes (no PDFACompliance.Version sub-namespace). Supported versions span PDF/A-1 through PDF/A-3, including e_VersionPDFA1a, e_VersionPDFA1b, e_VersionPDFA2a, e_VersionPDFA2b, e_VersionPDFA2u, e_VersionPDFA3a, e_VersionPDFA3b, and e_VersionPDFA3u, aligned with ISO 19005-1, 19005-2, and 19005-3. PDF/A-4 (ISO 19005-4) is not in the enum and is not supported.

import os
import logging
import FoxitPDFSDKPython3 as foxit

logger = logging.getLogger(__name__)

SDK_SN = os.environ["FOXIT_SDK_SN"]
SDK_KEY = os.environ["FOXIT_SDK_KEY"]
COMPLIANCE_RESOURCE_FOLDER = os.environ["FOXIT_COMPLIANCE_RESOURCE_FOLDER"]

err = foxit.Library.Initialize(SDK_SN, SDK_KEY)
if err != foxit.e_ErrSuccess:
    raise RuntimeError(f"Foxit SDK Library.Initialize failed with code {err}")

# Second argument is the engine unlock code; trial keys use the empty string.
err = foxit.ComplianceEngine.Initialize(COMPLIANCE_RESOURCE_FOLDER, "")
if err != foxit.e_ErrSuccess:
    raise RuntimeError(f"Foxit ComplianceEngine.Initialize failed with code {err}")

def archive_as_pdfa(src_path: str, dest_path: str) -> bool:
    pdfa = foxit.PDFACompliance()

    # Convert to PDF/A-2b (ISO 19005-2, Level B)
    convert_result = pdfa.ConvertPDFFile(
        src_path,
        dest_path,
        foxit.PDFACompliance.e_VersionPDFA2b,
        None,
    )
    if not convert_result.IsEmpty():
        # Non-empty ResultInformation means the conversion surfaced
        # unresolved compliance issues.
        logger.error(
            "PDF/A conversion left %d unresolved issues for %s",
            convert_result.GetHitDataCount(),
            src_path,
        )
        return False

    # Verify the output, since silent failure on non-compliant input is the SDK default.
    # Signature: Verify(version_enum, file_path, start_page, end_page, progress_callback).
    verify_result = pdfa.Verify(
        foxit.PDFACompliance.e_VersionPDFA2b,
        dest_path,
        0,
        -1,
        None,
    )
    if not verify_result.IsEmpty():
        logger.error(
            "PDF/A verification found %d violations in %s",
            verify_result.GetHitDataCount(),
            dest_path,
        )
        return False

    return True

The code initializes the SDK with Library.Initialize (the v11.1.0 Python binding’s entry point), boots the ComplianceEngine, converts the source PDF to PDF/A-2b, and verifies the output. Both ConvertPDFFile() and Verify() return a ResultInformation object, and an empty result (IsEmpty() == True) indicates a clean run with no unresolved compliance issues. Log non-empty results explicitly with the hit count from GetHitDataCount(), because a non-compliant PDF passed silently to your archive is exactly the gap that surfaces during an SEC or FINRA examination.

PII Scrubbing Before Audit Delivery

Route documents through Foxit Smart Redact Server as a pipeline stage before external delivery or third-party audit access. AI-assisted detection identifies SSNs, account numbers, credit card numbers, names, emails, and phone numbers across 47+ supported file types, including PDF, Word, Excel, HTML, JSON, and XML.

Smart Redact Server protects documents with AES-256 encryption at rest and SSL 2048-bit encryption in transit. It operates under a zero data retention policy, so originals and intermediate files are deleted after processing. The Smart Redact Security and Privacy page documents one nuance, where sensitive findings may be stored in encrypted form for follow-up review actions even though the source document itself is not retained. Wire Smart Redact as the final stage before delivery, not as an afterthought applied to a subset of documents.

5. Compliance Controls You Can Actually Audit

Audit Trail Coverage

The eSign API captures a timestamped event log for every signing action, including who signed, when, from what IP address, and with what authentication method. These events are accessible programmatically through the API, not just through the portal UI. That means you can pipe signing events directly into your SIEM or compliance reporting system without manual exports. The folder_signed webhook event delivers enough detail per party to satisfy most per-transaction audit requirements.

Encryption and Access Control

Every Foxit API tier in this pipeline shares the same encryption posture, with TLS 1.2 or higher in transit and AES-256 at rest. The Foxit API Security and Compliance page documents this posture for both the eSign API and the PDF Services/Embed APIs (which covers the DocGen endpoints used in Section 2), alongside SOC 2 Type II certification, segmented customer data at rest, and HIPAA BAA availability.

OAuth2 scopes control which services can read, write, or execute at each pipeline stage. Structure scope grants to enforce least-privilege access between your generation, signing, and archiving services. A job that only reads signed documents shouldn’t hold a read-write token for the generation endpoint.

Retention and Purge

The eSign API supports applying retention rules and triggering purges of outdated records programmatically. For FINRA Rule 4511 and SEC Rule 17a-4 compliance, you need both provable retention (records preserved for the required period) and provable deletion (records removed at end of retention). An API-driven purge that produces a deletion receipt is far easier to defend in an examination than a manual deletion from a portal UI.

6. Start with the Statement Generation Pipeline Today

The fastest way to harden your financial document workflows is to wire the statement generation stage first and let the rest of the pipeline follow. Create a free developer account at https://account.foxit.com/site/sign-up; no credit card is required and activation is instant. Retrieve your client_id and client_secret from the Foxit Developer Portal.

Download the Postman collection from the developer portal, load the AnalyzeDocumentBase64 request, attach quarterly_statement.docx, and fire it. The response lists every detected placeholder, so you can confirm tag names match your data schema before writing a single line of integration code.

Take the tag list from the Analyze response, construct a minimal JSON payload with one client record, POST it to GenerateDocumentBase64, and verify the PDF output renders locally. Once that loop closes, you have a working proof of concept for the statement generation stage, and the rest of the pipeline follows the same pattern.

Appendix: Common Mistakes

DocGen and eSign use different auth models. DocGen takes client_id and client_secret as headers on every request. eSign requires the OAuth2 client_credentials exchange against /api/oauth2/access_token first, then a bearer token on subsequent calls. They’re different accounts in different portals.

Word’s autocorrect splits tags typed character-by-character across runs. This makes the placeholder unparseable and renders a blank field. Always paste tags in as plain text, then verify with Show/Hide formatting marks (¶).

Smart quotes inside format strings render blank. Disable smart quotes in AutoCorrect Options before authoring the template. A format tag like {{portfolio_value # "$#,##0.00"}} breaks silently if Word replaces the double quotes with curly equivalents.

TableStart and TableEnd for the same array must sit in cells of the same row of the same Word table. Different rows or different tables produce a 400 or a silent blank.

HTTP 500 with the body Document file contents cannot be larger than 4 MB means the base64-encoded .docx exceeded the DocGen 4 MB cap. Slim the template per the Section 2 guidance.

Webhook HMAC verification must run on the raw body bytes, not the parsed JSON. Whitespace normalisation or key-ordering changes between receipt and re-serialisation break the comparison. Capture request.body() before any parsing.

folder_executed is the correct archival hook, not folder_completed. folder_completed fires when all required signatures are collected, but the digital signature hasn’t been applied to the PDF yet. folder_executed fires 5 to 10 seconds later once the digital signature is embedded, and that’s the download-ready version.

Do not claim PDF/A-4 support. The PDFACompliance version enum covers PDF/A-1 (a, b), PDF/A-2 (a, b, u), and PDF/A-3 (a, b, u) only, aligned with ISO 19005-1/2/3. Always branch on the ResultInformation return value, since the SDK does not raise on non-compliant input.

Financial Services Document Automation FAQ

Document automation for financial services is the use of APIs to programmatically generate, sign, and archive client-facing documents like quarterly statements, account agreements, and compliance disclosures without manual intervention. Rather than stitching together separate tools for each stage, a unified API pipeline handles the full lifecycle from template rendering through audit-ready archival, reducing vendor fragmentation and closing compliance control gaps.

The Foxit DocGen API accepts a Word template with double-bracket tags (e.g. {{client_name}}, {{TableStart:holdings}}) and a JSON data payload, then returns a rendered PDF in the same synchronous HTTP response with no polling required. You call AnalyzeDocumentBase64 first to validate every template tag against your data schema, then GenerateDocumentBase64 with your client record to produce the statement. The API enforces a 4 MB cap on the base64-encoded template payload.

folder_completed fires once all required signatures are collected, but before the digital signature has been applied to the PDF. folder_executed fires 5 to 10 seconds later, once the digital signature is embedded and the document is finalized. For archival triggers in financial workflows where you need the legally binding, digitally signed version, folder_executed is the correct webhook event to hook, not folder_completed.

Foxit eSign computes an HMAC-SHA-256 digest of the raw HTTP request body using your configured Webhook Secret, base64-encodes it, and appends it to your callback URL as a signature query parameter. Verification must run against the raw body bytes before any JSON parsing, because whitespace normalization or key-reordering during parse and re-serialization will break the comparison. Use hmac.compare_digest() for a timing-safe check.

The Foxit PDF SDK PDFACompliance class supports PDF/A-1 (a, b), PDF/A-2 (a, b, u), and PDF/A-3 (a, b, u), aligned with ISO 19005-1, 19005-2, and 19005-3. PDF/A-4 is not supported. For SEC Rule 17a-4 and FINRA Rule 4511 compliance, PDF/A-2b is the recommended target. Always call Verify() after ConvertPDFFile() because the SDK does not raise an exception on non-compliant input; it returns a ResultInformation object you must explicitly check.

DOCX to PDF via the Foxit PDF Services API: Python and cURL Walkthrough

Foxit DOCX to PDF API four-step conversion flow in Python and cURL.

This walkthrough covers the full DOCX-to-PDF flow on the Foxit PDF Services API with runnable Python and cURL for each call.

Automated document pipelines demand conversion tooling that accepts a file, queues a job, and returns clear status at every step. The Foxit PDF Services API gives you exactly that: a four-endpoint async flow covering upload, convert, poll, and download. Each step returns a typed payload, the task model exposes four explicit states with a numeric progress field, and error codes map cleanly to distinct recovery paths.

This tutorial walks through every step of that flow in Python 3 with the requests library, plus cURL equivalents for each call. You’ll have a runnable convert.py script you can drop into a pipeline today.

Prerequisites

Before you run a single line of this tutorial, get the following in place. Each item links to its canonical install or setup guide.

  • Python 3.8 or newer: verify with python3 --version. The script uses only standard library modules plus one external package, so any modern 3.x will do.
  • pip: bundled with Python 3.4+. Verify with python3 -m pip --version.
  • A virtual environment: isolates project dependencies so they don’t collide with system Python or other projects. See the venv tutorial for platform-specific activation commands.
  • The requests library: the only third-party dependency in this walkthrough. Installed inside the venv below.
  • A code editor: Visual Studio Code with the Python extension is a solid default, but PyCharm, Sublime Text, or any editor you like will work.
  • cURL: pre-installed on macOS and most Linux distros. Windows users can install from the official site or use WSL.
  • A Foxit Developer account: register for free (no credit card required). The Foxit Developer Portal provisions a default application with your CLIENT_ID and CLIENT_SECRET immediately after signup.

Set up the project workspace:

mkdir foxit-docx-to-pdf && cd foxit-docx-to-pdf
python3 -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install requests

Export your credentials as environment variables so the script never sees them as hardcoded strings:

export CLIENT_ID=your_client_id_here
export CLIENT_SECRET=your_client_secret_here

Your Python script reads them with os.environ.get():

import os

CLIENT_ID = os.environ.get("CLIENT_ID")
CLIENT_SECRET = os.environ.get("CLIENT_SECRET")

BASE_URL = "https://na1.fusion.foxit.com"

All four API calls go to https://na1.fusion.foxit.com. The developer portal also offers a live sandbox and pre-built Postman collections if you want to verify calls in a GUI before scripting.

For a sample DOCX to work with right away, download input.docx directly from the foxitsoftware/developerapidemos GitHub repository and save it to your working directory.

How the Auth Model Works

The Foxit PDF Services API authenticates through named request headers. Pass client_id and client_secret directly on every call:

headers = {
    "client_id": CLIENT_ID,
    "client_secret": CLIENT_SECRET,
}

That single dict covers upload (multipart POST), polling (GET), and download (GET). The convert endpoint takes a JSON body, so it requires Content-Type: application/json as well:

json_headers = {
    "client_id": CLIENT_ID,
    "client_secret": CLIENT_SECRET,
    "Content-Type": "application/json",
}

The API expects the raw key/secret pair in those named headers. Wrapping credentials in an Authorization: Bearer header instead returns 400, since the required client_id and client_secret headers are missing.

Step 1 and Step 2: Upload the DOCX and Initiate Conversion

Step 1: Upload the DOCX File

POST /pdf-services/api/documents/upload accepts the file as multipart/form-data and returns a documentId that every subsequent call needs.

import requests

def upload_doc(file_path: str) -> str:
    url = f"{BASE_URL}/pdf-services/api/documents/upload"
    headers = {
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    }
    with open(file_path, "rb") as f:
        files = {"file": (os.path.basename(file_path), f)}
        response = requests.post(url, headers=headers, files=files)
    response.raise_for_status()
    return response.json()["documentId"]

cURL equivalent:

curl -X POST "https://na1.fusion.foxit.com/pdf-services/api/documents/upload" \
  -H "client_id: $CLIENT_ID" \
  -H "client_secret: $CLIENT_SECRET" \
  -F "file=@input.docx"

Uploaded files carry a 100 MB cap and are automatically deleted after 24 hours. A documentId scopes to the current upload session and expires with the source file, so treat it as ephemeral.
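The size cap can be checked locally before the upload is attempted. The sketch below is a hypothetical helper: check_upload_size is an illustrative name, and treating 100 MB as 100 * 1024 * 1024 bytes is an assumption, since the source does not state the exact byte limit.

```python
import os
import tempfile

# Assumption: the 100 MB cap is interpreted as 100 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024


def check_upload_size(file_path: str, limit: int = MAX_UPLOAD_BYTES) -> int:
    """Return the file size in bytes, or raise ValueError if it exceeds the limit."""
    size = os.path.getsize(file_path)
    if size > limit:
        raise ValueError(
            f"{file_path} is {size:,} bytes; the upload cap is {limit:,} bytes"
        )
    return size


# Demonstration with a small temporary file and an artificially low limit.
with tempfile.NamedTemporaryFile(suffix=".docx", delete=False) as tmp:
    tmp.write(b"0" * 2048)
    demo_path = tmp.name

size_ok = check_upload_size(demo_path)
try:
    check_upload_size(demo_path, limit=1024)
    rejected = False
except ValueError:
    rejected = True
os.remove(demo_path)
```

Calling check_upload_size(file_path) at the top of upload_doc() fails fast on oversized inputs without spending a request.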

Step 2: Initiate the PDF Conversion

POST /pdf-services/api/documents/create/pdf-from-word accepts a JSON body with the documentId and returns a taskId. The API handles 10 to 10,000+ conversions per day across production pipelines. Jobs are queued asynchronously, so the connection is not held open while the PDF renders.

import json

def convert_to_pdf(document_id: str) -> str:
    url = f"{BASE_URL}/pdf-services/api/documents/create/pdf-from-word"
    headers = {
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "Content-Type": "application/json",
    }
    payload = {"documentId": document_id}
    response = requests.post(url, headers=headers, data=json.dumps(payload))
    response.raise_for_status()
    return response.json()["taskId"]

cURL equivalent:

curl -X POST "https://na1.fusion.foxit.com/pdf-services/api/documents/create/pdf-from-word" \
  -H "client_id: $CLIENT_ID" \
  -H "client_secret: $CLIENT_SECRET" \
  -H "Content-Type: application/json" \
  -d '{"documentId": "<your_document_id>"}'

The endpoint returns 202 Accepted, confirming the job is queued. It also accepts .doc, .rtf, .dot, .dotx, .docm, .dotm, and .wpd files through the same documentId input, so legacy Word formats work through the same pipeline.

Step 3: Polling the Task Status

GET /pdf-services/api/tasks/{task-id} returns four fields you need to act on in your polling loop:

  • status: one of PENDING, IN_PROGRESS, COMPLETED, or FAILED
  • progress: int32, 0 to 100
  • resultDocumentId: populated when status reaches COMPLETED
  • error: populated when status reaches FAILED

The task state machine advances in one direction: PENDING to IN_PROGRESS, then to either COMPLETED or FAILED.

Foxit DOCX to PDF API task state machine: PENDING to IN_PROGRESS, then COMPLETED with resultDocumentId or FAILED with an error object.

import time

def poll_task(task_id: str, max_attempts: int = 30) -> str:
    url = f"{BASE_URL}/pdf-services/api/tasks/{task_id}"
    headers = {
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    }
    for attempt in range(max_attempts):
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        data = response.json()
        status = data.get("status")
        progress = data.get("progress", 0)
        print(f"Attempt {attempt + 1}: status={status}, progress={progress}%")
        if status == "COMPLETED":
            return data["resultDocumentId"]
        if status == "FAILED":
            raise RuntimeError(f"Conversion failed: {data.get('error')}")
        time.sleep(2)
    raise TimeoutError(f"Task {task_id} did not complete in {max_attempts} attempts")

Two-second polling intervals work across a wide range of document sizes, and polling more aggressively only consumes rate limit budget without affecting conversion time.
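The state handling inside the polling loop can also be pulled out into a pure function, which makes it testable without network access. This is a sketch, not part of the Foxit client: interpret_task is a hypothetical name, and the sample responses contain made-up identifiers.

```python
from typing import Optional


def interpret_task(data: dict) -> Optional[str]:
    """Return resultDocumentId when COMPLETED, None while running; raise on FAILED."""
    status = data.get("status")
    if status == "COMPLETED":
        return data["resultDocumentId"]
    if status == "FAILED":
        raise RuntimeError(f"Conversion failed: {data.get('error')}")
    if status in ("PENDING", "IN_PROGRESS"):
        return None
    # Anything else is outside the four documented states
    raise ValueError(f"Unexpected task status: {status!r}")


# Illustrative responses with made-up values.
still_running = interpret_task({"status": "IN_PROGRESS", "progress": 40})
finished = interpret_task({"status": "COMPLETED", "progress": 100, "resultDocumentId": "doc-123"})
try:
    interpret_task({"status": "FAILED", "error": {"message": "task is not exist"}})
    failed_raised = False
except RuntimeError:
    failed_raised = True
```

A poll loop built on this function only needs to fetch the JSON, call interpret_task(), and sleep when the result is None.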

Step 4: Downloading the Converted PDF

GET /pdf-services/api/documents/{documentId}/download fetches the finished PDF. The path parameter in the API reference reads {documentId}, but the value you pass here is the resultDocumentId from the completed poll response. The server assigns that ID to the generated PDF output at conversion time, making it the correct identifier to use at this step.

Stream the response to disk with stream=True and iter_content(chunk_size=8192). Buffering a large PDF fully into memory before writing it causes problems on high-volume pipelines.

def download_result(result_document_id: str, output_path: str) -> None:
    url = f"{BASE_URL}/pdf-services/api/documents/{result_document_id}/download"
    headers = {
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    }
    with requests.get(url, headers=headers, stream=True) as response:
        response.raise_for_status()
        with open(output_path, "wb") as f:
            for chunk in response.iter_content(chunk_size=8192):
                if chunk:
                    f.write(chunk)

The cURL equivalent uses the --output flag to write directly to disk:

curl -X GET "https://na1.fusion.foxit.com/pdf-services/api/documents/<result_document_id>/download" \
  -H "client_id: $CLIENT_ID" \
  -H "client_secret: $CLIENT_SECRET" \
  --output output.pdf

To verify the output, check response.headers.get("Content-Type") for application/pdf, or inspect the first four bytes of the written file for the %PDF magic bytes if your pipeline requires format validation.
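The magic-byte check described above takes only a few lines. The helper name looks_like_pdf is illustrative, and the temporary files exist purely to demonstrate the behaviour.

```python
import os
import tempfile


def looks_like_pdf(path: str) -> bool:
    """True if the file starts with the four %PDF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == b"%PDF"


def _write_temp(content: bytes) -> str:
    """Write content to a temporary file and return its path (demo helper)."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(content)
        return tmp.name


pdf_path = _write_temp(b"%PDF-1.7\n% demo content")
html_path = _write_temp(b"<html>error page</html>")
pdf_check = looks_like_pdf(pdf_path)
html_check = looks_like_pdf(html_path)
os.remove(pdf_path)
os.remove(html_path)
```

This catches the case where an error page or JSON body was written to disk under a .pdf file name.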

Error Handling for Production

The Foxit PDF Services API documentation covers 400, 404, 413, and 500 across the four endpoints. A 401 also appears in practice on authentication failures, even though it is absent from the documented example responses. Each status code points to a specific root cause with a concrete recovery path:

  • 400: malformed request body or unsupported file type. Validate the input file path and extension before calling upload_doc().
  • 401: credential misconfiguration. Verify that CLIENT_ID and CLIENT_SECRET are exported in your shell and that the header names are lowercase client_id and client_secret.
  • 404: the documentId has expired. The server deletes uploaded files after 24 hours, so the convert and download endpoints return 404 for any documentId past that window. Re-upload the source file and restart from the upload step. An expired or unknown taskId on the poll endpoint behaves differently: it returns HTTP 200 with status: "FAILED" and an error object whose message reads "task is not exist". The poll loop’s FAILED branch already catches that case.
  • 413: file exceeds the 100 MB upload cap. Pre-check with os.path.getsize() before uploading, or split the document.
  • 500: transient server error. Apply exponential backoff with a ceiling of 3 retries (wait times of 1s, 2s, and 4s).

import time

import requests


def call_with_retry(fn, *args, max_retries: int = 3, **kwargs):
    # Call fn and translate known HTTP failures into specific exceptions.
    # Only a 500 is retried, with exponential backoff between attempts.
    for attempt in range(max_retries + 1):
        try:
            return fn(*args, **kwargs)
        except requests.HTTPError as e:
            code = e.response.status_code
            if code == 400:
                raise ValueError(
                    "Bad request. Confirm the input is a supported Word format."
                ) from e
            if code == 401:
                raise PermissionError(
                    "Authentication failed. Check CLIENT_ID and CLIENT_SECRET env vars."
                ) from e
            if code == 404:
                raise FileNotFoundError(
                    "Document not found or expired (24h TTL). Re-upload and retry."
                ) from e
            if code == 413:
                raise OverflowError(
                    "File too large. The upload cap is 100 MB."
                ) from e
            if code == 500 and attempt < max_retries:
                wait = 2 ** attempt  # 1s, 2s, 4s
                print(f"Server error. Retrying in {wait}s ({attempt + 1}/{max_retries})")
                time.sleep(wait)
                continue
            # Any other status, or a 500 after the final retry, propagates unchanged.
            raise

Pipeline authors should treat documentId values as ephemeral. Each one expires with its source file after 24 hours, so pipeline code that caches documentId values between sessions will see 404s on convert calls once that window has passed. Re-uploading is always the correct recovery path.
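One way to make that expiry explicit is to store the upload time next to each documentId and re-upload when the record is stale. The sketch below assumes the 24-hour window stated above; the UploadedDocument class, the ensure_document helper, and the five-minute safety margin are illustrative choices rather than anything the API provides.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional

DOCUMENT_TTL_SECONDS = 24 * 60 * 60  # 24-hour retention window described above
SAFETY_MARGIN_SECONDS = 5 * 60       # assumption: treat IDs as stale slightly early


@dataclass
class UploadedDocument:
    document_id: str
    uploaded_at: float = field(default_factory=time.time)

    def is_fresh(self, now: Optional[float] = None) -> bool:
        """True while the ID is still safely inside the retention window."""
        now = time.time() if now is None else now
        return (now - self.uploaded_at) < (DOCUMENT_TTL_SECONDS - SAFETY_MARGIN_SECONDS)


def ensure_document(
    cached: Optional[UploadedDocument],
    file_path: str,
    upload: Callable[[str], str],
    now: Optional[float] = None,
) -> UploadedDocument:
    """Reuse a fresh cached ID, otherwise upload the file again."""
    if cached is not None and cached.is_fresh(now):
        return cached
    return UploadedDocument(document_id=upload(file_path))
```

Passing upload_doc as the upload argument keeps the network call in one place, so the freshness logic can be tested without touching the API.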

The Complete Script

Set your environment variables, then run python convert.py input.docx output.pdf:

import os
import json
import time
import sys
import requests

CLIENT_ID = os.environ.get("CLIENT_ID")
CLIENT_SECRET = os.environ.get("CLIENT_SECRET")
BASE_URL = "https://na1.fusion.foxit.com"


def call_with_retry(fn, *args, max_retries: int = 3, **kwargs):
    for attempt in range(max_retries + 1):
        try:
            return fn(*args, **kwargs)
        except requests.HTTPError as e:
            code = e.response.status_code
            if code == 400:
                raise ValueError(
                    "Bad request. Confirm the input is a supported Word format."
                ) from e
            if code == 401:
                raise PermissionError(
                    "Authentication failed. Check CLIENT_ID and CLIENT_SECRET env vars."
                ) from e
            if code == 404:
                raise FileNotFoundError(
                    "Document not found or expired (24h TTL). Re-upload and retry."
                ) from e
            if code == 413:
                raise OverflowError(
                    "File too large. The upload cap is 100 MB."
                ) from e
            if code == 500 and attempt < max_retries:
                wait = 2 ** attempt  # 1s, 2s, 4s
                print(f"Server error. Retrying in {wait}s ({attempt + 1}/{max_retries})")
                time.sleep(wait)
                continue
            raise


def upload_doc(file_path: str) -> str:
    url = f"{BASE_URL}/pdf-services/api/documents/upload"
    headers = {"client_id": CLIENT_ID, "client_secret": CLIENT_SECRET}
    with open(file_path, "rb") as f:
        files = {"file": (os.path.basename(file_path), f)}
        r = requests.post(url, headers=headers, files=files)
    r.raise_for_status()
    return r.json()["documentId"]


def convert_to_pdf(document_id: str) -> str:
    url = f"{BASE_URL}/pdf-services/api/documents/create/pdf-from-word"
    headers = {
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "Content-Type": "application/json",
    }
    r = requests.post(url, headers=headers, data=json.dumps({"documentId": document_id}))
    r.raise_for_status()
    return r.json()["taskId"]


def poll_task(task_id: str, max_attempts: int = 30) -> str:
    url = f"{BASE_URL}/pdf-services/api/tasks/{task_id}"
    headers = {"client_id": CLIENT_ID, "client_secret": CLIENT_SECRET}
    for attempt in range(max_attempts):
        r = requests.get(url, headers=headers)
        r.raise_for_status()
        data = r.json()
        status = data.get("status")
        print(f"[{attempt + 1}/{max_attempts}] status={status}, progress={data.get('progress', 0)}%")
        if status == "COMPLETED":
            return data["resultDocumentId"]
        if status == "FAILED":
            raise RuntimeError(f"Conversion failed: {data.get('error')}")
        time.sleep(2)
    raise TimeoutError(f"Task {task_id} did not complete after {max_attempts} attempts")


def download_result(result_document_id: str, output_path: str) -> None:
    url = f"{BASE_URL}/pdf-services/api/documents/{result_document_id}/download"
    headers = {"client_id": CLIENT_ID, "client_secret": CLIENT_SECRET}
    with requests.get(url, headers=headers, stream=True) as r:
        r.raise_for_status()
        with open(output_path, "wb") as f:
            for chunk in r.iter_content(chunk_size=8192):
                if chunk:
                    f.write(chunk)


def convert_docx_to_pdf(input_path: str, output_path: str) -> None:
    print(f"Uploading {input_path}...")
    document_id = call_with_retry(upload_doc, input_path)
    print(f"Uploaded. documentId={document_id}")

    print("Initiating conversion...")
    task_id = call_with_retry(convert_to_pdf, document_id)
    print(f"Queued. taskId={task_id}")

    print("Polling for completion...")
    result_document_id = call_with_retry(poll_task, task_id)
    print(f"Completed. resultDocumentId={result_document_id}")

    print(f"Downloading to {output_path}...")
    call_with_retry(download_result, result_document_id, output_path)
    print("Done.")


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python convert.py <input.docx> <output.pdf>")
        sys.exit(1)
    convert_docx_to_pdf(sys.argv[1], sys.argv[2])

The Foxit PDF Services API also supports merging, compression, linearization, and OCR through additional endpoints. All of them share the same host and header-based auth pattern, so the functions you’ve built here extend naturally as your pipeline grows.

Create your free Foxit developer account and run your first conversion in under five minutes, with no credit card required at signup.

DOCX to PDF API FAQ

Beyond .docx, the /pdf-services/api/documents/create/pdf-from-word endpoint also accepts .doc, .rtf, .dot, .dotx, .docm, .dotm, and .wpd. The same four-step flow applies to all of them.

Uploaded documents are automatically deleted after 24 hours. Treat documentId values as ephemeral and re-upload whenever you need to convert a file after that window.

Faster polling consumes rate limit budget without affecting conversion speed. The server determines conversion time based on document complexity and queue load, so polling intervals below 2 seconds add no throughput benefit.

Concurrent conversions are supported. Each upload returns an independent documentId and each conversion returns an independent taskId. Run concurrent conversions by launching multiple threads or async tasks, with each one tracking its own taskId. Python’s concurrent.futures.ThreadPoolExecutor is a straightforward way to manage this.
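A minimal sketch of that thread-pool approach follows. The convert_many helper is hypothetical; it takes the single-file conversion function as a parameter (for example convert_docx_to_pdf from the complete script) so the fan-out logic stays independent of the API calls. The default of four workers is an arbitrary starting point, not a documented limit.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Tuple


def convert_many(
    jobs: List[Tuple[str, str]],
    convert_one: Callable[[str, str], None],
    max_workers: int = 4,
) -> Dict[str, str]:
    """Run convert_one(input_path, output_path) for every job on a thread pool.

    Returns a dict mapping each input path to "ok" or to a failure message,
    so one failed file does not abort the rest of the batch.
    """
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(convert_one, src, dst): src for src, dst in jobs}
        for future in as_completed(futures):
            src = futures[future]
            try:
                future.result()  # re-raises any exception from the worker
                results[src] = "ok"
            except Exception as exc:
                results[src] = f"failed: {exc}"
    return results
```

Each worker still uploads, converts, polls, and downloads on its own, so every job tracks its own documentId and taskId as described above.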

Your client_id and client_secret come from the Foxit Developer Portal dashboard, under the default application created at signup. Both values are available immediately after account creation.

The API authenticates through named request headers. Pass client_id and client_secret directly on every request, and the server reads those credentials on each call.

Document Generation Explained: How the Template-to-API Pipeline Actually Works

Document generation pipeline showing a Word template and JSON payload merging through the Foxit DocGen API into a finished PDF.

Manual document workflows break down fast as volume grows. This guide explains what document generation is, how template-driven APIs replace manual processes, and what the pipeline looks like from a Word template and JSON payload to a finished PDF.

You’ve inherited a document workflow built on Word macros, save-as duplication, and a shared drive folder someone named “FINAL_v3.” Every time a contract needs to go out, someone opens the master template, manually replaces the client name and date, exports to PDF, and emails it. Scaled to one deal a week, that works. Scaled to a thousand deals a quarter, it breaks down in ways that are hard to trace and painful to fix.

Document generation APIs make the relationship between input data and output document deterministic. This article covers how that pipeline is structured, what the token contract between template and data looks like, and what the actual API call looks like from a POST request to a decoded file.

What Document Generation Is

Document generation is the programmatic production of populated, formatted documents from a template and a structured data source. Three components are always present: a template (the structure and placeholders), a data payload (the values to inject), and a rendering engine (the API that merges the two and produces the final file).

These components map cleanly onto a separation of concerns. The template owner controls layout, language, and branding. The data owner controls what goes in each field. The rendering engine enforces the merge contract between them. When that separation holds, changing the template doesn’t require a code change, and changing the data schema requires only a template update.

Document generation occupies a distinct category from the adjacent tools that crowd the same search space. E-signature platforms collect binding signatures on completed documents. Document management systems store, version, and retrieve files. OCR and extraction tools pull structured data out of existing documents. Document generation puts data in.

Why Manual Document Workflows Break Under Load

Manual Word-based workflows fail in three predictable ways when volume grows.

Version drift happens when templates live in shared folders across teams. One team updates the disclaimer text, another adds a new liability clause, and a third is still using a version from Q3. After six months, your organization has five variants of the same contract template producing inconsistent output, with no reliable way to identify which version generated any given document.

Merge errors compound at scale. Copy-paste and mail-merge workflows both require human coordination of field-by-field substitution. When an invoice ships with the previous client’s name, or a renewal letter shows last year’s rates, the error traces back to a single manual step that had no validation layer. A 0.5% error rate is invisible at 20 documents a month. At 4,000 documents a month, it means 20 wrong documents going out the door.

Audit trail absence creates compliance exposure. A manually assembled document carries no machine-readable record of what data produced it or when. When a regulator asks which policy documents were generated from the October 2024 rate table, the answer requires a manual search through email threads and file name timestamps.

Programmatic generation makes each of these problems tractable. The same JSON payload processed against the same template version always produces the same output, and every generation event is a logged API call with a traceable input and output. The document generation software market is valued at $4.05B in 2025 and projected to grow at a 9.2% CAGR through 2035, driven by enterprise automation and compliance requirements that manual workflows can’t satisfy at scale.

How Template-Driven Generation Works: Tokens, Loops, and Conditionals

Foxit’s DocGen API uses standard Microsoft Word as the template authoring environment. Template authors work in Word, inserting double-bracket placeholders whose key names map directly to keys in the JSON payload supplied at generation time. This keeps template ownership with the people who understand the document content, with no proprietary editor and no additional software license required.

The basic token syntax covers three common cases:

  • {{ companyName }} renders the string value of companyName from the payload
  • {{ invoiceDate \@ MM/dd/yyyy }} applies a Word date picture string to the raw date value (the leading \@ is required; the friendly form without it renders blank)
  • {{ totalDue \# "$#,##0.00" }} formats a numeric value as currency using a Word numeric picture string (a friendly keyword like \# Currency is unsupported and renders blank)
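To predict what the two picture strings above should produce, the same formatting can be reproduced locally and compared against the generated document. This is a sketch that assumes the engine follows Word's picture-string semantics for MM/dd/yyyy and "$#,##0.00"; the helper names are illustrative, and the currency helper only covers non-negative amounts.

```python
from datetime import date


def expected_date(iso_value: str) -> str:
    """Local equivalent of the MM/dd/yyyy date picture string."""
    return date.fromisoformat(iso_value).strftime("%m/%d/%Y")


def expected_currency(value: float) -> str:
    """Local equivalent of "$#,##0.00" for non-negative amounts."""
    return f"${value:,.2f}"


print(expected_date("2025-06-15"))   # 06/15/2025
print(expected_currency(8250.0))     # $8,250.00
```

Values computed this way are handy as expected strings in an automated check of the rendered output.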

A minimal JSON payload for a template containing those three tokens would be:

{
  "companyName": "Meridian Analytics",
  "invoiceDate": "2025-06-15",
  "totalDue": 8250.0
}

The rendering engine walks the template, matches each token to the corresponding key in the payload, and writes the formatted value into the output. Template authors work entirely in Word, while the engine handles token resolution, format application, and output assembly.
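Because every token must resolve to a payload key, a quick local check for missing keys can save a round trip. The sketch below works on plain text copied out of a template and only recognizes simple {{ key }} tokens with an optional \@ or \# switch. It is an illustration of the token-to-key contract, not how the rendering engine parses a DOCX, and the function names are hypothetical.

```python
import re
from typing import Any, Dict, List, Set

# Matches {{ key }}, {{ key \@ ... }}, and {{ key \# ... }}.
# Loop delimiters such as {{TableStart:lineItems}} and formula fields
# such as {{=SUM(ABOVE)}} do not match and are ignored on purpose.
TOKEN_PATTERN = re.compile(
    r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*(?:\\[@#][^}]*)?\}\}"
)

# Built-in tokens that the payload is not expected to supply.
RESERVED = {"ROW_NUMBER"}


def token_keys(template_text: str) -> Set[str]:
    """Collect the payload keys referenced by simple tokens in the text."""
    return set(TOKEN_PATTERN.findall(template_text)) - RESERVED


def missing_keys(template_text: str, payload: Dict[str, Any]) -> List[str]:
    """Return token keys that have no matching key in the payload."""
    return sorted(token_keys(template_text) - set(payload))
```

An empty list from missing_keys() means every simple token in the text has a value waiting for it in the payload.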

Repeating sections use loop delimiters to produce table rows that repeat for each element in a JSON array. Placing {{TableStart:lineItems}} before a table row and {{TableEnd:lineItems}} after it tells the engine to emit one row per object in the lineItems array. Both delimiters must sit in cells of the same Word table row. Inside that loop, {{ROW_NUMBER}} auto-increments across rows, and a footer row immediately below the loop can use {{=SUM(ABOVE) \# "$#,##0.00"}} to compute and format a column total, so a ten-line invoice produces a correctly numbered, fully summed table with no post-processing.

Conditional content uses Word’s native Field Code View (opened with ALT + F9) to write IF-field conditions that show or hide text blocks based on data values. A clause that should appear only when contractType equals "enterprise" lives inside a field condition, and the rendering engine evaluates it at generation time. There’s no separate scripting layer and no custom expression language to learn.

The three components converge at a single API endpoint: your base64-encoded DOCX template and JSON data payload go in together, and the generated document comes back in the same HTTP response.

Document generation pipeline diagram showing a Word template with tokens and a JSON data payload feeding a POST to GenerateDocumentBase64, which passes through the Foxit DocGen rendering engine and returns a synchronous JSON response decoded into a final PDF or DOCX file.

The API Pipeline: From POST Request to Final Document

The Foxit DocGen API compresses the generation pipeline into a single synchronous call. You POST to one endpoint with your template and data, and you receive the generated document in the same HTTP response, with no separate template upload step, no job ID to poll, and no webhook to configure for individual document generation.

Before building a generation workflow, you can use the Analyze Document API to scan a DOCX template and return a list of all embedded tokens, which confirms the token-to-key mapping before you commit to a data schema. That’s a single POST to a separate endpoint on the same host, and it returns a structured list of placeholder names and their types.

For the generation call itself, the dev-tier endpoint is:

POST https://na1.fusion.foxit.com/document-generation/api/GenerateDocumentBase64

Authentication passes your client_id and client_secret as custom HTTP headers alongside Content-Type: application/json. You retrieve both credentials from the dashboard at account.foxit.com/site/sign-up after activating your free developer plan, with no OAuth exchange and no session setup required before your first call.

The request body takes three fields:

  • base64FileString is your DOCX template, base64-encoded. Keep the source .docx under 4 MB (the practical ceiling for a single request) since base64 encoding inflates the payload by roughly 33%. If a template runs large, embedded images are usually the cause, so compress them through Word’s Picture Format settings before exporting.
  • documentValues is the JSON object whose keys map to token names in the template
  • outputFormat is the string "pdf" or "docx", lowercase and exact (the API returns HTTP 500 for any other value, including "PDF" or "DOCX")
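The three fields above can be assembled and validated before anything goes over the wire. This sketch uses only the field names and constraints listed above; the build_generate_body helper is hypothetical, and the size check applies the 4 MB practical ceiling to the raw template bytes.

```python
import base64
import json
from typing import Any, Dict

MAX_TEMPLATE_BYTES = 4 * 1024 * 1024  # practical ceiling for the source .docx
ALLOWED_FORMATS = ("pdf", "docx")      # lowercase and exact


def build_generate_body(
    template_bytes: bytes, values: Dict[str, Any], output_format: str
) -> str:
    """Return the JSON request body for a generation call."""
    if len(template_bytes) >= MAX_TEMPLATE_BYTES:
        raise ValueError("Template is 4 MB or larger; compress embedded images first.")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError('outputFormat must be exactly "pdf" or "docx".')
    body = {
        "base64FileString": base64.b64encode(template_bytes).decode("ascii"),
        "documentValues": values,
        "outputFormat": output_format,
    }
    return json.dumps(body)
```

Rejecting "PDF" or "DOCX" locally turns what would be an HTTP 500 into an immediate, readable error.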

For an invoice template with a lineItems array, the full curl command carries your base64-encoded DOCX, the matching data object, and your credentials in the request headers:

curl -X POST "https://na1.fusion.foxit.com/document-generation/api/GenerateDocumentBase64" \
  -H "client_id: YOUR_CLIENT_ID" \
  -H "client_secret: YOUR_CLIENT_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "base64FileString": "BASE64_ENCODED_DOCX_HERE",
    "documentValues": {
      "companyName": "Meridian Analytics",
      "invoiceDate": "2025-06-15",
      "totalDue": 8250.00,
      "lineItems": [
        { "description": "Platform License", "quantity": 1, "unitPrice": 8250.00 }
      ]
    },
    "outputFormat": "pdf"
  }'

The API returns a synchronous JSON response carrying a human-readable message, the fileExtension for naming the output file, and the base64FileString containing the generated document:

{
  "message": "PDF Document Generated Successfully",
  "fileExtension": "pdf",
  "base64FileString": "JVBERi0xLjQK..."
}

message gives you a status string for logging. fileExtension tells you whether you received "pdf" or "docx", which lets you construct the output filename programmatically without parsing the message string. base64FileString is the generated document. Your application decodes that value and routes the resulting bytes to storage, email delivery, a document management system, or whatever downstream step your workflow requires.
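Decoding the response and naming the file from fileExtension might look like the sketch below. It relies only on the two response fields described above; the save_generated_document helper and its stem parameter are illustrative.

```python
import base64
from pathlib import Path
from typing import Any, Dict


def save_generated_document(
    response_json: Dict[str, Any], output_dir: str, stem: str = "generated"
) -> Path:
    """Decode base64FileString and write it as <stem>.<fileExtension>."""
    extension = response_json["fileExtension"]  # "pdf" or "docx"
    document_bytes = base64.b64decode(response_json["base64FileString"])
    target = Path(output_dir) / f"{stem}.{extension}"
    target.write_bytes(document_bytes)
    return target
```

The returned path can then be handed to whatever storage or delivery step comes next.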

For teams evaluating the API before writing integration code, the Foxit developer portal includes a Postman collection with GenerateDocumentBase64 preconfigured. You can load the collection, paste your credentials and a test template, send the request, and confirm the response structure before writing a single line of application code. Foxit also provides SDKs for Node.js, Python, Java, C#, and PHP if you prefer a language-native integration path over raw HTTP.

Where Organizations Are Actually Using This

Five industries where the template-to-API pipeline produces direct operational value:

Insurance carriers face renewal cycles that generate thousands of policy documents each quarter. They pull policyholder data from their CRM, pass it as a JSON payload to the generation API, and produce populated renewal packets without manual assembly. Each document reflects current policy terms from the system of record, cutting the per-document preparation time from minutes of human work to milliseconds of API latency.

Healthcare providers need patient intake packets and HIPAA disclosure forms ready before each appointment. Clinics pull patient demographic and consent data from their EHR system, generate the packet at appointment scheduling time, and deliver it to the patient portal. The data source is the live EHR record, so the document always reflects current information.

Government agencies and courts produce case-related documents (orders, notices, motions) with fixed structure and variable case data. API-based generation means each document draws directly from structured court records, reducing transcription errors and producing a machine-readable audit trail that links every output document to the specific data that created it.

High-tech and SaaS companies trigger NDA and quote generation directly from their CRM or CPQ tools. A deal record in Salesforce or HubSpot becomes the JSON payload, and the generation API produces a finalized, formatted document without a manual drafting step. The document lands in the deal room minutes after the pricing conversation closes.

Education institutions generate hundreds or thousands of admission offer letters during enrollment periods. Student data from the Student Information System becomes the payload, and each letter reflects the correct program, scholarship amount, and enrollment deadline for that individual student. What a staff member would take days to produce manually runs as a scheduled batch job.

What to Evaluate When Choosing a Document Generation API

Output format coverage deserves attention before you commit to an API. Receiving both PDF and DOCX from the same endpoint matters when your workflow requires human review before a document is finalized. PDF suits direct delivery. DOCX suits draft-and-review cycles where a lawyer or editor needs to modify the generated document before it goes to signature. APIs that produce only PDF force you to finalize at generation time, which eliminates the review step entirely.

The execution model determines whether an API fits your latency requirements. Synchronous APIs return the document in the same HTTP response, which works for real-time generation triggered by a user action or a CRM event. Asynchronous APIs accept a job and require polling or a webhook to retrieve the result, which works better for batch jobs processing thousands of documents in a run. Confirm which model the API offers before you design your integration, because retrofitting from synchronous to async (or vice versa) affects how you handle errors, retries, and downstream routing. Foxit’s DocGen API is synchronous, so individual requests resolve in a single HTTP round trip.

Template portability determines your long-term maintenance cost. A template stored as a standard DOCX file is editable by anyone with Word, version-controllable in Git, and portable across environments. A template stored in a proprietary format requires the vendor’s editor for every update, and losing access to that editor means losing the ability to maintain your own document logic. Word-based templates also let business users own content changes without involving a developer.

Compliance posture matters as soon as your documents contain PII, PHI, or financial data. Confirm the provider’s certifications before sending real records through the API. SOC 2 attestation, GDPR compliance, and HIPAA compliance are the relevant checks for most enterprise document workflows. A provider’s architecture (multi-tenant SaaS vs. single-tenant hosted) also affects how you address data residency requirements.

Developer onboarding cost is a real selection criterion. A free tier with immediate credential access, a working Postman collection, and SDK support for your language stack lets your team validate fit in hours. A procurement process that requires a sales conversation before you can run a test extends your evaluation cycle by weeks. Foxit’s developer plan is free, credit-card-free, and gives you dashboard access and credentials immediately, so your team can make an informed build-vs-buy decision based on a working integration rather than a demo.

Getting Started: Generate Your First Document Today

A working end-to-end generation pipeline takes under an hour to set up:

  1. Go to account.foxit.com/site/sign-up and activate a free Developer plan. Dashboard access and credentials are immediate, with no credit card and no sales call required.

  2. Retrieve your Client ID and Client Secret from the dashboard.

  3. Grab a ready-to-use template, or author your own. Download invoice_simple.docx for the smallest possible smoke test, or invoice_table.docx if you want the full loop, {{ROW_NUMBER}}, and {{=SUM(ABOVE)}} round-trip. To build your own instead, open Microsoft Word, add two or three {{ tokenName }} placeholders, and save as DOCX. Base64-encode the file using base64 -i template.docx on macOS or Linux, or [Convert]::ToBase64String([IO.File]::ReadAllBytes("template.docx")) in Windows PowerShell.

  4. Open the Postman collection linked on the Foxit API page, load the GenerateDocumentBase64 request, paste your base64-encoded template into base64FileString, and add a JSON object to documentValues with keys matching your placeholder names. Set outputFormat to "pdf" and send the request. Copy the base64FileString value from the JSON response and decode it. You now have a generated PDF.

From this point, connecting to a real data source is the only remaining step. Pull a record from your CRM, a row from your database, or a response from an upstream API, map its fields to the token names in your template, and pass the result as documentValues. That connection makes the pipeline production-ready, and every document it generates becomes a deterministic, auditable function of the data that produced it.

Activate your free Foxit Developer plan (no credit card, no sales call) and run your first document generation call in minutes at account.foxit.com/site/sign-up.

Document Generation Pipeline FAQ

Document generation is the programmatic production of populated, formatted documents from a template plus a structured data source. Three components are always present: a template holding the layout and placeholders, a JSON data payload holding the values, and a rendering engine that merges the two into a final PDF or DOCX. The same payload against the same template always produces the same output, which makes the process deterministic and auditable.

Document generation sits next to e-signature, document management, and extraction tools, but it is a distinct category. E-signature platforms like DocuSign collect binding signatures on documents that already exist. Document management systems store, version, and retrieve files. OCR and extraction tools pull structured data out of existing documents. Document generation does the opposite: it puts data in, producing a new populated document from a template. Many workflows chain them: generate a contract, then route it for signature.

You POST to GenerateDocumentBase64 with three fields: base64FileString (your base64-encoded DOCX template), documentValues (the JSON object whose keys match your tokens), and outputFormat ("pdf" or "docx", lowercase). Authentication uses client_id and client_secret headers. The synchronous response returns a message, a fileExtension, and a base64FileString containing the generated document, which your app decodes and routes downstream.

Use loop delimiters. Place {{TableStart:lineItems}} before a Word table row and {{TableEnd:lineItems}} after it, with both delimiters in cells of the same row. The engine emits one row per object in the lineItems array. Inside the loop, {{ROW_NUMBER}} auto-increments, and a footer row can use {{=SUM(ABOVE) \# "$#,##0.00"}} to compute and format a column total — so a ten-line invoice renders fully numbered and summed with no post-processing.

The Foxit DocGen API returns either PDF or DOCX from the same endpoint, based on the outputFormat value. PDF suits direct delivery where the document is final at generation time. DOCX suits draft-and-review cycles, where a lawyer or editor needs to modify the generated file before it goes to signature. APIs that emit PDF only force you to finalize at generation and eliminate that review step.

PDF Translation with Verifiable Quality: Build a Confidence-Scored Pipeline with Foxit API and Straker.ai

Architecture diagram of a PDF translation API pipeline using Foxit and Straker.ai with per-segment confidence scoring.

Most machine translation tools hand back a translated PDF with no signal about which parts to trust — a real problem for contracts, medical forms, and regulatory filings. This guide shows how to build a pipeline that scores every segment before the final render, using Foxit for structural extraction and layout-preserving rendering and Straker.ai for translation plus per-segment quality scoring.

Most machine translation tools give you a translated file and nothing else. They do not tell you which parts are correct and which parts are wrong. For a simple blog post, that is fine. For a contract, a medical form, or a legal notice, it is a real problem. A bad translation can sit in the final PDF for days before anyone notices, often only after the document has already been signed or sent.

Teams today are translating more documents, into more languages, and faster than ever. Legal, finance, healthcare, HR, and insurance teams all deal with PDFs where one wrong word can cause a lot of damage: a broken contract, a failed audit, or even a safety issue. Most translation tools were not built to catch these mistakes. They just move text from one language to another. When quality checks happen at all, they usually mean a person reading the final PDF line by line and hoping they spot the errors.

This article shows how to build a better setup. You will learn how to build a PDF translation pipeline that gives every segment a quality score before the final PDF is created. Instead of hoping the translation is right, the pipeline tells you which parts to trust, which parts to review, and which parts to send back to a human translator. All of this happens automatically on every run.

Architecture at a Glance

Before going deeper, it helps to see the full pipeline in one picture. The diagram below traces a source PDF through every stage: extract, translate, score, route, and render. Each box is a single responsibility handled by a single service, with the routing layer acting as the glue you control.

High-level PDF translation API architecture showing source PDF flowing through Foxit structural extract, Straker AI translate and score, routing layer for accept/flag/reject, Foxit layout-preserving render, and final translated PDF.

The pipeline has two external services:

  • Foxit PDF Translation API handles anything PDF-specific. It pulls the structured text out of the source document with element IDs attached, then renders the final PDF back in the original layout (multi-column text, tables, font substitution, image positions) using the approved translations.
  • Straker AI translates each source segment AND scores the translation in the same request. It returns the target text, a numeric score on a 0.0 to 1.0 scale, and a categorical label (best, good, acceptable, bad) for every element ID. This step is pluggable, so you can swap Straker for DeepL, Google Cloud Translation, AWS Translate, or an in-house NMT if you already have a contract with one of them. The contract between this step and the rest of the pipeline is a flat dict of element IDs to translated text plus per-segment scores.

and one piece of code you own:

  • Routing layer is your business logic. It reads the score, decides whether the segment auto-accepts, flags for human review, or escalates to a translator, and then hands the approved set to Foxit’s render call.

With the shape of the pipeline on the table, the rest of the article works through each piece in order, starting with why per-segment quality scoring is worth the integration effort in the first place.

The Quality Gap

You ship a translated PDF to a legal team. Three days later, compliance flags a clause in the German version. The term “indemnification” was rendered as “Entschädigung” (compensation) rather than “Freistellung” (hold harmless). Your MT pipeline returned a 200 status. Nobody’s alerting on that delta.

Raw machine translation output carries no quality signal by default. Every segment comes back translated, and your pipeline treats them identically regardless of whether the model was confident or guessing. For marketing copy that’s an acceptable tradeoff, but for a loan covenant, a clinical trial protocol, or a regulatory filing, a 95%-accurate translation can still be contractually or legally dangerous because the 5% failure may concentrate precisely in the high-stakes clauses.

A confidence score, in the translation QA context, is a per-segment numeric signal from a verification engine. It tells you how reliable each translated unit is on a scale your system can act on programmatically. High-confidence segments auto-accept, medium-confidence ones queue for post-edit review, and low-confidence segments escalate directly to a human translator before they ever reach the final document.

The compound problem for PDFs specifically is that most translation pipelines strip document structure before the MT engine even sees the text. The extraction step flattens multi-column layouts, collapses table cells, and drops font metadata. By the time you get a translated output, you’ve lost both layout fidelity and any quality signal. The rendered PDF looks wrong and you have no programmatic way to know which segments caused it.

Foxit’s PDF Translation Trial API extracts structured text from a source PDF with element IDs preserved, so the layout blueprint travels alongside the text through the entire workflow. You hand the source segments to Straker AI, which returns the translated text plus a per-segment numeric score and a quality label in a single call. (If you already run DeepL, Google Cloud Translation, AWS Translate, or an in-house NMT engine, you can drop it in at this step without changing the rest of the pipeline.) Your routing logic decides which segments pass, which get flagged, and which escalate to human review. Foxit’s render endpoint then reassembles the PDF in the original layout using the accepted translations, giving you a layout-preserved translated PDF with a documentable quality trail attached to every segment.

How the Pipeline Works

Foxit and Straker are two independent APIs that you wire together. Foxit owns PDF structure, extracting structured text keyed by element ID and re-rendering the final PDF in the original layout. Straker AI handles translation and per-segment quality scoring in a single request, returning the translated text alongside a numeric score and a quality label. You own the routing decision that sits between the scores and the render call.

The pipeline runs in seven steps:

Seven-step PDF translation API pipeline: upload source PDF, structural extract, preprocess, translate and score with Straker AI, route by score, render, and download the translated PDF.

Foxit covers steps 1-3 and 6-7 (PDF structure and rendering). Straker AI covers step 4, producing translations and per-segment quality scores in one round-trip. Step 5 is your business logic.

The Foxit PDF Translation API defines steps 2, 3, and 6. The upload and download calls use the general PDF Services endpoints. Straker AI is a separate API at https://api-verify.straker.ai. You submit XLF 1.2 files containing source segments and Straker returns the translated target_text per segment plus a numeric score (0.0 to 1.0) and a quality label (best, good, acceptable, or bad). Because Foxit’s ExtractedText.json is a flat { "elementId": "text" } map, and XLF trans-unit IDs round-trip through Straker’s external_id field unchanged, the element IDs Foxit emits are the same IDs that come back with translations and scores attached. That alignment is what makes programmatic routing possible.

One clarification for readers who’ve seen the Foxit-Straker partnership announcement: that partnership covers Foxit eSignature Services, enabling end users to translate and sign documents in the eSign product. That’s an end-user feature. The PDF Translation Trial API used here is a separate developer surface. Its OpenAPI spec (v2.2.0) contains zero Straker references, and the preprocess-pdf documentation explicitly instructs developers to “translate the text in ExtractedText.json using your preferred translation tool.” You wire the two APIs together manually. This tutorial uses Straker AI as the default translation engine because it produces translations and quality scores in the same call, but you can substitute DeepL, Google Cloud Translation, AWS Translate, or your own NMT at step 4 without changing the Foxit calls.

Credentials and Setup

Get your Foxit credentials at app.developer-api.foxit.com/pricing. The free Developer plan gives you 20 AI credits per month with no credit card and no sales call required. Once you’ve signed in, your Client ID and Client Secret appear in the developer dashboard. Every Foxit API call requires both in the request headers as client_id and client_secret (lowercase snake_case). Export them in your shell as FOXIT_CLIENT_ID and FOXIT_CLIENT_SECRET so the code below reads them from the environment rather than hard-coding secrets.

For Straker, sign up at straker.ai/ai-platform/verify for API access. Straker issues a UUID-style API token that you send as a bearer token on every call (Authorization: Bearer <your-token>). The API lives at https://api-verify.straker.ai and its full reference is published at api-verify.straker.ai/docs. Export your token as STRAKER_API_KEY for the code below. You can confirm the token works and check your balance with a quick GET /user/balance. Both services offer trial access, so you can build and test the full pipeline before any procurement conversation.
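The token check mentioned above can be sketched with nothing but the standard library. This is a minimal sketch, not Straker’s reference client: the helper names build_balance_request and check_balance are illustrative, and the response is returned as parsed JSON without assuming any particular fields, since the balance schema is not described here.

```python
import json
import urllib.request

STRAKER_BASE = "https://api-verify.straker.ai"


def build_balance_request(token: str) -> urllib.request.Request:
    """Build the GET /user/balance request with the bearer token attached."""
    return urllib.request.Request(
        f"{STRAKER_BASE}/user/balance",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def check_balance(token: str) -> dict:
    """Send the request. urlopen raises urllib.error.HTTPError on 4xx/5xx,
    so a bad token surfaces as an exception rather than a silent failure."""
    with urllib.request.urlopen(build_balance_request(token), timeout=30) as resp:
        return json.load(resp)

# Usage (requires network access and a real token):
#   import os
#   print(check_balance(os.environ["STRAKER_API_KEY"]))
```

Separating request construction from the network call keeps the header logic easy to inspect before any credits are spent.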

Before you finalize your language matrix, check both APIs for supported languages. Foxit’s render endpoint accepts 23 target language codes (en, zh, zh_tw, fr, de, es, it, pt, nl, ja, ko, th, vi, hi, ru, ar, tr, pl, sv, no, nb, da, and fi). Straker AI identifies languages by UUID rather than ISO code. You fetch the full list with GET /languages and look up the UUID for your target (for example, 917FF728-0725-A033-1278-33025F49CA40 is French (France), 917FF7D8-9107-0BF8-97EE-065C20F453DE is German). The intersection of the two sets determines your production language coverage.
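One way to compute that intersection is sketched below. The Foxit code set is the list of 23 codes above; STRAKER_UUID_BY_FOXIT_CODE is a hypothetical lookup table you would fill in yourself after calling GET /languages (only the two UUIDs quoted above are included), and language_coverage is an illustrative helper name.

```python
# The 23 target codes Foxit's render endpoint accepts.
FOXIT_TARGET_CODES = {
    "en", "zh", "zh_tw", "fr", "de", "es", "it", "pt", "nl", "ja", "ko", "th",
    "vi", "hi", "ru", "ar", "tr", "pl", "sv", "no", "nb", "da", "fi",
}

# Hypothetical table you maintain: Foxit code -> Straker language UUID.
# Populate it from GET /languages; these two entries come from the article.
STRAKER_UUID_BY_FOXIT_CODE = {
    "fr": "917FF728-0725-A033-1278-33025F49CA40",
    "de": "917FF7D8-9107-0BF8-97EE-065C20F453DE",
}


def language_coverage(wanted):
    """Split the wanted codes into supported pairs and per-service gaps."""
    supported, missing_foxit, missing_straker = {}, [], []
    for code in wanted:
        if code not in FOXIT_TARGET_CODES:
            missing_foxit.append(code)       # Foxit cannot render this target
        elif code not in STRAKER_UUID_BY_FOXIT_CODE:
            missing_straker.append(code)     # no Straker UUID mapped yet
        else:
            supported[code] = STRAKER_UUID_BY_FOXIT_CODE[code]
    return {
        "supported": supported,
        "missing_foxit": missing_foxit,
        "missing_straker": missing_straker,
    }
```

Running it against your planned language matrix at startup turns an unsupported target into an explicit gap report instead of a 400 at render time.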

If you already have a contract with DeepL, Google Cloud Translation, AWS Translate, or an in-house NMT service, you can swap that engine in at step 4. The pipeline contract upstream (Foxit element IDs mapped to source strings) and downstream (a dict of {element_id: {score, quality, target_text}} feeding the router) does not change. The code below uses Straker AI by default because the same API returns the translation and the quality signal in one call.
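To make the engine contract concrete, here is a minimal sketch of an adapter for a translate-only engine. The names Verdict, Engine, and wrap_plain_translator are illustrative, not part of either API. Choosing score=None with the label "acceptable" for unscored output is an assumption; the intent is that nothing auto-accepts without a quality signal.

```python
from typing import Callable, Dict, Optional, TypedDict


class Verdict(TypedDict):
    score: Optional[float]   # 0.0 to 1.0, or None when no score is available
    quality: str             # "best" | "good" | "acceptable" | "bad"
    target_text: str


# Any engine plugs in as a callable:
#   {element_id: source_text} -> {element_id: Verdict}
Engine = Callable[[Dict[str, str]], Dict[str, Verdict]]


def wrap_plain_translator(translate: Callable[[str], str]) -> Engine:
    """Adapt a translate-only function to the pipeline contract.

    The wrapped engine has no quality signal, so every segment gets
    score=None and quality="acceptable" (an assumption of this sketch).
    """
    def engine(sources: Dict[str, str]) -> Dict[str, Verdict]:
        return {
            element_id: {
                "score": None,
                "quality": "acceptable",
                "target_text": translate(text),
            }
            for element_id, text in sources.items()
        }
    return engine
```

A real adapter would call your engine’s client inside translate; the router downstream only ever sees the dict shape.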

Building the PDF Translation Pipeline

The complete seven-step pipeline runs in Python using requests, json, zipfile, os, and the standard-library xml.etree.ElementTree for building XLF. The first snippet covers Foxit steps 1-3 (upload, structural extraction, and preprocessing).

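import io
import json
import os
import time
import zipfile

import requests

FOXIT_BASE = "https://na1.fusion.foxit.com/pdf-services/api"
# Read credentials from the environment (see "Credentials and Setup")
# instead of hard-coding them in source.
HEADERS = {
    "client_id": os.environ["FOXIT_CLIENT_ID"],
    "client_secret": os.environ["FOXIT_CLIENT_SECRET"],
}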

def poll_task(task_id: str) -> dict:
    """Poll GET /tasks/{task_id} until COMPLETED or FAILED."""
    while True:
        r = requests.get(f"{FOXIT_BASE}/tasks/{task_id}", headers=HEADERS)
        r.raise_for_status()
        data = r.json()
        status = data.get("status")
        if status == "COMPLETED":
            return data
        if status == "FAILED":
            raise RuntimeError(f"Task {task_id} failed: {data.get('error')}")
        # PENDING or IN_PROGRESS: wait and retry
        time.sleep(3)

# Step 1: Upload source PDF
with open("source.pdf", "rb") as f:
    upload_resp = requests.post(
        f"{FOXIT_BASE}/documents/upload",
        headers=HEADERS,
        files={"file": ("source.pdf", f, "application/pdf")}
    )
upload_resp.raise_for_status()
source_document_id = upload_resp.json()["documentId"]

# Step 2: Structural Extract (async - must complete before preprocess)
extract_resp = requests.post(
    f"{FOXIT_BASE}/documents/pdf-structural-extract",
    headers=HEADERS,
    json={"documentId": source_document_id}
)
extract_resp.raise_for_status()  # 202 Accepted
extract_task_id = extract_resp.json()["taskId"]

extract_result = poll_task(extract_task_id)
extracted_doc_id = extract_result["resultDocumentId"]

# Step 3: Preprocess (synchronous - returns 200, no polling needed)
preprocess_resp = requests.post(
    f"{FOXIT_BASE}/documents/translation/preprocess-pdf",
    headers=HEADERS,
    json={"documentId": extracted_doc_id}
)

# Errors from preprocess-pdf per the Foxit spec:
#   400 VALIDATION_ERROR      - "Document ID is required"
#   500 INTERNAL_SERVER_ERROR - "Failed to preprocess document"
preprocess_resp.raise_for_status()
preprocess_result_id = preprocess_resp.json()["resultDocumentId"]

# Download the ZIP containing ExtractedText.json and StructureInfo.json
zip_resp = requests.get(
    f"{FOXIT_BASE}/documents/{preprocess_result_id}/download",
    headers=HEADERS
)
zip_resp.raise_for_status()

with zipfile.ZipFile(io.BytesIO(zip_resp.content)) as zf:
    extracted_text = json.loads(zf.read("ExtractedText.json"))
    # StructureInfo.json: do not modify - the render step requires it untouched
    # structure_info = json.loads(zf.read("StructureInfo.json"))

# extracted_text is now {"elementId1": "original text", "elementId2": "original text", ...}

The preprocess step is synchronous, which means you get a 200 OK directly with the resultDocumentId. No polling required. The ZIP it produces contains two files: ExtractedText.json maps every element ID to its original text, and StructureInfo.json carries the full layout blueprint (bounding boxes, font metadata, column positions). You pass StructureInfo.json to the render step unmodified. Modifying it breaks the render because it’s the mechanism that makes layout preservation possible.

The second snippet covers steps 4-7, calling Straker AI to translate and score every segment in one round-trip, routing by score, rendering the translated PDF, and downloading the result. Straker’s AI Translation and Quality Evaluation workflow accepts a source-only XLF and returns a translated target_text per segment alongside the numeric score and the quality label, so the same response feeds both the translation choice and the routing decision.

import os
import xml.etree.ElementTree as ET

STRAKER_BASE = "https://api-verify.straker.ai"
# Read the token from the STRAKER_API_KEY environment variable.
STRAKER_TOKEN = os.environ["STRAKER_API_KEY"]
STRAKER_HEADERS = {"Authorization": f"Bearer {STRAKER_TOKEN}"}

# Straker identifies languages by UUID. Look these up once via GET /languages
# and cache them. Full list: https://api-verify.straker.ai/languages
STRAKER_LANG_FRENCH = "917FF728-0725-A033-1278-33025F49CA40"
STRAKER_LANG_GERMAN = "917FF7D8-9107-0BF8-97EE-065C20F453DE"

# Workflow UUID for "AI Translation and Quality Evaluation". Fetch the full
# list of workflows once via GET /workflow and cache the UUID for the one you
# want; this workflow produces both the translation and the per-segment score.
STRAKER_WORKFLOW_AI_TRANSLATE_AND_EVAL = "390b47a9-d5dc-46ae-92e2-56c43d128c44"


def build_xlf_1_2_source_only(source_lang: str, target_lang: str,
                              sources: dict) -> bytes:
    """
    Build a minimal XLF 1.2 document with source segments and empty targets.
    trans-unit/@id preserves Foxit's element IDs; Straker surfaces the same
    value as `external_id` on the segments it returns, so the keys round-trip.
    """
    ns = "urn:oasis:names:tc:xliff:document:1.2"
    ET.register_namespace("", ns)
    xliff = ET.Element(f"{{{ns}}}xliff", {"version": "1.2"})
    file_el = ET.SubElement(xliff, f"{{{ns}}}file", {
        "source-language": source_lang,
        "target-language": target_lang,
        "datatype": "plaintext",
        "original": "foxit-extract",
    })
    body = ET.SubElement(file_el, f"{{{ns}}}body")
    for element_id, source_text in sources.items():
        unit = ET.SubElement(body, f"{{{ns}}}trans-unit", {"id": element_id})
        ET.SubElement(unit, f"{{{ns}}}source").text = source_text
        ET.SubElement(unit, f"{{{ns}}}target")  # empty - Straker fills it in
    return b'<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(xliff, encoding="utf-8")


# Step 4: Translate and score every segment with Straker AI in one call.
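def translate_and_score_with_straker(sources: dict, source_lang_code: str,
                                     target_lang_uuid: str,
                                     target_lang_code: str = "fr") -> dict:
    """
    Submit source-only XLF to Straker's AI Translation + Quality Evaluation
    workflow. Returns a dict keyed by Foxit element ID ->
    {"score": float|None, "quality": str, "target_text": str}.

    target_lang_code is written into the XLF target-language attribute and
    should name the same language as target_lang_uuid (the default "fr"
    matches STRAKER_LANG_FRENCH).
    """
    xlf_bytes = build_xlf_1_2_source_only(source_lang_code, target_lang_code, sources)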

    # 4a. Create the project on the AI Translation + Quality Evaluation
    # workflow. confirmation_required=false commits the token cost
    # immediately; set to true to review cost and call POST /project/confirm
    # before processing begins.
    create_resp = requests.post(
        f"{STRAKER_BASE}/project",
        headers=STRAKER_HEADERS,
        files={"files": ("segments.xlf", xlf_bytes, "application/xliff+xml")},
        data={
            "languages": target_lang_uuid,
            "title": "Foxit PDF translation batch",
            "workflow_id": STRAKER_WORKFLOW_AI_TRANSLATE_AND_EVAL,
            "confirmation_required": "false",
        },
    )
    create_resp.raise_for_status()
    project_id = create_resp.json()["project_id"]

    # 4b. Poll the project until it reports COMPLETED.
    while True:
        status_resp = requests.get(
            f"{STRAKER_BASE}/project/{project_id}", headers=STRAKER_HEADERS
        )
        status_resp.raise_for_status()
        project = status_resp.json()["data"]
        if project["status"] == "COMPLETED":
            break
        if project["status"] in ("FAILED", "PROCESSING_FAILED", "CANCELED"):
            raise RuntimeError(f"Straker project {project_id} failed")
        time.sleep(3)

    # 4c. Fetch the per-segment translations + scores. file_uuid is returned
    # in the project payload.
    file_uuid = project["source_files"][0]["file_uuid"]
    seg_resp = requests.get(
        f"{STRAKER_BASE}/project/{project_id}/segments/{file_uuid}/{target_lang_uuid}",
        headers=STRAKER_HEADERS,
    )
    seg_resp.raise_for_status()

    results = {}
    for seg in seg_resp.json()["segments"]:
        element_id = seg["external_id"]  # matches the Foxit key we packed into XLF
        t = seg["translation"]
        results[element_id] = {
            "score": t["score"],          # float 0.0 to 1.0, or None
            "quality": t["quality"],      # "best" | "good" | "acceptable" | "bad"
            "target_text": t["target_text"],  # Straker's translation
        }
    return results

scored = translate_and_score_with_straker(
    extracted_text,
    source_lang_code="en",
    target_lang_uuid=STRAKER_LANG_FRENCH,
)

# Step 5: Route by score and quality label (developer-controlled business logic).
HIGH_THRESHOLD = 0.85
LOW_THRESHOLD = 0.65

accepted = {}
flagged_for_review = {}
rejected = {}

for element_id, verdict in scored.items():
    score = verdict["score"] or 0.0
    if verdict["quality"] == "best" or score >= HIGH_THRESHOLD:
        accepted[element_id] = verdict["target_text"]
    elif verdict["quality"] == "bad" or score < LOW_THRESHOLD:
        rejected[element_id] = {"original": extracted_text[element_id],
                                 "score": score, "quality": verdict["quality"]}
    else:
        flagged_for_review[element_id] = {"translation": verdict["target_text"],
                                           "score": score, "quality": verdict["quality"]}

# Build the render payload. Foxit's render expects every key from the original
# ExtractedText.json. Accepted segments use the scored translation; flagged and
# rejected segments fall back to the original source text so the layout is not
# broken by missing keys. In production, replace the fallback with human-
# reviewed text once it is available, or hold the render step until review
# completes.
render_payload = {}
for element_id, original_text in extracted_text.items():
    if element_id in accepted:
        render_payload[element_id] = accepted[element_id]
    else:
        render_payload[element_id] = original_text

# Step 6: Render (async)
# translatedFile is the modified ExtractedText.json with translated values, same keys
translated_json_bytes = json.dumps(render_payload).encode("utf-8")

render_resp = requests.post(
    f"{FOXIT_BASE}/documents/translation/render-pdf",
    headers=HEADERS,
    data={
        "sourceDocumentId": source_document_id,
        "preprocessResultDocumentId": preprocess_result_id,
        "targetLanguage": "fr"
        # Optional: "pageRangeStart": 1, "pageRangeEnd": 10
    },
    files={"translatedFile": ("ExtractedText.json", translated_json_bytes, "application/json")}
)

# Errors from render-pdf per the Foxit spec:
#   400 VALIDATION_ERROR    - "Either translatedFile or translatedTextDocumentId must be provided"
#   400 VALIDATION_ERROR    - "Unsupported target language: xx"
#   500 RENDER_START_FAILED - "Failed to start render: service unavailable"
render_resp.raise_for_status()
render_task_id = render_resp.json()["taskId"]

render_result = poll_task(render_task_id)
output_doc_id = render_result["resultDocumentId"]

# Step 7: Download translated PDF
pdf_resp = requests.get(
    f"{FOXIT_BASE}/documents/{output_doc_id}/download",
    headers=HEADERS
)
pdf_resp.raise_for_status()
with open("translated_output.pdf", "wb") as f:
    f.write(pdf_resp.content)

print(f"Done. Accepted: {len(accepted)}, Flagged: {len(flagged_for_review)}, Rejected: {len(rejected)}")

The render call is multipart/form-data. You pass sourceDocumentId (the original PDF’s document ID from step 1), preprocessResultDocumentId (from step 3), targetLanguage (one of the 23 supported codes), and translatedFile (the modified ExtractedText.json with translated values and original keys). The alternative is uploading the translated JSON first via the upload endpoint and passing its ID as translatedTextDocumentId instead. At least one of the two must be present, or you’ll get a 400 VALIDATION_ERROR.
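Because the translated file must keep the original keys, a cheap pre-flight check before the render call can catch a malformed payload locally. The helper below is a sketch; check_render_payload is an illustrative name, and the string-only rule is an assumption based on ExtractedText.json being a flat map of IDs to text.

```python
def check_render_payload(extracted_text: dict, translated: dict) -> None:
    """Raise ValueError unless `translated` has exactly the keys of
    `extracted_text` and every value is a string."""
    missing = sorted(set(extracted_text) - set(translated))
    extra = sorted(set(translated) - set(extracted_text))
    non_str = sorted(k for k, v in translated.items() if not isinstance(v, str))
    problems = []
    if missing:
        problems.append(f"missing keys: {missing}")
    if extra:
        problems.append(f"unexpected keys: {extra}")
    if non_str:
        problems.append(f"non-string values for: {non_str}")
    if problems:
        raise ValueError("Invalid render payload; " + "; ".join(problems))
```

Calling check_render_payload(extracted_text, render_payload) just before step 6 fails fast on your side instead of spending a render task on bad input.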

The render operation is asynchronous. It returns 202 Accepted immediately with a taskId, and the actual rendering runs in the background on Foxit’s side. You must poll GET /tasks/{taskId} on a fixed interval (every 3 seconds is the recommended cadence) until the status flips to COMPLETED before you try to download the output. If you skip the poll, or treat the initial 202 response as a finished render, the pipeline breaks at that point because there is no result document to download while the task is still IN_PROGRESS. The poll_task helper from the first snippet already implements this loop with a 3-second time.sleep between checks and surfaces a FAILED status as a RuntimeError, so pass it the taskId from render_resp.json() rather than expecting a resultDocumentId in the initial response. The same polling discipline applies to the structural extract step (step 2), which is also asynchronous.
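For long-running jobs you may also want an upper bound on how long the loop waits. The variant below is a sketch under stated assumptions: poll_until_done is an illustrative name, the 600-second default is arbitrary rather than a Foxit limit, and the status fetch is injected as a callable so the loop can be exercised without network access.

```python
import time
from typing import Callable


def poll_until_done(fetch_status: Callable[[], dict], interval: float = 3.0,
                    timeout: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status() until the task is COMPLETED or FAILED, or until
    waiting one more interval would exceed the timeout."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch_status()
        status = data.get("status")
        if status == "COMPLETED":
            return data
        if status == "FAILED":
            raise RuntimeError(f"Task failed: {data.get('error')}")
        # PENDING or IN_PROGRESS: give up if the next wait passes the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"Task still {status} after {timeout} seconds")
        sleep(interval)

# Usage with the article's endpoints (requests, FOXIT_BASE, HEADERS as above):
#   poll_until_done(lambda: requests.get(
#       f"{FOXIT_BASE}/tasks/{task_id}", headers=HEADERS).json())
```

A bounded loop keeps a stuck task from hanging a batch job indefinitely.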

Scoring and Routing

Straker AI generates both the translation and the quality signal in this pipeline. Foxit’s responses carry document IDs and task statuses; the translation choice and the per-segment score are entirely Straker’s contribution.

Each segment in the /project/{id}/segments/{file_id}/{language_id} response carries three values you care about. target_text is Straker’s translation. score is a float between 0.0 and 1.0 (it may be null for segments where the model has no confidence signal). quality is a categorical label Straker assigns alongside the numeric score (best, good, acceptable, or bad). You can route on either signal, or combine them. The table below shows a combined policy calibrated for compliance-sensitive documents. These are starting points; your production system should calibrate per language pair and domain, since a French legal contract demands different thresholds than a Spanish marketing brochure.

Straker verdict | Action | Rationale
quality == "best" or score >= 0.85 | Auto-accept, include in render | High-confidence output; suitable for fully automated workflows
quality in ("good", "acceptable") or score 0.65 - 0.84 | Flag segment by element ID for post-edit review | Medium confidence; a human reviewer checks the flagged segments before the final render runs
quality == "bad" or score < 0.65 | Reject segment, escalate to human translator | Low-confidence output; the model is unreliable for this segment
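The policy in the table can be wrapped in a small function that also supports per-language-pair calibration. This is a sketch, not a recommended configuration: route and THRESHOLDS are illustrative names, and the override values are made-up examples. It checks conditions in the same order as the routing loop in the pipeline code, with one deliberate difference: a null score falls back to the quality label alone instead of being treated as 0.0.

```python
from typing import Optional

# (auto-accept threshold, reject threshold). The defaults mirror the table;
# the per-pair overrides are invented examples to show the mechanism.
DEFAULT_THRESHOLDS = (0.85, 0.65)
THRESHOLDS = {
    ("de", "legal"): (0.90, 0.70),
    ("fr", "marketing"): (0.80, 0.60),
}


def route(score: Optional[float], quality: str, lang: str, domain: str) -> str:
    """Return "accept", "review", or "reject" for one segment."""
    high, low = THRESHOLDS.get((lang, domain), DEFAULT_THRESHOLDS)
    if score is None:
        # No numeric signal: decide on the categorical label only.
        return {"best": "accept", "bad": "reject"}.get(quality, "review")
    if quality == "best" or score >= high:
        return "accept"
    if quality == "bad" or score < low:
        return "reject"
    return "review"
```

Keeping thresholds in data rather than in branching code makes it easier to tune one language pair without touching the others.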

The element ID key structure matters here. Foxit’s ExtractedText.json keys are packed into XLF trans-unit IDs, and Straker surfaces the same value in its response’s external_id field. That means every entry in your flagged_for_review dictionary carries enough information for a reviewer to open the source document, find the exact element by ID, and return an approved translation. You write the approved translation back into the same key, then trigger the render step. This produces a documentable audit trail. For every element ID in the output PDF, you can show the original text, Straker’s translation, the Straker score and quality label, and whether a human approved it. In regulated industries (finance, legal, healthcare), that’s the evidence your compliance team needs to sign off on an automated localization workflow, and it aligns with ISO 18587, the international standard for post-editing of machine translation output.
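The audit trail described above can be sketched as one record per element ID. The AuditRecord fields and the helper names are illustrative, and reviewed is a hypothetical dict of element ID to human-approved text that your review tool would produce. The scored and accepted arguments have the same shapes as in the pipeline code.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class AuditRecord:
    element_id: str
    original: str
    machine_translation: str
    score: Optional[float]
    quality: str
    final_text: str
    human_approved: bool


def build_audit_trail(extracted_text, scored, accepted, reviewed):
    """Build one record per element ID from the original ExtractedText.json."""
    records = []
    for element_id, original in extracted_text.items():
        verdict = scored[element_id]
        if element_id in reviewed:            # a human supplied the final text
            final, approved = reviewed[element_id], True
        elif element_id in accepted:          # auto-accepted machine output
            final, approved = accepted[element_id], False
        else:                                 # still waiting on review
            final, approved = original, False
        records.append(AuditRecord(element_id, original, verdict["target_text"],
                                   verdict["score"], verdict["quality"],
                                   final, approved))
    return records


def to_render_payload(records):
    """Same keys as ExtractedText.json, with the final text per element."""
    return {r.element_id: r.final_text for r in records}


def to_jsonl(records) -> str:
    """Serialize the trail as JSON Lines for storage alongside the PDF."""
    return "\n".join(json.dumps(asdict(r), ensure_ascii=False) for r in records)
```

Deriving the render payload from the same records that form the audit log keeps the two from drifting apart.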

Straker AI can also route low-confidence output to expert reviewers automatically when configured through the Straker platform. Check straker.ai/ai-platform/verify for the workflow configuration options.

Layout Preservation

Foxit’s render step preserves multi-column text flow, embedded table cell structure, images at their original positions, headers and footers, and font substitution for target-language character sets. That means CJK scripts (Japanese, Chinese, Korean) render correctly with appropriate glyph substitution, and Arabic output renders right-to-left without manual post-processing.

StructureInfo.json is what makes this possible. When the preprocess step runs, it produces both the text map (which you hand to Straker) and the layout blueprint (which you hand back to Foxit unmodified at render time). The render engine maps translated text back to the original element positions using this blueprint, reflowing text within the same bounding boxes. Because the structure data travels alongside the text through the entire pipeline, Foxit never needs to reconstruct the layout from scratch.

Generic MT pipelines export raw text, losing all spatial relationships, translate it, then attempt to rebuild the PDF from nothing. Tables merge into continuous text, columns collapse to a single flow, and CJK font substitution fails because the rebuilding step has no record of what fonts were originally in use.

Limitations to Test

Text expansion is the first limitation worth stress-testing. English to German translation typically increases text length by 20-35%, and English to Arabic can run even longer. Foxit’s render engine handles reflow within bounding boxes, but extreme length changes in tight table cells or narrow columns may overflow. Test with your actual document types before you commit to a production deployment.
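A rough pre-render check can flag the segments most likely to overflow. The sketch below compares character counts, which is only a proxy for rendered width (an assumption; glyph widths differ by script and font). The helper name, the 1.35 default (the top of the 20-35% range above), and the min_chars cutoff are all illustrative.

```python
def expansion_outliers(sources: dict, translations: dict,
                       max_ratio: float = 1.35, min_chars: int = 20) -> dict:
    """Return {element_id: ratio} for translations whose length exceeds
    max_ratio times the source length."""
    outliers = {}
    for element_id, source in sources.items():
        target = translations.get(element_id)
        # Skip untranslated segments and very short ones, where a few extra
        # characters produce a large but harmless ratio.
        if target is None or len(source) < min_chars:
            continue
        ratio = len(target) / len(source)
        if ratio > max_ratio:
            outliers[element_id] = round(ratio, 2)
    return outliers
```

The flagged IDs give you a short list of table cells and narrow columns to inspect in the rendered PDF first.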

Complex layout edge cases are the second limitation. Overlapping text boxes, embedded SVG charts with text labels, and PDFs with non-standard encoding may produce imperfect renders. The structural extraction step covers standard PDF text elements well, but edge-case layouts require manual review of the rendered output before you sign off on the pipeline for a given document class.

Try It Now

Sign up for Foxit’s free Developer plan and a Straker AI account, grab credentials for both, and run the pipeline from the section above against a real document. An invoice, a multi-page contract, or a regulatory filing works well for testing because each has tables, mixed-column layouts, and high-stakes text segments.

After the render completes, verify four things in the output PDF:

  • Tables retain cell structure
  • Multi-column text flows correctly in the target language
  • Images remain in their original positions
  • Fonts render correctly for the target script

Cross-reference the confidence scores from Straker against the rendered segments to calibrate your production thresholds. You may find that legal terminology in German warrants a 0.90 auto-accept threshold while product description text in French is fine at 0.80.
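One simple way to turn that cross-referencing into a number is sketched below. It assumes you have a sample of segments where a reviewer marked each machine translation as acceptable or not; lowest_safe_threshold and the 0.98 target precision are illustrative choices, not values from Foxit or Straker.

```python
def lowest_safe_threshold(samples, target_precision=0.98, candidates=None):
    """samples: list of (score, reviewer_ok) pairs from a reviewed batch.

    Returns the lowest candidate threshold at which the share of
    reviewer-approved segments among those scoring at or above it reaches
    target_precision, or None if no candidate qualifies.
    """
    if candidates is None:
        # 0.50, 0.55, ..., 0.95
        candidates = [round(0.50 + 0.05 * i, 2) for i in range(10)]
    for threshold in sorted(candidates):
        auto_accepted = [ok for score, ok in samples if score >= threshold]
        if auto_accepted and sum(auto_accepted) / len(auto_accepted) >= target_precision:
            return threshold
    return None
```

Run it separately per language pair and document class; small samples give unstable answers, so treat the result as a starting point for review rather than a final setting.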

The complete Foxit Translation Trial API reference covers the full parameter list and response schema for preprocess-pdf and render-pdf. The Foxit Structural Extraction Trial API reference documents the structural extract endpoint. Straker’s translation and scoring API documentation lives at straker.ai/ai-platform/verify.

Looking ahead, Straker’s dashboard lists a native Foxit integration as Coming Soon (no release date announced at the time of writing), described as a workflow to translate PDF contracts with Foxit, verify them with experts, and finalize them for signing. When it ships, it’s likely to compress several of the manual steps above into a single call. The underlying mechanics (structural extract, translation, per-segment scoring, routing, render) will remain the same logical stages, so the pipeline you build today stays a useful mental model for reasoning about the native version when it arrives.

For production-scale implementation patterns and how Straker’s translation and verification layer integrates into enterprise localization pipelines, register for the upcoming joint Foxit + Straker.ai webinar with Lee Konstanty from Straker.

PDF Translation API FAQ

A PDF translation API with confidence scoring is a service that translates PDF documents and returns a per-segment quality signal alongside each translation. Instead of handing back a single translated file, the API tells you which segments are high-confidence (safe to auto-accept), which are medium-confidence (queue for human review), and which are low-confidence (escalate to a translator). This pipeline combines Foxit’s PDF Translation Trial API for structural extraction and layout-preserving rendering with Straker.ai for translation and scoring in a single call.

The pipeline runs in seven steps: upload the source PDF to Foxit, run structural extraction to get element-ID-keyed text, preprocess to produce ExtractedText.json and StructureInfo.json, send segments to Straker AI’s “AI Translation and Quality Evaluation” workflow which returns translated text plus a 0.0–1.0 score and a quality label, route each segment programmatically by score, then call Foxit’s render endpoint to rebuild the PDF in the original layout. Foxit owns PDF structure, Straker owns translation and scoring, and your code owns the routing decision.

For marketing copy, raw machine translation output is usually fine. For contracts, medical forms, clinical trial protocols, or regulatory filings, a 95%-accurate translation can still be legally dangerous because the 5% failure may land on a high-stakes clause — like “indemnification” rendered as “Entschädigung” (compensation) instead of “Freistellung” (hold harmless). Per-segment confidence scores let you route low-confidence segments to human reviewers before they reach the final document, producing the audit trail compliance teams need under standards like ISO 18587.

Yes, you can swap in a different translation engine, because the translation step is pluggable. The contract upstream (Foxit element IDs mapped to source strings) and downstream (a dict of element IDs to translated text feeding the render call) does not change if you swap the engine. DeepL, Google Cloud Translation, AWS Translate, or an in-house NMT engine all work. The trade-off is that Straker AI returns translation plus quality score in one call, while other engines require a separate verification step if you want confidence signals.

Foxit’s preprocess step produces two files: ExtractedText.json with element-ID-keyed text, and StructureInfo.json with the full layout blueprint (bounding boxes, font metadata, column positions, image locations). You modify only ExtractedText.json with translations and pass StructureInfo.json to the render endpoint untouched. The render engine reflows translated text within the original bounding boxes, handles font substitution for CJK and Arabic scripts, and preserves multi-column layouts, tables, and image positions — without rebuilding the PDF from scratch.

Foxit’s render endpoint accepts 23 target language codes: en, zh, zh_tw, fr, de, es, it, pt, nl, ja, ko, th, vi, hi, ru, ar, tr, pl, sv, no, nb, da, and fi. Straker AI identifies languages by UUID rather than ISO code, fetched via GET /languages. Your production language coverage is the intersection of both sets — check both APIs before finalizing your language matrix.

A reasonable starting policy for compliance-sensitive documents: auto-accept segments with quality == “best” or score >= 0.85, flag for post-edit review at 0.65–0.84 or quality in (“good”, “acceptable”), and reject for human translation at score < 0.65 or quality == “bad”. These are starting points — calibrate per language pair and domain. A French legal contract may warrant a 0.90 auto-accept threshold while a Spanish marketing brochure is fine at 0.80. Run the pipeline against a representative sample of your real documents and tune from there.

Extract Anything from Any PDF: Inside Foxit’s Advanced Extraction Engine

Foxit PDF Structural Extraction API engine extracting tables, forms, and text from scanned PDFs.

Basic PDF extraction libraries break on scanned documents, complex tables, and form fields, leaving downstream pipelines starved of clean data. Foxit’s PDF Structural Extraction API combines OCR, layout recognition, and AI parsing to return all twelve PDF element types as structured JSON, ready for RAG, BI, and CRM workflows.

Your PDF extraction pipeline passes unit tests against the sample invoices you built it on. Then production arrives and you’re looking at 47% garbled output on the Q4 contract batch because half those documents are scanned TIFFs wrapped in a PDF envelope, and your extraction library has no concept of what an image-only page actually is.

The failure modes are specific. PyMuPDF’s get_text() returns empty strings on scanned PDFs because it reads content streams directly, and image-only pages carry no text stream. pdfplumber’s table detection merges rows when column widths span non-uniform grids, which is standard in any financial statement that mixes summary and line-item rows on the same page. Embedded images containing meaningful text (stamped signatures, engineering drawing annotations, letterhead logos) get silently dropped. The library extracts coordinates for the XObject reference but does nothing with the raster data inside. Form fields built on non-standard annotation types (AcroForms using widget annotations with custom action streams) lose their values entirely when you serialize to text.
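The first failure mode is easy to detect before it corrupts a batch. The sketch below takes per-page text you have already extracted (for example with PyMuPDF’s page.get_text()) and reports which pages came back effectively empty. pages_needing_ocr and the min_chars cutoff are illustrative, and an empty page may also simply be blank, so treat the result as a candidate list.

```python
def pages_needing_ocr(page_texts, min_chars: int = 1):
    """Return zero-based indices of pages whose extracted text has fewer
    than min_chars non-whitespace characters."""
    flagged = []
    for index, text in enumerate(page_texts):
        visible = "".join(text.split())   # drop all whitespace
        if len(visible) < min_chars:
            flagged.append(index)
    return flagged

# Example wiring, assuming PyMuPDF is installed:
#   import fitz
#   texts = [page.get_text() for page in fitz.open("contract.pdf")]
#   print(pages_needing_ocr(texts))
```

Routing only the flagged pages to an OCR-capable extractor avoids paying the OCR cost on pages that already have a text layer.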

The architectural distinction that creates this problem is the difference between content serialization and semantic extraction. A PDF converter reads a content stream and writes out whatever character sequences it finds in rendering order. An extraction engine understands the spatial relationships between those character sequences: that two columns of text at x=72 and x=320 are parallel body copy, that the row at y=210 belongs to the table starting at y=180, that the text block repeating on every page is a header carrying lower retrieval weight in a RAG index. Output that lacks spatial and semantic classification looks correct on screen but breaks every downstream consumer that depends on structure.
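To illustrate what spatial awareness adds, here is a deliberately naive column-grouping heuristic. It is not Foxit’s algorithm, only a sketch of the idea: blocks whose left edges sit within a tolerance of each other are treated as one column. The block shape (dicts with x0, y0, text), the 20-point tolerance, and a coordinate system where y0 grows downward are all assumptions.

```python
def group_into_columns(blocks, tolerance: float = 20.0):
    """Group text blocks into columns by left edge (x0), then order each
    column top to bottom by y0. Returns a list of columns, left to right,
    each a list of block texts."""
    columns = []  # each entry: {"x0": left edge of first block, "blocks": [...]}
    for block in sorted(blocks, key=lambda b: b["x0"]):
        if columns and abs(block["x0"] - columns[-1]["x0"]) <= tolerance:
            columns[-1]["blocks"].append(block)
        else:
            columns.append({"x0": block["x0"], "blocks": [block]})
    return [
        [b["text"] for b in sorted(col["blocks"], key=lambda b: b["y0"])]
        for col in columns
    ]
```

A content-stream serializer would emit these blocks in drawing order; grouping by position first is what recovers a sensible reading order for two parallel columns at x=72 and x=320. Real documents need far more than this (tables, spanning headings, rotated text), which is the gap a dedicated extraction engine fills.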

BI dashboards require numbers tied to the right row labels. AI ingestion pipelines require heading hierarchy to chunk accurately. CRMs require form field values extracted from AcroForm widget dictionaries, delivered with field names intact. The delta between what basic extraction libraries return and what those systems can actually consume is where document pipeline engineering hours accumulate.

How Foxit’s PDF Structural Extraction Engine Works Under the Hood

Foxit exposes this capability as the PDF Structural Extraction (Trial) endpoint inside the PDF Services API (POST /pdf-services/api/documents/pdf-structural-extract). Trial status means the schema is versioned at v1.0.7 and may evolve, but the contract is stable enough to build against today, and the endpoint runs against the production base URL at developer-api.foxit.com.

The engine runs three coordinated layers. The OCR layer operates on rasterized page content, recognizing characters from image-based PDFs and scanned documents across 200+ languages. The layout recognition layer applies spatial analysis to identify column boundaries, reading order, table cell boundaries, figure regions, and header/footer zones. The AI-based parsing layer classifies extracted objects semantically, resolving ambiguous blocks (a text run that spans two layout columns, or a figure caption that reads syntactically like a section heading) into typed elements.

All three layers run inside Foxit’s core PDF engine, which powers 700 million+ users across 20+ years of production deployments. That engine has native awareness of PDF internal structures: content streams, XObject dictionaries, AcroForm field trees, and annotation layers. The OCR layer operates on the same internal page representation the rendering engine uses, so it handles annotated PDFs where text overlaps image regions, and form fields where the visual display and stored value diverge.

The same Structural Extraction endpoint is also Step 1 of Foxit’s PDF Translation (Trial) workflow, which signals that the extraction output is structured enough to backbone a full rewrite-and-rerender pipeline.

NVIDIA’s July 2025 NeMo Retriever research on PDF extraction showed that specialized OCR-based pipelines outperform general-purpose vision-language models on retrieval recall and throughput for complex elements including tables, charts, and infographics. VLMs produce plausible-looking output on clean documents but degrade on exactly the edge cases (multi-column scans, mixed-content pages, annotated overlays) that a specialized pipeline handles systematically.

The Full Object Map: All 12 Extractable PDF Element Types

The Structural Extraction schema v1.0.7 defines twelve element types in the type enum: title, head, paragraph, table, image, headerFooter, form, hyperlink, footnote, sidebar, annotation, and formula.

The API exposes no per-object filter parameters. The only request body fields are documentId (required) and password (optional, for protected PDFs). The engine extracts the full element graph and returns everything in one asynchronous round-trip. You filter client-side on the returned JSON. The design is correct for the workload because partial extraction would require re-running layout recognition per request, costing more compute than transmitting the full element set in a single ZIP.

The result is a ZIP archive. At minimum it contains StructureInfo.json, whose top-level analyzeResult object holds version, pages, elements, and info. Documents that contain figures or tables also produce additional binary files (image renditions and table renditions) alongside the JSON, referenced from individual elements so the JSON payload stays manageable on large documents.

Each element in the document-wide flat elements array carries its own id, type, content, region (with page and an 8-point boundingBox polygon), and a score confidence value. A table element adds its cell grid. A form element adds field data. An image element points to its binary file in the ZIP. Because title, head, and paragraph elements appear in document reading order in the elements array, they chunk cleanly on semantically correct boundaries, which is what a RAG index needs to return complete, coherent passages.

Each type maps directly to a downstream use case: table feeds financial reporting pipelines, form drives automated CRM data entry, image routes to computer vision workflows or document archives, annotation builds compliance audit trails, and head combined with paragraph elements in reading order feeds RAG ingestion.
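
The type-to-use-case mapping above can be sketched as a small client-side router. This is a minimal illustration, not part of the Foxit API: the ROUTES table and its destination labels are hypothetical names chosen for this example, and the only assumption about the payload is that each element carries a type field.

```python
from collections import defaultdict

# Hypothetical routing table: element type -> downstream destination label.
# The labels are illustrative; swap in whatever your pipeline calls them.
ROUTES = {
    "table": "financial_reporting",
    "form": "crm_data_entry",
    "image": "vision_or_archive",
    "annotation": "compliance_audit",
    "head": "rag_ingestion",
    "paragraph": "rag_ingestion",
}


def route_elements(elements):
    """Group elements by destination, preserving the order they arrived in."""
    routed = defaultdict(list)
    for element in elements:
        destination = ROUTES.get(element["type"])
        if destination is not None:  # element types without a route are skipped
            routed[destination].append(element)
    return dict(routed)


sample = [
    {"id": "head1", "type": "head"},
    {"id": "p1", "type": "paragraph"},
    {"id": "t1", "type": "table"},
    {"id": "hf1", "type": "headerFooter"},
]
routed = route_elements(sample)
print({dest: [e["id"] for e in items] for dest, items in routed.items()})
```

Because the elements array is already in reading order, keeping insertion order inside each bucket is enough to preserve it for the RAG destination.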

API Walkthrough: The Four-Step Async PDF Extraction Flow

There’s no synchronous path. You upload, get a task ID, poll until completion, then download the result ZIP. Every request carries two headers: client_id and client_secret (lowercase snake_case, as specified in the API spec’s security schemes). Both come from the Developer Portal’s default application. Pass them as named HTTP headers on every request and do not use Authorization: Bearer.

The four-step sequence runs as follows:

Four-step PDF structural extraction API flow between client and Foxit PDF Services. 

To get the client_id and client_secret pair, create a free developer account at account.foxit.com/site/sign-up (no credit card required, no sales call). Once you’re in, the credentials live under the default application in the Developer Portal. Copy the Client ID and Client Secret and treat them like any other API secret.

  • Step 1: Upload the PDF to POST /pdf-services/api/documents/upload as multipart/form-data with the file under field name file. The 100MB ceiling is enforced with a 413 and error code MAX_UPLOAD_SIZE_EXCEEDED. The response body returns { "documentId": "doc_abc123" }.

  • Step 2: Start extraction with POST /pdf-services/api/documents/pdf-structural-extract, passing { "documentId": "doc_abc123" }. Add a "password" field for protected PDFs. The response is 202 Accepted with { "taskId": "task_xyz789" }.

  • Step 3: Poll GET /pdf-services/api/tasks/{task-id}. The TaskResponse carries taskId, status, progress (0-100 integer), resultDocumentId, and an optional error object. The status enum values are PENDING, IN_PROGRESS, COMPLETED, and FAILED. Portal narrative copy occasionally uses “PROCESSING,” but the schema enum value is IN_PROGRESS. Match your code against the enum. Poll until COMPLETED and capture resultDocumentId.

  • Step 4: Download with GET /pdf-services/api/documents/{resultDocumentId}/download, which streams the ZIP archive. The optional filename query parameter overrides the default filename.

The complete cURL sequence for all four steps: 

# Step 1: Upload
curl -X POST "https://na1.fusion.foxit.com/pdf-services/api/documents/upload" \
  -H "client_id: YOUR_CLIENT_ID" \
  -H "client_secret: YOUR_CLIENT_SECRET" \
  -F "file=@invoice_batch.pdf"

# {"documentId":"doc_abc123"}

# Step 2: Start extraction
curl -X POST "https://na1.fusion.foxit.com/pdf-services/api/documents/pdf-structural-extract" \
  -H "client_id: YOUR_CLIENT_ID" \
  -H "client_secret: YOUR_CLIENT_SECRET" \
  -H "Content-Type: application/json" \
  -d '{"documentId":"doc_abc123"}'

# 202 Accepted: {"taskId":"task_xyz789"}

# Step 3: Poll task status
curl "https://na1.fusion.foxit.com/pdf-services/api/tasks/task_xyz789" \
  -H "client_id: YOUR_CLIENT_ID" \
  -H "client_secret: YOUR_CLIENT_SECRET"

# {"taskId":"task_xyz789","status":"COMPLETED","progress":100,"resultDocumentId":"result_def456"}

# Step 4: Download the result ZIP
curl "https://na1.fusion.foxit.com/pdf-services/api/documents/result_def456/download" \
  -H "client_id: YOUR_CLIENT_ID" \
  -H "client_secret: YOUR_CLIENT_SECRET" \
  -o extraction_result.zip

The Python version with a polling loop and ZIP parsing:

import requests, json, time, zipfile
BASE_URL = "https://na1.fusion.foxit.com/pdf-services/api"
HEADERS  = {"client_id": "YOUR_CLIENT_ID", "client_secret": "YOUR_CLIENT_SECRET"}

# Step 1: Upload
with open("invoice_batch.pdf", "rb") as f:
    doc_id = requests.post(
        f"{BASE_URL}/documents/upload", headers=HEADERS, files={"file": f}
    ).json()["documentId"]

# Step 2: Start extraction
task_id = requests.post(
    f"{BASE_URL}/documents/pdf-structural-extract",
    headers={**HEADERS, "Content-Type": "application/json"},
    json={"documentId": doc_id},
).json()["taskId"]

# Step 3: Poll until COMPLETED or FAILED
while True:
    task = requests.get(f"{BASE_URL}/tasks/{task_id}", headers=HEADERS).json()
    if task["status"] == "COMPLETED":
        result_doc_id = task["resultDocumentId"]
        break
    if task["status"] == "FAILED":
        raise RuntimeError(f"Extraction failed: {task.get('error')}")
    time.sleep(2)

# Step 4: Download the result ZIP and save it locally for inspection,
# then parse StructureInfo.json from the saved file
response = requests.get(
    f"{BASE_URL}/documents/{result_doc_id}/download", headers=HEADERS
)
with open("advanced-extraction-result.zip", "wb") as f:
    f.write(response.content)

with zipfile.ZipFile("advanced-extraction-result.zip") as zf:
    json_name = next(n for n in zf.namelist() if n.endswith("StructureInfo.json"))
    result = json.loads(zf.read(json_name))["analyzeResult"]

print(f"Schema: {result['version']['schema']}, Elements: {len(result['elements'])}")

On a clean run you should see output like Schema: 1.0.7, Elements: 9 for a small invoice batch. You’ll also find a fresh advanced-extraction-result.zip next to your script. That ZIP holds the full API response, including StructureInfo.json and any rendered image or table binaries, so you can inspect everything the engine returned and not just the parsed JSON.
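
The polling loop in the sample runs until the task finishes, however long that takes. For anything beyond a quick test, a bounded variant may be safer. The sketch below is one way to do that, under stated assumptions: poll_task and its parameters are hypothetical names, the timeout and backoff values are arbitrary, and fetch_task stands in for the GET /tasks/{taskId} call so the logic can run without network access.

```python
import time


def poll_task(fetch_task, timeout_s=300.0, initial_delay_s=2.0, max_delay_s=15.0,
              sleep=time.sleep, clock=time.monotonic):
    """Poll until the task is COMPLETED; raise on FAILED or when time runs out.

    fetch_task is any zero-argument callable that returns the task dict, e.g.
    lambda: requests.get(url, headers=HEADERS, timeout=30).json()
    """
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        task = fetch_task()
        status = task.get("status")
        if status == "COMPLETED":
            return task["resultDocumentId"]
        if status == "FAILED":
            raise RuntimeError(f"Extraction failed: {task.get('error')}")
        if clock() + delay > deadline:
            raise TimeoutError(f"Task still {status} after {timeout_s}s")
        sleep(delay)
        delay = min(delay * 1.5, max_delay_s)  # back off gently between polls


# Demo with canned responses standing in for the real task endpoint.
canned = iter([
    {"status": "PENDING", "progress": 0},
    {"status": "IN_PROGRESS", "progress": 60},
    {"status": "COMPLETED", "progress": 100, "resultDocumentId": "result_def456"},
])
result_doc_id = poll_task(lambda: next(canned), sleep=lambda seconds: None)
print(result_doc_id)
```

Injecting sleep and clock keeps the helper testable; in production you would leave both at their defaults.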

First, set up and activate a Python virtual environment in your project folder. The official venv guide covers the exact commands for macOS, Linux, and Windows.

Once the virtualenv is active, the sample only needs one third-party package. Drop this into a requirements.txt next to your script and install it with pip install -r requirements.txt:

requests>=2.31.0

If you’re on macOS, use Homebrew Python (brew install python) rather than the system Python from the Xcode command-line tools. The Xcode build is linked against LibreSSL, which is enough to make a correct sample fail.

The ZIP contains a StructureInfo.json file whose top-level object wraps everything under analyzeResult. Inside that wrapper you get a version object, a pages array, a flat elements array, and an info block with analysis metadata. Each element carries its own id, type, content, region (with page and an 8-point boundingBox polygon [x1,y1,x2,y2,x3,y3,x4,y4]), and a score confidence value:

{
  "analyzeResult": {
    "version": {
      "schema": "1.0.7",
      "software": "FoxitPDFAnalyzer",
      "model": "idp-analysis"
    },
    "pages": [
      {
        "pageNumber": 1,
        "size": { "width": 612, "height": 792, "unit": "point" },
        "state": "success"
      }
    ],
    "elements": [
      {
        "id": "title1",
        "type": "title",
        "content": {
          "text": "Q3 Revenue Summary",
          "style": {
            "fontName": "Helvetica",
            "fontSize": 24.0,
            "fontWeight": 0,
            "fontItalic": false
          }
        },
        "region": {
          "page": 1,
          "boundingBox": [72, 47, 317, 47, 317, 80, 72, 80]
        },
        "score": 0.76
      }
    ],
    "info": {
      "basicInfo": {
        "softwareVersion": "1.6.0",
        "analyzedPageCount": 1,
        "elementCounts": { "title": 1 }
      },
      "extendedMetadata": {
        "pageCount": 1,
        "isEncrypted": false,
        "hasAcroform": false,
        "language": "en"
      }
    }
  }
}

Elements of type table, image, and form carry additional type-specific payload on top of this base shape, and any rendered image or table binary lands as a sibling file inside the ZIP referenced from the element.
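
Many downstream consumers want an axis-aligned rectangle rather than an 8-point polygon. As a minimal sketch, the helper below collapses the [x1,y1,x2,y2,x3,y3,x4,y4] layout described above into (x_min, y_min, x_max, y_max). The function name is hypothetical, and the sketch makes no assumption about whether the page origin is top-left or bottom-left, so check that against your own output before mapping boxes onto rendered pages.

```python
def polygon_to_rect(bounding_box):
    """Collapse an 8-value polygon [x1,y1,...,x4,y4] into (x_min, y_min, x_max, y_max)."""
    if len(bounding_box) != 8:
        raise ValueError(f"expected 8 values, got {len(bounding_box)}")
    xs = bounding_box[0::2]  # even positions hold x coordinates
    ys = bounding_box[1::2]  # odd positions hold y coordinates
    return min(xs), min(ys), max(xs), max(ys)


# The title element from the sample payload above.
rect = polygon_to_rect([72, 47, 317, 47, 317, 80, 72, 80])
width, height = rect[2] - rect[0], rect[3] - rect[1]
print(rect, width, height)
```

For rotated or skewed regions the rectangle is only an enclosing box, so keep the original polygon if you need the exact shape.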

HTTP errors return a standard error envelope:

{ "code": "VALIDATION_ERROR", "message": "documentId is required" }

The documented error codes include VALIDATION_ERROR (400), MAX_UPLOAD_SIZE_EXCEEDED (413), DOCUMENT_NOT_FOUND (404), STORAGE_ERROR, and INTERNAL_SERVER_ERROR (500).

Password-protected PDFs that arrive with no password parameter reach the processing stage before failing. That failure surfaces in the task status poll response after status reaches FAILED, so your error handler must inspect the task response body in addition to the HTTP status codes from the initial POST calls:

{
  "taskId": "task_xyz789",
  "status": "FAILED",
  "progress": 0,
  "error": {
    "code": "INTERNAL_SERVER_ERROR",
    "message": "Document is password-protected"
  }
}
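
Since failures can surface either as an HTTP error envelope or inside a FAILED task body, it can help to funnel both into one exception type. The sketch below shows one possible shape. FoxitApiError, check_http_response, and check_task are hypothetical names for this example; the only fields relied on are the code, message, status, and error fields shown in the payloads above.

```python
class FoxitApiError(Exception):
    """Hypothetical exception type covering both failure paths."""

    def __init__(self, code, message, http_status=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.http_status = http_status  # None when the failure came from a task body


def check_http_response(status_code, body):
    """Raise if an HTTP response carries the error envelope; otherwise return the body."""
    if status_code >= 400:
        raise FoxitApiError(body.get("code", "UNKNOWN"), body.get("message", ""), status_code)
    return body


def check_task(task):
    """Raise if a polled task reports FAILED; otherwise return the task unchanged."""
    if task.get("status") == "FAILED":
        error = task.get("error") or {}
        raise FoxitApiError(error.get("code", "UNKNOWN"), error.get("message", ""))
    return task


failed_task = {
    "taskId": "task_xyz789",
    "status": "FAILED",
    "progress": 0,
    "error": {"code": "INTERNAL_SERVER_ERROR", "message": "Document is password-protected"},
}
try:
    check_task(failed_task)
except FoxitApiError as exc:
    print(f"Task failed: {exc}")
```

With this in place, a caller can catch one exception type and branch on http_status to tell a rejected request from a failed extraction.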

Wiring Extracted PDF Data Into Your Workflow

Pattern 1: AI/RAG pipeline. Filter the flat elements array to title, head, and paragraph types. Chunk by heading hierarchy, iterating over the array in the order the engine returned it (document reading order is preserved across columns and pages). Embed each chunk and index in Pinecone, pgvector, or your vector store of choice. Correct reading order, as provided by the extraction engine, is the prerequisite for accurate RAG retrieval on multi-column and paginated documents. When chunks split mid-thought because a layout detector merged two columns, retrieval recall drops and answer quality follows.
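
A minimal sketch of the chunking step in Pattern 1 follows. It is deliberately flat: every title or head element starts a new chunk and the paragraphs that follow are appended to it, so it does not model nested heading levels. It assumes head and paragraph elements expose their text at content.text the way the title sample does, and chunk_by_heading is a hypothetical helper name. Embedding and indexing are left out.

```python
def chunk_by_heading(elements):
    """Group paragraph text under the most recent title/head, in reading order."""
    chunks = []
    current = None
    for element in elements:
        kind = element["type"]
        if kind not in ("title", "head", "paragraph"):
            continue  # tables, images, and other types are handled elsewhere
        text = element["content"]["text"]
        if kind in ("title", "head"):
            current = {"heading": text, "paragraphs": []}
            chunks.append(current)
        else:
            if current is None:  # body text that appears before any heading
                current = {"heading": None, "paragraphs": []}
                chunks.append(current)
            current["paragraphs"].append(text)
    # Headings with no body text produce no chunk.
    return [
        {"heading": c["heading"], "text": "\n".join(c["paragraphs"])}
        for c in chunks
        if c["paragraphs"]
    ]


sample_elements = [
    {"type": "title", "content": {"text": "Q3 Revenue Summary"}},
    {"type": "paragraph", "content": {"text": "Revenue grew year over year."}},
    {"type": "table", "content": {}},
    {"type": "head", "content": {"text": "Outlook"}},
    {"type": "paragraph", "content": {"text": "Guidance is unchanged."}},
    {"type": "paragraph", "content": {"text": "Risks remain in supply."}},
]
rag_chunks = chunk_by_heading(sample_elements)
for chunk in rag_chunks:
    print(chunk["heading"], "->", len(chunk["text"]), "chars")
```

If your documents rely on nested sections, you would extend this with a heading stack, which requires a level signal the sketch does not assume exists in the schema.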

Pattern 2: BI reporting. Filter elements by type == "table" client-side, then convert each table’s cell structure into a pandas DataFrame:

import pandas as pd

# `result` is the `analyzeResult` object loaded from StructureInfo.json
tables = [e for e in result["elements"] if e["type"] == "table"]

for i, tbl in enumerate(tables):
    # Cells live at content.body.cells[]. Each cell carries rowIndex,
    # columnIndex, and a nested paragraph whose content.text holds the value.
    body = tbl["content"]["body"]
    grid = [["" for _ in range(body["columnCount"])] for _ in range(body["rowCount"])]
    for cell in body.get("cells", []):
        text = cell.get("paragraph", {}).get("content", {}).get("text", "")
        grid[cell["rowIndex"]][cell["columnIndex"]] = text
    df = pd.DataFrame(grid[1:], columns=grid[0])  # first row as header
    print(f"Table {i}: {df.shape[0]} rows x {df.shape[1]} cols")
    # df.to_gbq("finance.q3_revenue", project_id="your-project")  # BigQuery
    # df.to_sql("q3_revenue", engine)                             # Postgres / Snowflake

The row and column indices from the extraction schema map directly to DataFrame positions, so you get a correctly-structured table with zero manual parsing.

Pattern 3: n8n automation. The four-step flow maps to a chain of HTTP Request nodes in n8n. The first node uploads to POST .../upload and passes documentId through the item. The second sends POST .../pdf-structural-extract and captures taskId. A Loop Over Items construct with an HTTP Request node calling GET .../tasks/{taskId} on a two-second interval checks status until COMPLETED, then routes to the download node. The final HTTP Request node calls GET .../documents/{resultDocumentId}/download, and a Code node using n8n’s binary data helpers unpacks the ZIP and parses the JSON for routing to a Salesforce, HubSpot, Postgres, or Airtable node. The polling requirement makes this a multi-node workflow, but you write zero custom glue code and gain n8n’s built-in error routing and retry handling.

PDF Extraction Tools Compared: Foxit vs. Adobe, Google, Amazon, and Azure

| Tool | Underlying Approach | Ecosystem Lock-in | Handles Scanned PDFs | Pricing Model | Setup Overhead | Status |
| --- | --- | --- | --- | --- | --- | --- |
| Foxit Structural Extraction | Proprietary OCR + layout recognition + AI (integrated core engine) | Cloud-agnostic REST API | Yes (dedicated OCR layer) | Subscription, no per-page credits | Low (2 credential headers, 4 REST calls) | Trial (schema v1.0.7) |
| Adobe PDF Extract API | Adobe Sensei ML, reading order + renditions | Adobe Document Services | Yes | Contact sales | Medium (Adobe SDK + ecosystem) | GA |
| Google Document AI | Cloud ML + generative AI, Document Object Model | Google Cloud required | Yes | Per-page pay-as-you-go | Medium-high (GCP + IAM) | GA |
| Amazon Textract | Deep learning OCR, key-value and table extraction | AWS-native | Partial (strong on forms, weaker on complex layouts) | Per-page pay-as-you-go | Medium (AWS + IAM) | GA |
| Azure Document Intelligence | Prebuilt + custom ML models | Azure ecosystem | Yes (prebuilt models) | Per-page + model training costs | High for custom models | GA |

Google Document AI and Azure Document Intelligence win on ecosystem integration if you’re all-in on those clouds. Adobe wins on PDF structural fidelity for workflows already inside the Adobe Document Services ecosystem. Amazon Textract excels on standardized form documents where its pre-trained schema fits the input. These are real advantages, and the comparison is honest only when those contexts are acknowledged.

Foxit’s case is strongest when you need a cloud-agnostic REST API with zero ecosystem dependency, full object coverage across all twelve element types, and enterprise throughput (10 to 10,000+ PDFs/day) with SOC 2, GDPR, and HIPAA compliance built in. The endpoint’s Trial status is a real trade-off to factor in. The schema at v1.0.7 is callable and stable enough for pipeline integration today, but GA competitors carry a finalized contract. Pin your parser to the version field in the response and you’re insulated from schema evolution.
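
Pinning the parser to the version field can be as small as a guard at load time. The sketch below is one way to do it, assuming the analyzeResult.version.schema path shown in the sample payload; SUPPORTED_SCHEMAS and load_analyze_result are hypothetical names.

```python
# Schema versions this parser has been tested against (extend as you validate new ones).
SUPPORTED_SCHEMAS = {"1.0.7"}


def load_analyze_result(structure_info):
    """Return the analyzeResult object, refusing schema versions we have not tested."""
    result = structure_info["analyzeResult"]
    schema = result["version"]["schema"]
    if schema not in SUPPORTED_SCHEMAS:
        raise ValueError(
            f"Unsupported schema {schema!r}; tested versions: {sorted(SUPPORTED_SCHEMAS)}"
        )
    return result


result = load_analyze_result(
    {"analyzeResult": {"version": {"schema": "1.0.7"}, "elements": []}}
)
print(result["version"]["schema"])
```

Failing loudly on an unknown version means a schema change shows up as an explicit error rather than as silently misparsed tables downstream.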

Your First PDF Extraction API Call, Right Now

Go to developer-api.foxit.com, create a free developer account (no credit card required), and copy your Client ID and Client Secret from the default application. Use the built-in API Playground or import the Postman collection from the Developer Portal to run the four-step sequence: upload a real document (an invoice, a multi-page contract, or a scanned form), call pdf-structural-extract with the returned documentId, poll tasks/{taskId} until COMPLETED, then download via documents/{resultDocumentId}/download.

Unzip the result, open StructureInfo.json, and check three things: analyzeResult.version.schema should report 1.0.7; analyzeResult.elements[] should contain at least one table element and one form element if your source document includes those; and the ZIP root should contain the corresponding binary files for any image-type elements. That verification confirms the full extraction pipeline is wired correctly end-to-end.
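
The three checks above can be scripted rather than eyeballed. The sketch below summarizes a result archive from its raw bytes. summarize_result_zip is a hypothetical helper, and the demo builds a stand-in archive in memory with an invented image file name, so it says nothing about how the real service names its binary files.

```python
import io
import json
import zipfile
from collections import Counter


def summarize_result_zip(zip_bytes):
    """Report the schema version, element counts by type, and non-JSON members."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        names = zf.namelist()
        json_name = next(n for n in names if n.endswith("StructureInfo.json"))
        result = json.loads(zf.read(json_name))["analyzeResult"]
    return {
        "schema": result["version"]["schema"],
        "type_counts": dict(Counter(e["type"] for e in result["elements"])),
        "other_files": [n for n in names if n != json_name and not n.endswith("/")],
    }


# Build a tiny stand-in archive so the sketch runs without calling the API.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("StructureInfo.json", json.dumps({
        "analyzeResult": {
            "version": {"schema": "1.0.7"},
            "elements": [{"type": "table"}, {"type": "form"}, {"type": "image"}],
        }
    }))
    zf.writestr("image_1.png", b"placeholder bytes")  # hypothetical member name
summary = summarize_result_zip(buffer.getvalue())
print(summary)
```

Against a real download, pass response.content from the Step 4 request and compare type_counts with what you know is in the source document.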

The same endpoint pattern scales to enterprise volumes. Increase upload and poll concurrency horizontally and the architecture stays identical, with no schema changes, no infrastructure modifications, and no per-page credit consumption to track.
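
One way to raise concurrency on the client side is a thread pool around the per-document flow. The sketch below assumes you have wrapped the four steps in a single function, here called extract_one, which is a hypothetical name, as are extract_many and the default worker count. It does not account for any rate limits the service may apply.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def extract_many(paths, extract_one, max_workers=8):
    """Run extract_one(path) for every path in parallel; collect results and failures."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(extract_one, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:  # keep one bad document from stopping the batch
                failures[path] = exc
    return results, failures


def fake_extract(path):
    """Stand-in for the real upload/extract/poll/download flow."""
    if not path.endswith(".pdf"):
        raise ValueError(f"not a PDF: {path}")
    return {"source": path, "elements": []}


done, failed = extract_many(["a.pdf", "b.pdf", "notes.txt"], fake_extract, max_workers=2)
print(sorted(done), sorted(failed))
```

Threads suit this workload because each worker spends most of its time waiting on HTTP responses rather than computing.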

The engineering gap between what basic extraction libraries return and what downstream systems actually consume is where document pipeline hours accumulate. Structural Extraction closes that gap at the API layer, so the complexity stays in the engine and out of your codebase. Get started at developer-api.foxit.com.

PDF Structural Extraction FAQ

What is PDF structural extraction?

PDF structural extraction is the process of identifying and classifying the semantic elements inside a PDF, such as titles, paragraphs, tables, forms, images, and annotations, rather than just pulling raw text. Foxit’s PDF Structural Extraction API returns twelve distinct element types as structured JSON, preserving spatial relationships, reading order, and table cell grids so downstream systems like RAG pipelines, BI dashboards, and CRMs can consume the data without manual parsing.

Can Foxit extract data from scanned PDFs?

Yes. Foxit’s PDF Structural Extraction engine includes a dedicated OCR layer that recognizes characters from image-based and scanned PDFs across 200+ languages. The OCR runs on the same internal page representation as the rendering engine, so it handles edge cases like text overlapping image regions, stamped signatures, and engineering drawing annotations that basic libraries like PyMuPDF silently drop.

How does Foxit compare to Adobe, Google, Amazon, and Azure for PDF extraction?

Foxit’s API is cloud-agnostic with no ecosystem lock-in, requiring just two credential headers and four REST calls. Adobe PDF Extract requires the Adobe Document Services ecosystem, Google Document AI requires GCP and IAM setup, and Amazon Textract requires AWS infrastructure. Foxit also uses subscription-based pricing without per-page credits, while Google, AWS, and Azure all charge per page.

Which element types does the API extract?

The API identifies twelve element types: title, head, paragraph, table, image, headerFooter, form, hyperlink, footnote, sidebar, annotation, and formula. Each element returns with its content, an 8-point bounding box polygon, page location, and a confidence score. Tables include full cell grids with row and column indices, forms include field data, and images are extracted as separate binary files inside the result ZIP.

How do you call the PDF Structural Extraction API?

The API uses a four-step asynchronous flow: upload the PDF via POST /documents/upload to get a documentId, start extraction with POST /documents/pdf-structural-extract, poll GET /tasks/{taskId} every two seconds until status is COMPLETED, then download the result ZIP via GET /documents/{resultDocumentId}/download. Authentication uses two headers, client_id and client_secret, available from the default application in the Foxit Developer Portal.

Is the PDF Structural Extraction endpoint ready for production use?

The endpoint is currently in Trial status with schema version v1.0.7, meaning the contract is stable but may evolve. It runs on the production base URL at developer-api.foxit.com and is built on Foxit’s core PDF engine, which powers 700 million+ users across 20+ years of deployments. For production pipelines, pin your parser to the version field in the response to insulate against future schema changes.

Foxit MCP Server: Give AI Agents Direct Access to 30+ PDF Tools via Model Context Protocol

Foxit PDF API MCP Server architecture connecting AI agents to 30+ PDF tools, eSign, and DocGen workflows via Model Context Protocol.

Learn how the Foxit MCP Server lets AI agents handle PDF conversion, OCR, merge, signing, and document workflows.

Building a document automation agent with raw REST calls means writing the same boilerplate every time: upload a file, poll for task completion, download the result, handle errors, and manage auth tokens across multiple endpoints. For PDF operations, that loop repeats for every conversion, OCR call, or merge operation in your pipeline. The Foxit PDF API MCP Server collapses those loops into 30+ directly callable tools, with the MCP Server handling upstream REST complexity internally.

This guide covers how the server registers, what it exposes, how Foxit’s eSign and DocGen REST APIs extend the same agent session into signing and document generation workflows, and a concrete four-step workflow you can replicate against your own documents.

MCP Architecture in 90 Seconds

The MCP specification defines three roles. The Host is the LLM runtime (Claude Desktop, VS Code with GitHub Copilot, or Cursor) that manages the conversation and decides when to call tools. The Server is the capability provider, a process that advertises tools over the MCP protocol and executes them against some underlying service. Tools are the individual callable operations each server exposes, defined by a JSON schema the host uses to understand inputs and outputs.

Foxit occupies both sides of this architecture. Foxit PDF Editor ships as an MCP Host, the first PDF application to do so, connecting outward to external MCP servers like Gmail or Salesforce so its AI assistant can reach those services. The Foxit PDF API MCP Server works in the other direction, exposing Foxit’s cloud PDF Services API as 30+ tools for any MCP Host to call.

The MCP Server exposes PDF Services operations covering conversion between formats, content extraction, OCR, merge, split, compress, flatten, linearize, compare, watermark, form data import/export, security, and property inspection. Foxit’s eSign API and DocGen API are separate REST services that are not part of the MCP Server, so they are not exposed as MCP tools. A single agent workflow can still reach them, but through the agent’s own code-execution layer rather than through the MCP protocol, a distinction the eSign section explains in detail. The MCP tools handle PDF processing, while code the agent runs handles signing and template generation.

Flowchart showing how an MCP host such as Claude Desktop, VS Code, or Cursor connects to Foxit services along two paths.

Prerequisites and Configuration

You need three things before registering the server:

Clone the repo from github.com/foxitsoftware/foxit-pdf-api-mcp-server, then register it in your host’s MCP config. The walkthrough below uses Claude Desktop, but the same command, args, and env values work in any MCP host. In Claude Desktop, open Settings, select the Developer tab, and click Edit Config.

Claude Desktop's Settings on the Developer tab, with the Edit Config button highlighted above the Local MCP servers list.

Then open claude_desktop_config.json with any text editor (stored at ~/Library/Application Support/Claude/ on macOS or %APPDATA%\Claude\ on Windows).

The claude_desktop_config.json file open in a text editor, showing the foxit-pdf server registered under mcpServers with its command, args, and env credentials.

Add the Foxit server under the mcpServers key:

{
  "mcpServers": {
    "foxit-pdf": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/foxit-pdf-api-mcp-server",
        "run",
        "foxit-pdf-api-mcp-server"
      ],
      "env": {
        "FOXIT_CLOUD_API_HOST": "https://na1.fusion.foxit.com/pdf-services",
        "FOXIT_CLOUD_API_CLIENT_ID": "your_client_id",
        "FOXIT_CLOUD_API_CLIENT_SECRET": "your_client_secret"
      }
    }
  }
}

Set FOXIT_CLOUD_API_CLIENT_ID and FOXIT_CLOUD_API_CLIENT_SECRET as environment variables on your system before the host process launches. Passing credentials through prompt context is a security risk your production setup should address. The client_id and client_secret from your developer portal authenticate all MCP tool calls to the PDF Services API. Adding eSign to the same agent session requires its own OAuth2 token exchange (covered in the next section), keeping the two credential scopes isolated.

After saving, completely quit and reopen Claude Desktop so it loads the config and launches the server as a local subprocess over standard input and output, the transport the Foxit server uses.

Claude Desktop's Local MCP servers panel with foxit-pdf selected and marked as running, showing its npx command and arguments.

On restart, the foxit-pdf server should appear as Running under Local MCP servers in the Developer tab. To see the tools the Foxit MCP Server exposes, open the Customize tab, select Connectors, and click foxit-pdf; the 30+ registered tools should be listed.

The Connectors settings screen showing foxit-pdf tool permissions, with a scrollable list of tools like upload_document, pdf_from_word, and pdf_to_word.

If the connector never appears, the server failed to launch, and Claude’s logs at ~/Library/Logs/Claude/mcp*.log usually point to the cause, commonly a missing uv binary or a wrong --directory path.
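
Before digging through logs, a quick preflight on the config can rule out the two common causes named above. The sketch below is a hypothetical helper, not something the Foxit server or Claude Desktop provides. It reads the config text, looks up the foxit-pdf entry, and checks that the command resolves on PATH, that the --directory argument points at a real folder, and that the three env values from the example config are present.

```python
import json
import shutil
from pathlib import Path

REQUIRED_ENV = (
    "FOXIT_CLOUD_API_HOST",
    "FOXIT_CLOUD_API_CLIENT_ID",
    "FOXIT_CLOUD_API_CLIENT_SECRET",
)


def preflight(config_text, server_name="foxit-pdf"):
    """Return a list of human-readable problems found in the MCP server entry."""
    entry = json.loads(config_text).get("mcpServers", {}).get(server_name)
    if entry is None:
        return [f"no '{server_name}' entry under mcpServers"]
    problems = []
    command = entry.get("command", "")
    if shutil.which(command) is None:
        problems.append(f"command '{command}' not found on PATH")
    args = entry.get("args", [])
    if "--directory" in args:
        index = args.index("--directory")
        directory = args[index + 1] if index + 1 < len(args) else ""
        if not Path(directory).is_dir():
            problems.append(f"--directory path does not exist: {directory!r}")
    for key in REQUIRED_ENV:
        if not entry.get("env", {}).get(key):
            problems.append(f"env value {key} is missing or empty")
    return problems


example_config = json.dumps({"mcpServers": {}})
print(preflight(example_config))
```

To run it against your own setup, read claude_desktop_config.json from the location given earlier and pass its contents to preflight.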

To call a tool, type a natural-language request such as “Convert this Word file to PDF and compress it.” The agent selects pdf_from_word and pdf_compress, and Claude Desktop shows an approval prompt with the exact tool name and arguments before each call runs; the tool’s JSON result then streams back into the conversation.

A Claude Desktop chat converting a Word file to PDF, showing the approval prompt for the pdf_from_word tool from foxit-pdf.

That per-call approval is your audit point, since it surfaces precisely which tool the agent chose and what it passed.

If you would rather run the server in VS Code, the equivalent entry goes in .vscode/mcp.json under a top-level servers key, with an added "type": "stdio" field so VS Code launches the server the same way:

{
  "servers": {
    "foxit-pdf": {
      "type": "stdio",
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/foxit-pdf-api-mcp-server",
        "run",
        "foxit-pdf-api-mcp-server"
      ],
      "env": {
        "FOXIT_CLOUD_API_HOST": "https://na1.fusion.foxit.com/pdf-services",
        "FOXIT_CLOUD_API_CLIENT_ID": "your_client_id",
        "FOXIT_CLOUD_API_CLIENT_SECRET": "your_client_secret"
      }
    }
  }
}

You can also run MCP: Add Server from the Command Palette (Cmd+Shift+P or Ctrl+Shift+P), choose Command (stdio), and pick Workspace to write the entry into .vscode/mcp.json or Global to store it in your user profile. Once saved, VS Code shows inline Start, Stop, and Restart actions above the server entry and lists it under the MCP SERVERS – INSTALLED view, where a green indicator and the discovered tool count confirm the connection.

PDF Services MCP Tools: Full Catalog

The 30+ tools organize into seven functional categories. Most tools expect a documentId returned by a prior upload_document call, and return a resultDocumentId you pass to download_document when you want the output locally. The exception is pdf_from_url, which accepts a URL directly.

Document Lifecycle

  • upload_document: upload a PDF, Office file, image, HTML file, or plain text file; returns a documentId for subsequent operations
  • download_document: retrieve a processed result to a local file path
  • delete_document: clean up stored files from cloud storage

PDF Creation (file to PDF)

  • pdf_from_word, pdf_from_excel, pdf_from_ppt: convert Office documents to PDF
  • pdf_from_text, pdf_from_image, pdf_from_html: convert plaintext, image files, or HTML to PDF
  • pdf_from_url: fetch a live URL and convert the rendered page to PDF

PDF Conversion (PDF to file)

  • pdf_to_word, pdf_to_excel, pdf_to_ppt: extract editable Office formats from a PDF
  • pdf_to_text, pdf_to_html, pdf_to_image: export text, HTML, or image representations

Manipulation

  • pdf_merge: combine multiple PDFs into one
  • pdf_split: split by page ranges, page count, or every page individually
  • pdf_extract: pull a subset of pages from a PDF
  • pdf_compress: reduce file size by 30-70% depending on content type
  • pdf_flatten: convert form fields and annotations to static content (required for compliance archiving workflows)
  • pdf_linearize: optimize for Fast Web View so browsers can stream PDF pages incrementally
  • pdf_watermark: apply text or image watermarks with configurable position, opacity, and rotation
  • pdf_manipulate: rotate, delete, or reorder pages

Analysis

  • pdf_compare: diff two PDFs and return a color-coded annotation document showing changes
  • pdf_ocr: convert scanned or image-based PDFs to searchable text with multi-language support
  • pdf_structural_analysis: detect document structure (titles, headings, paragraphs, tables with cell grids, images, form fields, hyperlinks, and metadata) with bounding boxes, following the Foxit PDF structural extraction engine schema. The result is JSON packaged inside a downloadable ZIP, not a return of named business entities; it reports layout and structure, and turning that into fields like party names is the job of the agent’s LLM, which performs the semantic extraction over that JSON

Security and Forms

  • pdf_protect: add password protection with 128-bit or 256-bit AES encryption and granular permission flags
  • pdf_remove_password: strip password protection from a document
  • export_pdf_form_data: extract form field values as JSON
  • import_pdf_form_data: populate form fields from a JSON payload

Properties

  • get_pdf_properties: return page count, page dimensions, PDF version, encryption status, digital signature info, embedded files, font inventory, and document metadata

The most-used operation in production document pipelines is pdf_from_word. Your agent uploads a DOCX file, gets back a documentId, then calls pdf_from_word with that ID. The underlying PDF Services API runs the conversion asynchronously, but the MCP Server handles polling internally and delivers the final result directly to your agent.

MCP tool call:

{
  "name": "pdf_from_word",
  "input": {
    "documentId": "doc_abc123"
  }
}

MCP tool response:

{
  "success": true,
  "taskId": "task_xyz789",
  "resultDocumentId": "doc_result456",
  "message": "Word document converted to PDF successfully. Download using documentId: doc_result456"
}

Pass doc_result456 to download_document to write the output PDF to disk, or feed it directly into another tool call like pdf_structural_analysis or pdf_compress as the next step in a chain.
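
The chaining described above comes down to feeding each resultDocumentId into the next call. The sketch below illustrates that hand-off with a stand-in client. call_tool is a hypothetical callable representing however your host or framework invokes MCP tools, and the sketch assumes every chained tool accepts a single documentId input and answers in the same shape as the pdf_from_word response shown above. Tools that take other inputs, such as pdf_merge, would not fit this helper as written.

```python
def chain_tools(call_tool, document_id, tool_names):
    """Run tools in sequence, passing each resultDocumentId to the next call."""
    current_id = document_id
    for name in tool_names:
        response = call_tool(name, {"documentId": current_id})
        if not response.get("success"):
            raise RuntimeError(f"{name} failed: {response.get('message')}")
        current_id = response["resultDocumentId"]
    return current_id


calls = []


def fake_call_tool(name, arguments):
    """Stand-in for a real MCP client; records calls and invents result IDs."""
    calls.append((name, arguments["documentId"]))
    return {"success": True, "resultDocumentId": f"{arguments['documentId']}>{name}"}


final_id = chain_tools(fake_call_tool, "doc_abc123", ["pdf_from_word", "pdf_compress"])
print(final_id)
```

In an interactive host the agent performs this hand-off itself; a helper like this is mainly useful when you script the same chain outside a chat session.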

Extending to eSign: Foxit’s Signing API as a Complementary REST Layer

After PDF processing via MCP tools, the next stage of the workflow dispatches a document for signature through Foxit’s eSign REST API, which lives at https://na1.foxitesign.foxit.com. This guide uses the na1 (US) region throughout.

Foxit also operates regional eSign hosts for the EU (eu1.foxitesign.foxit.com), Canada (na2.foxitesign.foxit.com), and Australia (au1.foxitesign.foxit.com). The endpoints and payloads are identical; only the host changes, so pick the host that matches your data residency requirements.
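
Because only the host changes between regions, the choice can live in one lookup. The sketch below is a minimal illustration using the four hosts listed above; the short region keys ("us", "eu", "ca", "au") and the helper name esign_url are labels invented for this example, not identifiers defined by Foxit.

```python
# Regional eSign hosts as listed in this guide. The short region keys are
# labels chosen for this sketch, not identifiers defined by Foxit.
ESIGN_HOSTS = {
    "us": "na1.foxitesign.foxit.com",
    "eu": "eu1.foxitesign.foxit.com",
    "ca": "na2.foxitesign.foxit.com",
    "au": "au1.foxitesign.foxit.com",
}


def esign_url(region: str, path: str) -> str:
    """Build a full eSign URL for a region; only the host varies by region."""
    try:
        host = ESIGN_HOSTS[region]
    except KeyError:
        raise ValueError(f"Unknown eSign region: {region!r}") from None
    return f"https://{host}/{path.lstrip('/')}"


token_url = esign_url("eu", "/api/oauth2/access_token")
print(token_url)
```

Routing every eSign call through one helper like this makes it harder to mix hosts by accident, for example fetching a token from one region and creating a folder in another.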

The eSign API is not part of the Foxit MCP Server, so it is not an MCP tool, and that distinction matters for how the agent reaches it. Most MCP hosts cannot make arbitrary HTTP calls on their own, so the agent does not reach eSign “through MCP.” Instead, the agent invokes eSign from its own code-execution layer, whether that is a code interpreter the host provides, an agent framework that runs Python, or a custom tool you register that wraps the eSign calls. The cleanest production pattern is to wrap the eSign operations you need as custom MCP tools so the host calls them the same way it calls the PDF tools; the production considerations section returns to this. The code below is what that layer runs.

Authentication uses OAuth2 client_credentials. The eSign token exchange is a distinct flow from the PDF Services header auth that backs your MCP tools:

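import os
import requests

# eSign credentials are issued separately from the PDF Services keys.
# The environment variable names here are placeholders; use your own.
ESIGN_CLIENT_ID = os.environ["ESIGN_CLIENT_ID"]
ESIGN_CLIENT_SECRET = os.environ["ESIGN_CLIENT_SECRET"]

resp = requests.post(
    "https://na1.foxitesign.foxit.com/api/oauth2/access_token",
    data={
        "client_id": ESIGN_CLIENT_ID,
        "client_secret": ESIGN_CLIENT_SECRET,
        "grant_type": "client_credentials",
        "scope": "read-write"
    },
    timeout=30,
)
resp.raise_for_status()  # surface auth failures instead of a KeyError below
access_token = resp.json()["access_token"]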

The Foxit eSign API developer guide uses “folder” terminology throughout. The key endpoints in an automated signing flow are:

  • POST /api/folders/createfolder: create a signing folder from one or more PDF documents, with signers, subject, and message
  • POST /api/folders/sendDraftFolder: dispatch a draft folder to its signers
  • POST /api/templates/createtemplate: save a reusable template from a PDF with pre-placed signature fields (instantiate a folder from it later via POST /api/templates/createFolder)
  • GET /api/folders/viewActivityHistory?folderId={id}: retrieve the activity audit trail for a folder once it has been sent (a draft that has never been shared returns an error)
  • Webhook channels for status callbacks: register a callback URL to receive real-time events when signers view, sign, or decline

The createfolder call takes the PDF output from your MCP pipeline, uploaded to eSign’s document storage after download_document retrieves it, and sets up the signing workflow:

POST /api/folders/createfolder
Authorization: Bearer {access_token}
Content-Type: application/json
{
  "folderName": "Acme Corp Contract - Q3 2025",
  "sendNow": false,
  "fileUrls": ["https://your-storage.example.com/acme_contract_final.pdf"],
  "fileNames": ["acme_contract_final.pdf"],
  "parties": [
    {
      "firstName": "John",
      "lastName": "Smith",
      "emailId": "[email protected]",
      "permission": "FILL_FIELDS_AND_SIGN",
      "sequence": 1
    }
  ]
}

Set sendNow to false to create a draft folder, then dispatch it with a separate call to /api/folders/sendDraftFolder. Alternatively, set sendNow to true to create and send in a single call. For files not accessible via URL, add "inputType": "base64" and pass the documents as a base64FileString array instead of fileUrls; omitting inputType makes the API reject the base64 payload as empty.
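
For the base64 route, the request body can be assembled before any network call is made. The sketch below is a minimal illustration that reuses the field names from the request above plus inputType and base64FileString; the helper name build_createfolder_payload, the placeholder PDF bytes, and the example.com address are assumptions made for this example.

```python
import base64


def build_createfolder_payload(folder_name, pdf_bytes, file_name, signer, send_now=False):
    """Return a createfolder request body that carries the PDF inline as base64."""
    return {
        "folderName": folder_name,
        "sendNow": send_now,
        # Without inputType, the API rejects the base64 payload as empty.
        "inputType": "base64",
        "base64FileString": [base64.b64encode(pdf_bytes).decode("ascii")],
        "fileNames": [file_name],
        "parties": [
            {
                "firstName": signer["firstName"],
                "lastName": signer["lastName"],
                "emailId": signer["emailId"],
                "permission": "FILL_FIELDS_AND_SIGN",
                "sequence": 1,
            }
        ],
    }


payload = build_createfolder_payload(
    "Acme Corp Contract - Q3 2025",
    b"%PDF-1.7 placeholder bytes",  # stand-in for the real document bytes
    "acme_contract_final.pdf",
    {"firstName": "John", "lastName": "Smith", "emailId": "signer@example.com"},
)
```

The resulting dictionary can be sent with requests.post(..., json=payload) along with the Bearer token header shown earlier.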

Foxit’s eSign API ships with HIPAA, eIDAS, ESIGN Act, UETA, 21 CFR Part 11, FERPA, and FINRA compliance built in. Audit trail records carry signer location, IP address, recipient identity, event timestamp, consent confirmation, security level, and complete folder history. For legal defensibility in regulated industries, capture and store these fields in your own data layer, because relying solely on Foxit’s folder history API for compliance record-keeping introduces a single point of failure in your audit chain.

End-to-End Workflow: AI Agent Automates a Sales Contract

Picture a sales ops agent that starts from a single natural language goal, “Generate a contract for Acme Corp, $48,000 ARR, and send it to [email protected] for signature.” Nothing about the tool sequence is hard-coded. The MCP Server advertises its PDF tools to the host on connection, so the agent can read the goal, recognize that it has a template to render and a document to route for signature, and decide which operations to call and in what order. The PDF steps run as MCP tool calls; the DocGen and eSign steps run from the agent’s code layer. The sequence below is one plausible run the agent might choose, not a fixed script you wire up in advance.

Sequence diagram showing an AI agent automating a sales contract across the Foxit MCP Server, DocGen REST API, and eSign REST API. The agent uploads a DOCX through the MCP server, converts it to PDF with pdf_from_word, runs pdf_structural_analysis, and downloads the resulting ZIP to read the structure and extract fields.

To get a PDF to work with, the agent first reaches for MCP tools. It calls upload_document with the DOCX contract template, receives documentId: "doc_abc", and calls pdf_from_word. The MCP Server handles the async conversion internally and returns resultDocumentId: "doc_pdf" once it completes.

Needing to know what is inside that PDF, the agent calls pdf_structural_analysis with documentId: "doc_pdf". The tool does not hand back named entities like “party” or “ARR.” It returns a resultDocumentId pointing to a ZIP archive, so the agent calls download_document to retrieve it, unzips it, and reads the structural JSON, which describes headings, paragraphs, and table cells with their positions. The agent’s LLM is what performs the semantic extraction: it reads the structural JSON and pulls “Acme Corp” out of a heading or a contract value out of a table cell, confirming the fields it needs are present. The tool hands back structure; the model turns that structure into meaning. If you want the API to return business entities directly rather than leaning on the model to interpret layout, that is the job of Foxit’s iDox.ai Document API, a separate service built for entity and PII extraction.

With the field values in hand, the agent generates the finished contract through the DocGen API, posting to /document-generation/api/GenerateDocumentBase64 with the values merged into the template via {{dynamic_tags}} syntax. DocGen is synchronous, so the call returns the finalized PDF in the response body, with Acme Corp’s name, the $48,000 ARR figure, and the correct dates populated. No polling step is involved.

Finally, the agent routes the document for signature. It authenticates against the eSign OAuth2 endpoint, uploads the DocGen output, creates a signing folder via /api/folders/createfolder with [email protected] as the signer, and dispatches it via /api/folders/sendDraftFolder.

What ties this together is that the model decides the order from the goal, not a script. The PDF steps resolve to MCP tool calls the host already knows about; the DocGen and eSign steps run through the agent’s code layer, since those APIs are not MCP tools. The agent chains the output of one step into the input of the next, and the only orchestration you maintain is whatever exposes that code layer to the model, ideally a set of custom tools rather than ad hoc scripting.

Production Considerations: Error Handling, Rate Limits, and Data Governance

When you call PDF Services through the MCP Server, async polling happens inside the server process. Your agent receives a final resultDocumentId only after the task completes. When you call the raw PDF Services REST API directly, every operation returns a taskId you poll manually. The pattern below applies exponential backoff with a ceiling of 10 seconds per interval and a 30-second total timeout:

import time, requests

API_HOST = "https://na1.fusion.foxit.com/pdf-services"
auth_headers = {
    "client_id": "your_client_id",
    "client_secret": "your_client_secret"
}

def poll_task(task_id: str, max_wait: int = 30) -> str:
    delay = 1
    elapsed = 0
    while elapsed < max_wait:
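        resp = requests.get(
            f"{API_HOST}/api/tasks/{task_id}",
            headers=auth_headers,
            timeout=30,
        )
        resp.raise_for_status()  # stop on HTTP errors instead of parsing an error body
        data = resp.json()
        if data["status"] == "COMPLETED":
            return data["resultDocumentId"]
        if data["status"] == "FAILED":
            # Assumes failures are reported as "FAILED"; a failed task never completes
            raise RuntimeError(f"Task {task_id} failed: {data}")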
        time.sleep(delay)
        elapsed += delay
        delay = min(delay * 2, 10)
    raise TimeoutError(f"Task {task_id} timed out after {max_wait}s")
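
To see what that backoff looks like in practice, the sleep intervals can be computed without calling the API. The sketch below mirrors the delay arithmetic of poll_task for a task that never completes; backoff_schedule is a helper name invented for this illustration.

```python
def backoff_schedule(max_wait: int = 30, first_delay: int = 1, ceiling: int = 10) -> list[int]:
    """Return the sleep intervals poll_task would use if the task never completes."""
    delays = []
    delay = first_delay
    elapsed = 0
    while elapsed < max_wait:
        delays.append(delay)
        elapsed += delay
        delay = min(delay * 2, ceiling)  # double each time, capped at the ceiling
    return delays


schedule = backoff_schedule()
print(schedule, sum(schedule))  # [1, 2, 4, 8, 10, 10] 35
```

The loop checks the elapsed time before each sleep, so the final interval can overshoot: with the defaults, the worst case is 35 seconds of sleeping rather than exactly 30, and time spent inside each HTTP request is not counted at all. Treat max_wait as an approximate budget.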

Because eSign and DocGen are not MCP tools, decide deliberately how the agent reaches them. Letting the model emit raw HTTP from a free-form code interpreter is brittle and hard to audit. The more durable pattern is to wrap the specific eSign and DocGen operations you use, such as create-folder, send-folder, and generate-document, as custom MCP tools with typed inputs. The host then calls them through the same protocol it uses for the PDF tools, the credentials stay in the tool process rather than in the prompt, and the agent’s choices become inspectable tool calls instead of opaque scripts.

The output of pdf_structural_analysis deserves its own caution. The structural JSON for a long contract can run to many thousands of elements, and feeding the entire file into the model can quietly blow past its context window, which tends to surface as a truncated or confused extraction rather than a clean error. Have the code that unzips the archive filter the JSON before the model sees it, keeping only the element types and pages that matter (for a contract, usually the heading blocks and the relevant table), rather than passing the whole document through.
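
A minimal sketch of that filtering step follows. The structural JSON schema is not reproduced in this article, so the key names used here ("elements", "type", "page") are hypothetical placeholders, as is the helper name load_filtered_elements; inspect a real pdf_structural_analysis result and adjust them before relying on this.

```python
import io
import json
import zipfile


def load_filtered_elements(zip_bytes, keep_types, keep_pages=None):
    """Read every .json file in the archive and keep only the matching elements.

    Assumed (hypothetical) schema: {"elements": [{"type": ..., "page": ...}, ...]}.
    Check the real output of pdf_structural_analysis and adjust the key names.
    """
    kept = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".json"):
                continue  # skip any non-JSON files in the archive
            doc = json.loads(archive.read(name))
            for element in doc.get("elements", []):
                if element.get("type") not in keep_types:
                    continue
                if keep_pages is not None and element.get("page") not in keep_pages:
                    continue
                kept.append(element)
    return kept
```

Calling load_filtered_elements(zip_bytes, {"heading", "table"}, {1, 2}) would hand the model only headings and tables from the first two pages, which keeps the prompt small and makes it easier to notice when an expected element is missing.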

The free developer plan at developer-api.foxit.com covers development and testing volumes. Production workloads above the free-tier threshold require a volume plan requested through the Developer Portal.

For data governance, all API traffic runs over TLS 1.2+, and documents at rest use AES-256 encryption. Foxit’s API security documentation covers SOC 2 Type II audit status, HIPAA BAA support, GDPR, CCPA, eIDAS, ESIGN Act, UETA, 21 CFR Part 11, FERPA, and FINRA requirements. Customer data runs in logically segmented environments. For healthcare, legal, or financial services pipelines, confirm your data residency requirements before connecting production document flows, then choose the matching regional eSign host noted earlier, since the host you call determines where data is processed.

PDF API MCP Server FAQs

The Foxit PDF API MCP Server is an open-source Model Context Protocol server that exposes Foxit’s cloud PDF Services API as 30+ callable tools. Any MCP-compatible AI agent host, including Claude Desktop, VS Code with GitHub Copilot, and Cursor, can invoke these tools directly.

The server supports conversion (Word, Excel, PowerPoint, image, HTML, and URL to PDF and back), OCR, merge, split, extract, compress, flatten, linearize, watermark, compare, form data import/export, password protection, and full document property inspection across seven functional tool categories.

PDF Services tools authenticate via a client_id and client_secret set as environment variables before the MCP host launches. The eSign API uses a separate OAuth2 client_credentials token exchange against https://na1.foxitesign.foxit.com/api/oauth2/access_token. The two credential scopes are isolated by design.

Yes. The server registers using a standard mcp.json config block for VS Code with GitHub Copilot or a claude_desktop_config.json block for Claude Desktop. The same config structure works for Cursor. All three hosts discover the server’s tools automatically on connection.

The Foxit developer account is free with no credit card required and covers development and testing volumes. Production workloads above the free-tier threshold require a volume plan through the Developer Portal.

Run Your First Tool Call Now

Getting a working MCP tool call takes under 15 minutes:

  1. Create a free developer account at developer-api.foxit.com (no credit card, instant access). Copy your client_id and client_secret from the dashboard.

  2. Set the three environment variables:

export FOXIT_CLOUD_API_HOST="https://na1.fusion.foxit.com/pdf-services"
export FOXIT_CLOUD_API_CLIENT_ID="your_client_id"
export FOXIT_CLOUD_API_CLIENT_SECRET="your_client_secret"

  3. Clone the repo, register it using the config block from the Prerequisites section, restart your MCP host, and invoke pdf_from_url with any public URL. You’ll have a confirmed PDF output in your working directory. The Developer Portal also includes a live API Playground for validating request payloads against the PDF Services API before wiring them into an agent.

For a full signing workflow, the minimum viable addition to the MCP setup is authenticating against the eSign OAuth2 endpoint and posting to /api/folders/createfolder with a static PDF. DocGen field population, pdf_structural_analysis extraction, and webhook callbacks extend the same pattern incrementally from there.

Get your free API access at developer-api.foxit.com.

Automate Dynamic PDF Generation with the Foxit DocGen API: Word Templates, JSON Data, and Real API Calls

Foxit DocGen API workflow showing a Word template with data tags being converted into a PDF document using JSON data.

Skip the HTML-to-PDF headaches. Use Foxit’s DocGen API to turn Word templates and JSON data into clean, formatted PDFs with one API call.

If you’ve tried to generate a contract or invoice from HTML, you’ve probably burned hours on page-break-inside: avoid declarations that Chrome renders one way and a headless browser renders another. Headers and footers require separate print-media queries, and by the time you’ve got a repeating table header working correctly across pages, you’ve invested a full day of engineering into CSS that exists solely to trick a browser into behaving like a printer.

HTML documents reflow content into a viewport while PDF documents have fixed page geometry. Forcing one model into the other produces predictable failure modes: footnotes that collide with page footers, tables that split at the worst possible row, custom fonts that substitute silently, and signature blocks that drift off-page on longer documents.

There’s a larger practical cost too. For most teams, the authoritative source for enterprise document templates is already a Word file. Your legal team owns the NDA in .docx format. Finance owns the invoice in .docx format. Every structural change flows through Word because that’s where the tracked changes, formatting history, and review process live. Maintaining a parallel HTML version of each template doubles your maintenance surface from day one.

Foxit’s DocGen API eliminates that parallel entirely. You keep your templates as .docx files, embed data tags directly in Word, POST the base64-encoded template and a JSON payload to a single REST endpoint, and receive the rendered PDF (or DOCX) in the response body. You eliminate the browser rendering engine, the print-media CSS layer, and the overhead of a second template format.

How the Foxit DocGen API Works

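The core model is a single synchronous POST to the GenerateDocumentBase64 endpoint, which the examples in this article call at https://na1.fusion.foxit.com/document-generation/api/GenerateDocumentBase64 using credentials from developer-api.foxit.com. Your request body carries three fields: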

  • base64FileString: your .docx template, base64-encoded
  • documentValues: a JSON object containing your merge data
  • outputFormat: either "pdf" or "docx"

The API processes the template, resolves every tag against your data, and returns a JSON response containing base64FileString (the rendered document) and a message field confirming success or describing a failure. The exchange is fully synchronous, so you receive the finished document in the same HTTP response with no job ID to poll and no webhook to configure.

Authentication uses two HTTP headers: client_id and client_secret. Both come from the Foxit Developer Portal when you create an account. The free Developer plan provides 500 credits per year with no credit card required, and each GenerateDocumentBase64 call consumes exactly one credit. The Startup plan ($1,750/year) provides 3,500 credits. The Business plan ($4,500/year) covers 150,000 credits for production workloads. For context, Nutrient’s API starts at $75 for 1,000 credits, and Apryse requires a sales conversation before you can access pricing at all.

The complete call flow runs from template file to PDF on disk.

Sequence diagram showing the Foxit DocGen API workflow from reading a Word template and encoding it to base64, sending the POST request, and receiving the rendered PDF response.

You can explore every endpoint in the live API playground at developer-api.foxit.com, and the portal includes a Postman collection you can import to run authenticated requests without writing a line of code first.

Build a Word Template with DocGen Tags

Open any .docx file in Microsoft Word and type your tags as plain text directly in the document. The DocGen API uses double-brace syntax: {{field_name}}. Tags go anywhere Word accepts text: headings, body paragraphs, table cells, headers, footers, or text boxes.

Scalar field tags resolve directly to the matching key from your documentValues JSON. A document header with {{customer_name}}, {{invoice_number}}, and {{invoice_date}} pulls those three values straight from the top-level keys of your payload.

For arrays, you wrap a single table row (the data row, not the header row) with {{TableStart:array_name}} and {{TableEnd:array_name}} markers. The wrapped row acts as a template row, and the API renders one output row per item in the JSON array. An invoice line-items table in Word looks like this:

Description | Qty | Unit Price | Total
{{TableStart:line_items}}{{description}} | {{qty}} | {{unit_price}} | {{total}}{{TableEnd:line_items}}

Within the array row, a {{ROW_NUMBER}} tag auto-increments with each rendered row. A SUM(ABOVE) field placed in the row directly below the {{TableEnd:line_items}} marker calculates a column total across all rendered data rows.

For nested JSON objects, use dot-notation in your tags. A shipping address block references {{shipping.street}}, {{shipping.city}}, and {{shipping.postal_code}}, mapping to properties nested inside a shipping object in your payload. The nesting can go multiple levels deep, so {{customer.address.city}} resolves against documentValues.customer.address.city.
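
The lookup that dot-notation implies can be sketched in a few lines of Python. This illustrates the semantics only and is not DocGen’s actual implementation; resolve_tag is a name invented for the example, and returning an empty string for a missing key mirrors the blank-field behavior this article describes for unmatched tags.

```python
def resolve_tag(path: str, values: dict):
    """Walk a dot-notation path such as "customer.address.city" through nested dicts.

    Returns "" when any segment is missing, mirroring the empty-string
    rendering described for unmatched tags.
    """
    current = values
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return ""
        current = current[part]
    return current


document_values = {
    "customer": {"address": {"city": "Portland", "postal_code": "97201"}},
    "shipping": {"street": "742 Evergreen Terrace"},
}
print(resolve_tag("customer.address.city", document_values))  # Portland
```

Running a helper like this over your tag list before calling the API is a cheap way to confirm that the payload shape matches the template.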

For a working starting point, grab the downloadable invoice template from the foxit-demo-templates repo. The file is well under the 4 MB upload limit and demonstrates every pattern this article uses: scalar tags, {{TableStart:line_items}} / {{TableEnd:line_items}} with {{ROW_NUMBER}}, currency and date format switches, and subtotal / tax / total fields below the line-items table.

One sizing constraint applies while you build your own template. DocGen rejects uploads larger than 4 MB, so if you embed product photos, scanned letterhead, or full font subsets, compress the images before saving, drop embedded fonts where you can rely on system fonts, or split a large template into smaller per-section templates that you generate and merge separately.
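
A size check before encoding avoids spending a request on a template the API will reject. This sketch assumes the 4 MB limit applies to the raw .docx file and is measured as 4 × 1024 × 1024 bytes; neither detail is confirmed here, so adjust the constant if the API behaves differently. The name check_template_size is hypothetical.

```python
import os

# Assumption: "4 MB" means 4 MiB of raw .docx, measured before base64 encoding.
MAX_TEMPLATE_BYTES = 4 * 1024 * 1024


def check_template_size(path: str, limit: int = MAX_TEMPLATE_BYTES) -> int:
    """Return the file size in bytes, or raise ValueError if it exceeds the limit."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(
            f"{path} is {size} bytes, over the {limit}-byte limit; "
            "compress images, drop embedded fonts, or split the template"
        )
    return size
```

Call check_template_size("invoice_template.docx") just before reading and encoding the file so an oversize template fails locally with an actionable message.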

Make Your First API Call: Generate a PDF from JSON

Run a quick pre-flight check before the first call to catch the issues that derail most clean-account run-throughs:

  • Account created and client_id / client_secret copied from the Developer Portal API Keys section
  • Sample template saved locally as invoice_template.docx in the directory you’ll run the script from
  • Template file size confirmed under 4 MB (ls -lh invoice_template.docx on macOS or Linux, right-click → Properties on Windows)

With those in place, confirm your credentials work with a cURL call. The Foxit Developer Portal includes a Postman collection for this, but a quick cURL request against the API catches auth issues before any code runs:

curl -X POST "https://na1.fusion.foxit.com/document-generation/api/GenerateDocumentBase64" \
  -H "client_id: YOUR_CLIENT_ID" \
  -H "client_secret: YOUR_CLIENT_SECRET" \
  -H "Content-Type: application/json" \
  -d '{"base64FileString":"","documentValues":{},"outputFormat":"pdf"}'

A 401 here means invalid credentials. A 400 with a message about the template confirms your headers are accepted and you can proceed to the full call.

Save your .docx template as invoice_template.docx in the same directory as this script, then run the complete generation:

import requests
import base64

CLIENT_ID = "your_client_id"
CLIENT_SECRET = "your_client_secret"
API_URL = "https://na1.fusion.foxit.com/document-generation/api/GenerateDocumentBase64"

# Read and encode the template
with open("invoice_template.docx", "rb") as f:
    template_b64 = base64.b64encode(f.read()).decode("utf-8")

# Build the data payload
document_values = {
    "customer_name": "Acme Corporation",
    "invoice_number": "INV-2025-0042",
    "invoice_date": "07/15/2025",
    "due_date": "08/14/2025",
    "line_items": [
        {
            "description": "API Integration Consulting",
            "qty": 8,
            "unit_price": 195.00,
            "total": 1560.00
        },
        {
            "description": "Document Automation Setup",
            "qty": 1,
            "unit_price": 750.00,
            "total": 750.00
        }
    ],
    "subtotal": 2310.00,
    "tax_rate": 0.08,
    "tax_amount": 184.80,
    "total_due": 2494.80
}

# Construct the request body
payload = {
    "base64FileString": template_b64,
    "documentValues": document_values,
    "outputFormat": "pdf"
}

headers = {
    "client_id": CLIENT_ID,
    "client_secret": CLIENT_SECRET,
    "Content-Type": "application/json"
}

response = requests.post(API_URL, json=payload, headers=headers)

if response.status_code == 200:
    result = response.json()
    pdf_bytes = base64.b64decode(result["base64FileString"])
    if pdf_bytes[:5] != b"%PDF-":
        raise ValueError("Response did not contain a valid PDF")
    with open("invoice_output.pdf", "wb") as out:
        out.write(pdf_bytes)
    print("PDF written to invoice_output.pdf")
else:
    print(f"Error {response.status_code}: {response.json().get('message')}")

The success response is a JSON object with three keys: base64FileString (the rendered PDF, base64-encoded), fileExtension ("pdf"), and message ("PDF Document Generated Successfully"). Decoding and writing the bytes to disk gives you a complete, formatted PDF with every tag replaced by its corresponding data value. If you omit a key from documentValues, the API renders the corresponding tag as an empty string, producing a blank field in the output.
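
Since a missing key produces a blank field rather than an error, it can help to check the payload against the tags you expect before sending it. The sketch below is one possible pre-flight check; the list of required paths is something you maintain by hand alongside the template, and find_missing_keys is a name invented for this example.

```python
def find_missing_keys(required_paths, values):
    """Return the dot-notation paths that are absent from the payload."""
    missing = []
    for path in required_paths:
        current = values
        for part in path.split("."):
            if isinstance(current, dict) and part in current:
                current = current[part]
            else:
                missing.append(path)
                break
    return missing


required = ["customer_name", "invoice_number", "invoice_date", "due_date", "total_due"]
payload_values = {"customer_name": "Acme Corporation", "invoice_number": "INV-2025-0042"}
print(find_missing_keys(required, payload_values))
# ['invoice_date', 'due_date', 'total_due']
```

Failing the job when this list is non-empty turns a silent blank field into an explicit error before a credit is spent.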

Advanced Data Scenarios: Arrays, Nested Objects, and Built-In Functions

The two-row invoice above works, but most production documents have more complex data shapes. Three patterns cover the majority of real-world cases.

For multi-row tables, the line_items array in the Python snippet above already shows the basic structure. To generate five rows, pass five objects in the array. The Word template row tagged with {{TableStart:line_items}} and {{TableEnd:line_items}} repeats exactly once per array item:

{
  "line_items": [
    {
      "description": "UX Design Review",
      "qty": 4,
      "unit_price": 150.0,
      "total": 600.0
    },
    {
      "description": "Backend API Development",
      "qty": 12,
      "unit_price": 185.0,
      "total": 2220.0
    },
    {
      "description": "Database Schema Migration",
      "qty": 3,
      "unit_price": 200.0,
      "total": 600.0
    },
    {
      "description": "QA Testing",
      "qty": 6,
      "unit_price": 95.0,
      "total": 570.0
    },
    {
      "description": "Deployment and Documentation",
      "qty": 2,
      "unit_price": 175.0,
      "total": 350.0
    }
  ]
}

The API generates exactly five table rows. Swap in 50 items and you get 50 rows, with page breaks handled by Word’s native pagination logic.
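
The payloads in this article carry a precomputed total for each row, so in practice the array is usually built from raw order data. A minimal sketch, assuming each source row is a (description, qty, unit_price) tuple; build_line_items is a hypothetical helper name.

```python
def build_line_items(rows):
    """Turn (description, qty, unit_price) tuples into DocGen line_items dicts."""
    return [
        {
            "description": description,
            "qty": qty,
            "unit_price": unit_price,
            "total": round(qty * unit_price, 2),  # per-row total, rounded to cents
        }
        for description, qty, unit_price in rows
    ]


rows = [
    ("UX Design Review", 4, 150.0),
    ("Backend API Development", 12, 185.0),
    ("Database Schema Migration", 3, 200.0),
    ("QA Testing", 6, 95.0),
    ("Deployment and Documentation", 2, 175.0),
]
line_items = build_line_items(rows)
subtotal = round(sum(item["total"] for item in line_items), 2)
print(len(line_items), subtotal)  # 5 4340.0
```

For real invoicing, decimal.Decimal is a safer choice than float for money; floats are used here only to match the JSON examples above.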

For nested objects, the DocGen API resolves dot-notation paths against the full depth of your JSON structure. A shipping confirmation template referencing {{customer.address.city}} works against this payload without any flattening on your end:

{
  "customer": {
    "name": "Sarah Chen",
    "email": "[email protected]",
    "address": {
      "street": "742 Evergreen Terrace",
      "city": "Portland",
      "state": "OR",
      "postal_code": "97201"
    }
  }
}

In the Word template, {{customer.name}}, {{customer.address.city}}, and {{customer.address.postal_code}} each resolve to the correct nested value. You can reference the same nested object from multiple locations in the template, and the API populates each instance independently.

For numeric and date formatting, the DocGen API respects Word’s native field switch syntax. Adding \# Currency to a tag formats a numeric value as a currency string, so {{unit_price \# Currency}} renders 195.00 as $195.00. Date fields accept \@ "MM/dd/yyyy" to control output format, so {{invoice_date \@ "MM/dd/yyyy"}} formats an ISO date string to 07/15/2025. To auto-calculate a column total, place a SUM(ABOVE) field in the Word table row immediately below {{TableEnd:line_items}} and the API evaluates it against the rendered data rows.

Error Handling and Production Readiness

The DocGen API returns a focused set of HTTP status codes. A 200 confirms successful generation. A 401 means your client_id or client_secret headers are invalid, and the fix is to re-copy the credentials from the Developer Portal. A 400 covers three cases. The first is a malformed request body, for example a missing base64FileString or outputFormat. The second is structural issues with the template itself, such as a {{TableStart}} marker placed outside its table row. The third is an oversize template; DocGen rejects .docx uploads larger than 4 MB, and the fix is to compress embedded images, drop embedded fonts, or split the template before re-encoding. The message field in every non-200 response body gives you the specific reason, so log it rather than discarding the response object.

A production wrapper handles all three cases and adds exponential backoff for transient server errors:

import requests
import base64
import time

def generate_document(client_id, client_secret, template_path,
                      document_values, output_format="pdf"):
    API_URL = "https://na1.fusion.foxit.com/document-generation/api/GenerateDocumentBase64"

    with open(template_path, "rb") as f:
        template_b64 = base64.b64encode(f.read()).decode("utf-8")

    payload = {
        "base64FileString": template_b64,
        "documentValues": document_values,
        "outputFormat": output_format
    }
    headers = {
        "client_id": client_id,
        "client_secret": client_secret,
        "Content-Type": "application/json"
    }

    max_retries = 3
    for attempt in range(max_retries):
        try:
            response = requests.post(API_URL, json=payload,
                                     headers=headers, timeout=30)

            if response.status_code == 200:
                return base64.b64decode(response.json()["base64FileString"])

            if response.status_code == 401:
                raise ValueError("Authentication failed: re-check client_id and client_secret")

            # Any other 4xx is a client-side problem; retrying will not help
            if 400 <= response.status_code < 500:
                msg = response.json().get("message", "Bad request")
                raise ValueError(f"Request error ({response.status_code}): {msg}")

            if response.status_code >= 500:
                if attempt < max_retries - 1:
                    wait = 2 ** attempt
                    print(f"Server error ({response.status_code}), retrying in {wait}s...")
                    time.sleep(wait)
                    continue
                raise RuntimeError(f"Server error after {max_retries} attempts")

        except requests.exceptions.Timeout:
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)
                continue
            raise

    raise RuntimeError("Max retries exceeded")

The wrapper raises immediately on 4xx responses because retrying a credential error or a malformed request produces the same result. Exponential backoff applies only to 5xx responses and timeouts, where the issue is transient.

Once generate_document() returns raw PDF bytes, routing them downstream takes three lines:

import boto3

s3 = boto3.client("s3")
pdf_bytes = generate_document(CLIENT_ID, CLIENT_SECRET, "invoice_template.docx", document_values)
s3.put_object(Bucket="my-documents-bucket", Key="invoices/INV-2025-0042.pdf", Body=pdf_bytes)

To attach the output to an email, add pdf_bytes as an attachment on an email.message.EmailMessage and send it with smtplib. To collect a signature on the generated document, base64-encode the bytes and POST them to Foxit’s eSign API with the signer’s email address in the request body. The full eSign API reference is at docs.developer-api.foxit.com.
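
For the email route, Python’s standard library covers the attachment step. The sketch below builds the message without sending it; the addresses, subject line, and SMTP host are placeholders, and build_invoice_email is a name invented for this example.

```python
from email.message import EmailMessage


def build_invoice_email(pdf_bytes, sender, recipient, filename="invoice.pdf"):
    """Return an EmailMessage with the PDF attached; sending is left to the caller."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Your invoice"
    msg.set_content("The invoice is attached as a PDF.")
    # Raw bytes need an explicit MIME type so mail clients treat the part as a PDF
    msg.add_attachment(pdf_bytes, maintype="application", subtype="pdf", filename=filename)
    return msg


message = build_invoice_email(
    b"%PDF-1.7 placeholder",  # stand-in for the bytes from generate_document()
    "billing@example.com",
    "customer@example.com",
)
# To send (hypothetical host): smtplib.SMTP("smtp.example.com").send_message(message)
```

Keeping message construction separate from delivery makes the attachment logic easy to test without an SMTP server.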

Common Mistakes

A short list of the issues that account for almost every failed first run.

  • Smart-quote autocorrect on braces. Word’s AutoCorrect can convert the second { of {{ into a curly-quote glyph, which breaks tag parsing silently. Disable “Straight quotes with smart quotes” under AutoCorrect Options, or paste tags as plain text.
  • Token case sensitivity. {{Customer_Name}} and {{customer_name}} are different keys. Match the casing in your JSON exactly.
  • TableStart and TableEnd must sit in the same Word table row. Splitting them across two rows, or placing either marker outside the table, leaves the loop unrendered with no error.
  • Template over 4 MB. The API rejects oversize uploads with a 400. Compress embedded images, drop embedded fonts where system fonts will do, or split the template into smaller pieces.
  • Missing payload key. The API renders an unmatched tag as an empty string rather than failing, so a 200 response does not guarantee every field is populated. Spot-check the rendered PDF as part of any pipeline test.
  • Auth header typos. Headers are client_id and client_secret in snake_case. Client-Id, ClientId, or X-Client-Id all return 401.

Run the Full Invoice Example End-to-End Right Now

Create a free account directly at account.foxit.com/site/sign-up. This skips the pricing-page redirect you hit from the marketing site and drops you straight into the account form.

  1. Open account.foxit.com/site/sign-up and complete the form (no credit card required).
  2. After verification, sign in to the Developer Portal and the Developer plan (500 credits per year) is active by default.
  3. Open the API Keys section and copy your client_id and client_secret.

With credentials in hand, run the example end-to-end:

  1. Download invoice_full.docx from the foxit-demo-templates repo and save it locally as invoice_template.docx in your working directory. The file is well under the 4 MB upload limit and exercises every tag pattern this article covers.
  2. Paste your credentials into the CLIENT_ID and CLIENT_SECRET variables in the Python script from the previous section.
  3. Edit the document_values dictionary with your own customer name, invoice number, and line items.
  4. Run the script and open invoice_output.pdf.

The free Developer plan’s 500 annual credits cover this tutorial dozens of times over before you spend anything. The full API reference at docs.developer-api.foxit.com covers every endpoint parameter, the complete tag specification, all supported output formats, and the full GenerateDocumentBase64 request and response schema.

Get started with a free account (no credit card required) and generate your first dynamic PDF in under 10 minutes.

Generate Dynamic PDFs from JSON using Foxit APIs

See how easy it is to generate PDFs from JSON using Foxit’s Document Generation API. With Word as your template engine, you can dynamically build invoices, offer letters, and agreements—no complex setup required. This tutorial walks through the full process in Python and highlights the flexibility of token-based document creation.

One of the more fascinating APIs in our library is the Document Generation API. It lets you create dynamic PDFs or Word documents by merging your own data into templates. That may sound simple – and the code you’re about to see is indeed simple – but the real power lies in how flexible Word can be as a template engine. This API could be used for invoices, offer letters, agreements, and similar documents.

All of this is made available via a simple API and a “token language” you’ll use within Word to create your templates. Whether you’re feeding in data from a database, a form submission, or a JSON API response, the process looks the same from your Python script. Let’s take a look at how this is done.

Credentials

Before we go any further, head over to our developer portal and grab a set of free credentials. This will include a client ID and secret values – you’ll need both to make use of the API.

Using the API

The Document Generation API flow is a bit different from our PDF Services APIs in that the execution is synchronous. You don’t need to upload your document beforehand or download a result. You simply call the API (passing your data and template) and the result has your new PDF (or Word document). With it being this simple, let’s get into the code.

Loading Credentials

My script begins by loading in the credentials and API root host via the environment:

import os

CLIENT_ID = os.environ.get('CLIENT_ID')
CLIENT_SECRET = os.environ.get('CLIENT_SECRET')
HOST = os.environ.get('HOST')

As always, try to avoid hard coding credentials directly into your code.
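Because os.environ.get returns None for a missing variable, a typo in a variable name only shows up later as a confusing request error. A minimal sketch of a fail-fast loader follows; the load_settings helper and REQUIRED_VARS tuple are hypothetical names for illustration, not part of the API or the original script.

```python
import os

# Names of the environment variables the script above reads.
REQUIRED_VARS = ("CLIENT_ID", "CLIENT_SECRET", "HOST")


def load_settings(env=os.environ):
    """Return the required settings, or raise naming every missing variable.

    Hypothetical helper: `env` is any mapping, which makes it easy to test.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling load_settings() once at startup gives a single, readable error instead of a failed HTTP call further down.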

Calling the API

The endpoint only requires you to pass the output format, your data, and a base64 version of your file. “Your data” can be almost anything you like—though it should start as an object (i.e., a dictionary in Python with key/value pairs). Beneath that, anything goes: strings, numbers, arrays of objects, and so on.

Here’s a Python wrapper showing this in action:

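import requests

def docGen(doc, data, id, secret):
    # Credentials travel as snake_case request headers.
    headers = {
        "client_id":id,
        "client_secret":secret
    }

    # outputFormat selects the result type; documentValues is the data merged
    # into the template; base64FileString is the base64-encoded Word template.
    body = {
        "outputFormat":"pdf",
        "documentValues": data,
        "base64FileString":doc
    }

    # The call is synchronous: the JSON response already contains the generated file.
    response = requests.post(f"{HOST}/document-generation/api/GenerateDocumentBase64", json=body, headers=headers)
    return response.json()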

And here’s an example calling it:

import base64

# Read the Word template and base64-encode it for the request body.
with open('../../inputfiles/docgen_sample.docx', 'rb') as file:
    bd = file.read()
    b64 = base64.b64encode(bd).decode('utf-8')

data = {
    "name":"Raymond Camden", 
    "food": "sushi",
    "favoriteMovie": "Star Wars",
    "cats": [
        {"name":"Elise", "gender":"female", "age":14 },
        {"name":"Luna", "gender":"female", "age":13 },
        {"name":"Crackers", "gender":"male", "age":13 },
        {"name":"Gracie", "gender":"female", "age":12 },
        {"name":"Pig", "gender":"female", "age":10 },
        {"name":"Zelda", "gender":"female", "age":2 },
        {"name":"Wednesday", "gender":"female", "age":1 },
    ],
}

result = docGen(b64, data, CLIENT_ID, CLIENT_SECRET)

You’ll note here that my data is hard-coded. In a real application, this would typically be dynamic—read from the file system, queried from a database, or sourced from any other location.

The result object contains a message representing the success or failure of the operation, the file extension for the result, and the base64 representation of the result. To turn that base64 string back into a file, decode it first:

b64_bytes = result["base64FileString"].encode('ascii')
binary_data = base64.b64decode(b64_bytes)

Most likely you’ll always be outputting PDFs, so here’s a simple bit of code that stores the result:

with open('../../output/docgen_sample.pdf', 'wb') as file:
    file.write(binary_data)
    print('Done and stored to ../../output/docgen_sample.pdf')
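If you also generate Word output, the decode and store steps can be folded into one helper that picks the file extension from the response. This is a sketch under assumptions: base64FileString is the key used above, while the fileExtension and message key names are my guesses at how the extension and status message are labeled, and save_result is a hypothetical helper.

```python
import base64
from pathlib import Path


def save_result(result, out_dir=".", stem="docgen_output"):
    """Decode a Document Generation response and write it to disk.

    Assumes `result` carries "base64FileString" (used in the article) and,
    optionally, "fileExtension" and "message" (key names assumed here).
    """
    encoded = result.get("base64FileString")
    if not encoded:
        # Surface whatever message the API returned instead of failing later.
        raise ValueError(f"No file in response: {result.get('message', 'no message')}")
    # Fall back to pdf when the response does not name an extension.
    extension = str(result.get("fileExtension", "pdf")).lstrip(".")
    target = Path(out_dir) / f"{stem}.{extension}"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(base64.b64decode(encoded))
    return target
```

With the earlier example, save_result(result, "../../output", "docgen_sample") would write the file and return its path.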

There’s a bit more to the API than I’ve shown here so be sure to check the docs, but now it’s time for the real star of this API, Word.

Using Word as a Template

I’ve probably used Microsoft Word for longer than you’ve been alive and I’ve never really thought much about it. But when you begin to think of a simple Word document as a template, all of a sudden the possibilities begin to excite you. In our Document Generation API, the template system works via simple “tokens” in your document marked by opening and closing double brackets.

Consider this block of text:

See how name is surrounded by double brackets? And food and favoriteMovie? When this template is sent to the API along with the corresponding values, those tokens are replaced dynamically. In the screenshot, notice how favoriteMovie is bolded. That’s fine. You can use any formatting, styling, or layout options you wish.

That’s one example, but you also get some built-in values as well. For example, including today as a token will insert the current date, and can be paired with date formatting to specify how the date looks:

Remember the array of cats from earlier? You can use that to create a table in Word like this:

Notice that I’ve used two new tags here, TableStart and TableEnd, both of which reference the array, cats. Then in my table cells, I refer to the values from that array. Again, the color you see here is completely arbitrary and was me making use of the entirety of my Word design skills.
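When a template grows, it helps to know which token names your data can actually satisfy before you start typing them into Word. The sketch below lists the plain tokens and the table (array) tokens for a data object. It makes no assumption about the token syntax itself, and token_inventory is a hypothetical helper, not something the API provides.

```python
def token_inventory(data):
    """Split a documentValues dict into plain tokens and table (array) tokens."""
    plain, tables = [], {}
    for key, value in data.items():
        if isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            # Collect every field name that appears in any row of the array.
            tables[key] = sorted({field for row in value for field in row})
        else:
            plain.append(key)
    return sorted(plain), tables


sample = {
    "name": "Raymond Camden",
    "food": "sushi",
    "favoriteMovie": "Star Wars",
    "cats": [
        {"name": "Elise", "gender": "female", "age": 14},
        {"name": "Luna", "gender": "female", "age": 13},
    ],
}
plain, tables = token_inventory(sample)
print(plain)   # ['favoriteMovie', 'food', 'name']
print(tables)  # {'cats': ['age', 'gender', 'name']}
```

Anything in the first list can be used as a standalone token; anything in the second is an array you can wrap in TableStart and TableEnd, with the listed fields available inside the table row.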

Here’s the template as a whole to show you everything in context:

The Result

Given the code shown above with those values, and given the Word template just shared, once passed to the API, the following PDF is created:

What About Converting PDF to JSON?

So far we’ve been going one direction: JSON data in, PDF out. But what if you need to go the other way—extract structured content from a PDF and work with it in your application?

Foxit’s PDF Services API includes an Extract endpoint that handles exactly this. You upload a PDF, specify whether you want TEXT, IMAGE, or PAGE-level data, and the API returns the extracted content. The text output is particularly useful if you want to feed the result into a data pipeline, search index, or AI workflow.

Here’s a quick look at how extraction works in Python. First, upload your PDF:

def uploadDoc(path, id, secret):
    headers = {
        "client_id":id,
        "client_secret":secret
    }
    # Send the PDF as multipart form data under the "file" field.
    with open(path, 'rb') as f:
        files = {'file': (path, f)}
        response = requests.post(f"{HOST}/pdf-services/api/documents/upload", files=files, headers=headers)
    # The JSON response identifies the uploaded document for later calls.
    return response.json()

doc = uploadDoc("../../inputfiles/input.pdf", CLIENT_ID, CLIENT_SECRET)

Then call the Extract endpoint with the document ID and the type of content you want. The result comes back in a structured format you can parse, store, or pass along to other tools—including an LLM if you’re building an AI document pipeline.
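As a rough sketch, the request body for that call can be built and validated before anything goes over the wire. The three content types come from the description above; the documentId and extractType field names are assumptions on my part, and build_extract_body is a hypothetical helper, so check the API reference for the exact schema and endpoint path.

```python
# Content types named in the article.
EXTRACT_TYPES = {"TEXT", "IMAGE", "PAGE"}


def build_extract_body(document_id, extract_type="TEXT"):
    """Build the JSON body for an extract request.

    The field names "documentId" and "extractType" are assumed, not confirmed.
    """
    if not document_id:
        raise ValueError("document_id is required (taken from the upload response)")
    kind = extract_type.upper()
    if kind not in EXTRACT_TYPES:
        raise ValueError(f"extract_type must be one of {sorted(EXTRACT_TYPES)}")
    return {"documentId": document_id, "extractType": kind}
```

The resulting dictionary would be passed as json= to requests.post with the same client_id and client_secret headers used for the upload.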

You can read a full walkthrough in our PDF text extraction guide.

Ready to Try?

If this looks cool, be sure to check the docs for more information about the template language and API. Sign up for some free developer credentials and reach out on our developer forums with any questions.

If you’re building AI agents or LLM-powered workflows, Foxit also offers an MCP server that lets you connect your agents directly to Foxit PDF Services—so your AI tools can generate, extract, and process documents without any custom glue code.

Want the code? Get it on GitHub (Python).

If you are more of a Node person, check out that version. Get it on GitHub (Node.js).

Building Auditable, AI-Driven Document Workflows with Foxit APIs

We had an incredible time at API World 2025 connecting with developers, sharing ideas, and seeing how Foxit APIs power everything from AI-driven resume builders to interactive doodle apps. In this post, we’ll walk through the same hands-on workflow Jorge Euceda demoed live on stage—showing how to build an auditable, AI-powered document automation system using Foxit PDF Services and Document Generation APIs.

This year’s API World was packed with energy—and it was amazing meeting so many developers face-to-face at the Foxit booth. We spent three days trading ideas about document automation, AI workflows, and integration challenges.

Our team hosted a hands-on workshop and sponsored the API World Hackathon, where developers submitted 16 high-quality projects built with Foxit APIs. Submissions ranged from:

  • Automated legal-advice generators

  • Compatibility-rating apps that analyze your personality match

  • AI-powered resume optimizers that tailor your CV to dream-job descriptions

  • Collaborative doodle games that turn drawings into shareable PDFs

Each project offered a new perspective on what’s possible with Foxit APIs—and we loved seeing the creativity.

Among all the sessions, Jorge Euceda’s workshop stood out as a crowd favorite. It showed how to make AI document decisions auditable, explainable, and replayable using event sourcing and two key Foxit APIs. That’s exactly what we’ll walk through below.

Click here to grab the project overview file.

Prefer to follow along with the live session instead of reading step-by-step?
Watch Jorge’s complete “AI-Powered Resume to Report” presentation from API World 2025.
It includes every step shown below—plus real-time API responses.

What You’ll Build

A complete, auditable workflow:

Resume Upload → Extract Resume Data → AI Candidate Scoring → Generate HR Report → Event Store

This workshop is designed for technical professionals and managers who want to learn how to use application programming interfaces (APIs) and explore how AI can enhance document workflows. Attendees will get hands-on experience with Foxit’s PDF Services (extraction/OCR) and Document Generation APIs, and see how event sourcing turns AI decisions into an auditable, replayable ledger.

By the end, you’ll have a Python-based demo that extracts data from a PDF resume, analyzes it against a policy, and generates a polished HR Report PDF with a traceable event log.

Getting Set Up

To follow along, you’ll need:

  • Access to a terminal with a Python 3.9+ environment and internet connectivity

  • Visual Studio Code or your preferred IDE

  • Basic familiarity with REST/JSON (helpful but not required)

 

  1. Install Dependencies
python -V
# virtual environment setup, requests installation
python3 -m venv myenv
source myenv/bin/activate
pip3 install requests
  2. Download the project’s zip file below

Project Source Code

Now extract the files somewhere on your computer and open the folder in Visual Studio Code or your preferred IDE.

You may use any sample resume PDF for inputs/input_resume.pdf. A sample is provided, but you can substitute any resume PDF you want a report on.

  3. Create a Foxit Account for credentials

Create a Free Developer Account now or navigate to our getting started guide, which will go over how to create a free trial.

Hands-On Walkthrough

Step 1 – Open the Project

Now that you’ve downloaded the workshop source code, navigate to the resume_to_report.py file, which will serve as our main entry point.

Once dependencies are installed and the ZIP file extracted, open your workspace and run:

python3 resume_to_report.py

You should see console logs showing:

  • An AI Report printed as JSON

  • A generated PDF (outputs/HR_Report.pdf)

  • An event ledger (outputs/events.json) with traceable actions

Step 2 – Inspect the Outputs

Open the generated HR report to review:

  • Candidate name and phone

  • Overall fit score

  • Matching skills & gaps

  • Summary and policy reference in the footer

Then open events.json to see your audit trail—each entry captures the AI’s decision context.

{
  "eventType": "DecisionProposed",
  "traceId": "8d1e4df6-8ac9-4f31-9b3a-841d715c2b1c",
  "payload": {
    "fitScore": 82,
    "policyRef": "EvaluationPolicy#v1.0"
  }
}

This is your audit trail.
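To make the shape of that ledger concrete, here is a minimal sketch of an append-only event writer. It is not the workshop’s implementation: append_event is a hypothetical helper, storing the ledger as a single JSON array is an assumption, and recordedAt is an extra field added for illustration. Only eventType, traceId, and payload come from the sample above.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path


def append_event(ledger_path, event_type, payload, trace_id=None):
    """Append one event to a JSON-array ledger and return the stored event."""
    path = Path(ledger_path)
    # Load the existing ledger, or start a new one on first use.
    events = json.loads(path.read_text()) if path.exists() else []
    event = {
        "eventType": event_type,
        "traceId": trace_id or str(uuid.uuid4()),
        "recordedAt": datetime.now(timezone.utc).isoformat(),  # assumed extra field
        "payload": payload,
    }
    events.append(event)  # existing entries are never modified or removed
    path.write_text(json.dumps(events, indent=2))
    return event
```

Passing the traceId returned by the first call into later calls keeps every step of one run on the same trace.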

Step 3 – Replay & Explain a Policy Change

Replay demonstrates why event-sourcing matters:

  1. Edit inputs/evaluation_policy.json: add a hard requirement (e.g., "kubernetes") or adjust the job_description emphasis.

  2. Re-run the script with the same resume.

  3. Compare:

    • New decision and updated PDF content

    • Event log now reflects the updated rationale (PolicyLoaded snapshot → new DecisionProposed with the same traceId lineage)

  4. Note what changed: the input resume is the same and only the policy differs, so the event ledger explains the difference.
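The comparison in step 3 can be sketched as a small reader over the ledger. This assumes events.json holds a list of events shaped like the DecisionProposed sample shown earlier; decision_history is a hypothetical helper, not part of the workshop code.

```python
def decision_history(events, trace_id=None):
    """Return (policyRef, fitScore) for each DecisionProposed event, in order."""
    history = []
    for event in events:
        if event.get("eventType") != "DecisionProposed":
            continue
        # Optionally narrow the history to a single trace.
        if trace_id is not None and event.get("traceId") != trace_id:
            continue
        payload = event.get("payload", {})
        history.append((payload.get("policyRef"), payload.get("fitScore")))
    return history
```

Loading the ledger with json.load and printing decision_history(events) before and after the policy edit shows which policy reference produced which score.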

Policy: Drive Auditable & Replayable Decisions

The AI assistant uses a JSON policy file to control how it scores, caps, and summarizes results. Every policy snapshot is logged as its own event, creating a replayable audit trail for governance and compliance.

 

{
  "policyId": "EvaluationPolicy#v1.0",
  "job_description": "Looking for a software engineer with expertise in C++, Python, and AWS cloud services. Experience building scalable applications in agile teams; familiarity with DevOps and CI/CD.",
  "overall_summary": "Make the summary as short as possible",
  "hard_requirements": ["C++", "python", "aws"]
}

Notes:

  • policyId appears in both the report and event log.

  • job_description defines what the AI is looking for.

  • Changing these values creates a new traceable event.
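To see how a hard_requirements change could flip a decision, here is a deliberately naive sketch that checks a resume’s text against the policy. It uses a plain case-insensitive substring match, which is not how the workshop’s AI scoring works; missing_requirements is a hypothetical helper for illustration only.

```python
def missing_requirements(resume_text, policy):
    """Return the hard requirements that do not appear in the resume text."""
    text = resume_text.lower()
    # Case-insensitive substring match, so "python" matches "Python".
    return [req for req in policy.get("hard_requirements", []) if req.lower() not in text]


policy = {
    "policyId": "EvaluationPolicy#v1.0",
    "hard_requirements": ["C++", "python", "aws"],
}
print(missing_requirements("Senior engineer: Python, AWS, Go", policy))  # ['C++']
```

Adding "kubernetes" to the list, as suggested in the replay step, would add it to the output for the same resume text.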

Generate a Polished Report

Next, use the Foxit Document Generation API to fill your Word template and create a formatted PDF report.

Open inputs/hr_report_template.docx. You will find the HR report template, with placeholders for the fields we will fill in:

Tips:

  • Include lightweight branding (logo/header) to make the generated PDF presentation-ready.

  • Include a footer with traceable Policy ID and Trace ID Events

Results and Audit Trail

Here’s what the final HR Report PDF looks like:

Every decision has a Trace ID and Policy Ref, so you can recreate the report at any time and verify how the AI arrived there.

Why Event-Sourced AI Matters

This pattern does more than score resumes—it proves that AI decisions can be transparent, deterministic, and trustworthy.
By using Foxit APIs to extract, analyze, and generate documents, developers can bring auditability to any workflow that relies on machine logic.

Key Takeaways

  • Auditability – Every AI step emits a verifiable event.

  • Replayability – Change a policy and regenerate for deterministic results.

  • Explainability – Decisions carry policy and trace references for clear “why.”

  • Automation – PDF Services and Document Generation handle the document lifecycle end-to-end.

Try It Yourself

Ready to build your own auditable AI workflow?

Closing Thought

At API World, we set out to show how Foxit APIs can power real, transparent AI workflows—and the community response was incredible. Whether you’re building for HR, legal, finance, or creative industries, the same pattern applies:

Make your AI explain itself.

Start with the Foxit APIs, experiment with policies, and turn every AI decision into a traceable event that builds trust.