Convert Office Docs to PDFs Automatically with Foxit PDF Services API

See how to build a powerful, automated workflow that converts Office documents (Word, Excel, PowerPoint) into PDFs. This step-by-step guide uses the Foxit PDF Services API, the Pipedream low-code platform, and Dropbox to create a seamless “hands-off” document processing system. We’ll walk through every step, from triggering on a new file to uploading the final PDF.
Convert Office Docs to PDFs Automatically with Foxit PDF Services API
With our REST APIs, it is now possible for any developer to set up an integration and document workflow using their language of choice. But what about workflow automations? Luckily, this is even simpler (of course, depending on platform) as you can rely on the workflow service to handle a lot the heavy lifting of whatever automation needs you may have. In this blog post, I’m going to demonstrate a workflow making use of Pipedream. Pipedream is a low-code platform that lets you build flexible workflows by piecing together various small atomic steps. It’s been a favorite of mine for some time now, and I absolutely recommend it. But note that what I’ll be showing here today could absolutely be done on other platforms, like n8n.
Want the televised version? Catch the video below:
Our Office Document to PDF Workflow
Our workflow is based on Dropbox folders and handles automatic conversion of Office docs to PDFs. To support that, it does the following:
- Listen for new files in a Dropbox folder
- Do a quick sanity check (is it in the input subdirectory and an Office file)
- Download the file to Pipedream
- Send it to Foxit via the Upload API
- Kick off the appropriate conversion based on the Office type
- Check status via the Status API
- When done, download the result to Pipedream
- And finally, push it up to Dropbox in an output subdirectory
Here’s a nice graphical representation of this workflow:

Before we get into the code, note that workflow platforms like Pipedream are incredibly flexible. When I build workflows with platforms like this I try to make each step as atomic, and focused as possible. I could absolutely have built a shorter, more compact version of this workflow. However, having it broken out like this makes it easier to copy and modify going forward (which is exactly how this one came about, it was based on a simpler, earlier version).
Ok, let's break it down, step-by-step.
Getting Triggered
In Pipedream, workflows begin with a trigger. While there are many options for this, my workflow uses a "New File From Dropbox" trigger. I logged into Dropbox via Pipedream so it had access to my account. I then specified a top level folder, "Foxit", for the integration. Additionally, there are two more important settings:
- Recursive – this tells the trigger to file for any new file under the root directory, "Foxit". My Dropbox Foxit folder has both an input and output directory.
- Include Link – this tells Pipedream to ensure we get a link to the new file. This is required to download it later.

Filtering the Document Flow
The next two steps are focused on filtering and stopping the workflow, if necessary. The first, end_if_output
, is a built-in Pipedream step that lets me provide a condition for the workflow to end. First, I'll check the path value from the trigger (the path of the new file) and if it contains "output", this means it's a new file in the output directory and the workflow should not run.

The next filter is a code step that handles two tasks. First, it checks whether the new file is a supported Office type—.docx
, .xlsx
, or .pptx
—using our APIs. If the extension isn’t one of these, the workflow ends programmatically.
Later in the workflow, I’ll also need that same extension to route the request to the correct endpoint. So the code handles both: validation and preservation of the extension.
import os
def handler(pd: "pipedream"):
base, extension = os.path.splitext(pd.steps['trigger']['event']['name'])
if extension == ".docx":
api = "/pdf-services/api/documents/create/pdf-from-word"
elif extension == ".xlsx":
api = "/pdf-services/api/documents/create/pdf-from-excel"
elif extension == ".pptx":
api = "/pdf-services/api/documents/create/pdf-from-ppt"
else:
return pd.flow.exit(f"Exiting workflow due to unknow extension: {extension}.")
return { "api":api }
As you can see, if the extension isn't valid, I'm exiting the workflow using pd.flow.exit
(while also logging out a proper message, which I can check later via the Pipedream UI). I also return the right endpoint if a supported extension was used. This will be useful later in the flow.
Download and Upload API Data
The next two steps are primarily about moving data from the input source (Dropbox) to our API (Foxit).
The first step, download_to_tmp
, uses a simple Python script to transfer the Dropbox file into the /tmp
directory for use in the workflow
import requests
def handler(pd: "pipedream"):
download_url = pd.steps["trigger"]["event"]["link"]
file_path = f"/tmp/{pd.steps['trigger']['event']['name']}"
with requests.get(download_url, stream=True) as response:
response.raise_for_status()
with open(file_path, "wb") as file:
for chunk in response.iter_content(chunk_size=8192):
file.write(chunk)
return file_path
Notice at the end that I return the path I used in Pipedream. This action then leads directly into the next step of uploading to Foxit via the Upload API:
import os
import requests
def handler(pd: "pipedream"):
clientid = os.environ.get('FOXIT_CLIENT_ID')
secret = os.environ.get('FOXIT_CLIENT_SECRET')
HOST = os.environ.get('FOXIT_HOST')
headers = {
"client_id":clientid,
"client_secret":secret
}
with open(pd.steps['download_to_tmp']['$return_value'], 'rb') as f:
files = {'file': (pd.steps['download_to_tmp']['$return_value'], f)}
request = requests.post(f"{HOST}/pdf-services/api/documents/upload", files=files, headers=headers)
return request.json()
The result of this will be a documentId
value that looks like so:
{
"documentId": "<string>"
}
Pipedream lets you define environment variables and I've made use of them for my Foxit credentials and host. Grab your own free credentials here!
Converting the Document Using the Foxit API
The next step will actually kick off the conversion. My workflow supports three different input types (Word, PowerPoint, and Excel). These map to three API endpoints. But remember that earlier we sniffed the extension of our input and set the endpoint there. Since all three APIs work the same, that's literally all we need to do – hit the endpoint and pass the document value from the previous step.
import os
import requests
def handler(pd: "pipedream"):
clientid = os.environ.get('FOXIT_CLIENT_ID')
secret = os.environ.get('FOXIT_CLIENT_SECRET')
HOST = os.environ.get('FOXIT_HOST')
headers = {
"client_id":clientid,
"client_secret":secret,
"Content-Type":"application/json"
}
body = {
"documentId": pd.steps['upload_to_foxit']['$return_value']['documentId']
}
api = pd.steps['extension_check']['$return_value']['api']
print(f"{HOST}{api}")
request = requests.post(f"{HOST}{api}", json=body, headers=headers)
return request.json()
{
"taskId": "<string>"
}
Checking Your Document API Status
The next step is one that may take a few seconds – checking the job status. Foxit's endpoint returns a value like so:
{
"taskId": "<string>",
"status": "<string>",
"progress": "<int32>",
"resultDocumentId": "<string>",
"error": {
"code": "<string>",
"message": "<string>"
}
}
import os
import requests
from time import sleep
def handler(pd: "pipedream"):
clientid = os.environ.get('FOXIT_CLIENT_ID')
secret = os.environ.get('FOXIT_CLIENT_SECRET')
HOST = os.environ.get('FOXIT_HOST')
headers = {
"client_id":clientid,
"client_secret":secret,
"Content-Type":"application/json"
}
done = False
while done is False:
request = requests.get(f"{HOST}/pdf-services/api/tasks/{pd.steps['create_conversion_job']['$return_value']['taskId']}", headers=headers)
status = request.json()
if status["status"] == "COMPLETED":
done = True
return status
elif status["status"] == "FAILED":
print("Failure. Here is the last status:")
print(status)
return pd.flow.exit("Failure in job")
else:
print(f"Current status, {status['status']}, percentage: {status['progress']}")
sleep(5)
As shown, errors are simply logged by default—but you could enhance this by adding notifications, such as emailing an admin, sending a text message, or other alerts.
On success, the final output is passed along, including the key value we care about: resultDocumentId
.
Download and Upload – Again
Ok, if the workflow has gotten this far, it's time to finish the process. The next step handles downloading the result from Foxit using the download endpoint:
import requests
import os
def handler(pd: "pipedream"):
clientid = os.environ.get('FOXIT_CLIENT_ID')
secret = os.environ.get('FOXIT_CLIENT_SECRET')
HOST = os.environ.get('FOXIT_HOST')
headers = {
"client_id":clientid,
"client_secret":secret,
}
# Given a file of input.docx, we need to use input.pdf
base_name, _ = os.path.splitext(pd.steps['trigger']['event']['name'])
path = f"/tmp/{base_name}.pdf"
print(path)
with open(path, "wb") as output:
bits = requests.get(f"{HOST}/pdf-services/api/documents/{pd.steps['check_job']['$return_value']['resultDocumentId']}/download", stream=True, headers=headers).content
output.write(bits)
return {
"filename":f"{base_name}.pdf",
"path":path
}
Note that I'm using the base name
of the input, which is basically the filename minus the extension. So for example, input.docx
will become input
, which I then slap a pdf
extension on to create the filename used to store locally to Pipedream.
Finally, I push the file back up to Dropbox, but for this, I can use a built-in Pipedream step that can upload to Dropbox. Here's how I configured it:
- Path: Once again,
Foxit
- File Name: This one's a bit more complex, I want to store the value in the output subdirectory, and ensure the filename is dynamic. Pipedream lets you mix and match hard-coded values and expressions. I used this to enable that:
output/{{steps.download_result_to_tmp.$return_value.filename}}
. In this expression the portion inside the double bracket will be dynamic based on the PDF file generated previously. - File Path: This is an expression as well, pointing to where I saved the file previously:
{{steps.download_result_to_tmp.$return_value.path}}
- Mode: Finally, the mode attribute specifies what to do on a conflict. This setting will be based on whatever your particular workflow needs are, but for my workflow, I simply told Dropbox to overwrite the existing file.
Here's how that step looks configured in Pipedream:

Conclusion
Believe it or not, that's the entire workflow. Once enabled, it runs in the back ground and I can simply place any files into my Dropbox folder and my Office docs will be automatically converted. What's next? Definitely get your own free credentials and check out the docs to get started. If you run into any trouble at all, hit is up on the forums and we'll be glad to help!
How to Chain PDF Actions with Foxit

Performing a single action with the Foxit PDF Services API is straightforward, but what’s the best way to handle a sequence of operations? Instead of downloading and re-uploading a file for each step, you can chain actions together by passing the output of one job as the input for the next. This tutorial walks you through a complete Python example of how to build an efficient document optimization workflow that compresses and then linearizes a PDF.
How to Chain PDF Actions with Foxit
When working with Foxit’s PDF Services, you’ll remember that the basic flow involves:
- Uploading your document to Foxit to get an ID
- Starting a job
- Checking the job
- Downloading the result
This is handy for one off operations, for example, converting a Word document to PDF, but what if you need to do two or more operations? Luckily this is easy enough by simply handing off one result to the next. Let’s take a look at how this can work.
Credentials
Remember, to start developing and testing with the APIs, you’ll need to head over to our developer portal and grab a set of free credentials. This will include a client ID and secret values you’ll need to make use of the API.
If you would rather watch a video (or why not both?) – you can watch the walkthrough below:
Creating a Document Optimization Workflow
To demonstrate how to chain different operations together, we’re going to build a basic document optimization workflow that will:
- Compress the document by reducing image resolution and other compression algorithims.
- Linearize the document to make it better viewable on the web.
Given the basic flow described above, you may be tempted to do this:
- Upload the PDF
- Kick off the Compress job
- Check until done
- Download the compressed PDF
- Upload the PDF
- Kick off the Linearize job
- Check until done
- Download the compressed and linearized PDF
This wouldn’t require much code, but we can simplify the process by using the result of the compress job—once it’s complete—as the source for the linearize job. This gives us the following streamlined flow:
- Upload the PDF
- Kick off the Compress job
- Check until done
- Kick off the Linearize job
- Check until done
- Download the compressed and linearized PDF
Less is better! Alright, let’s look at the code.
First, here’s the typical code used to bring in our credentials from the environment, and define the Upload
job:
import os
import requests
import sys
from time import sleep
CLIENT_ID = os.environ.get('CLIENT_ID')
CLIENT_SECRET = os.environ.get('CLIENT_SECRET')
HOST = os.environ.get('HOST')
def uploadDoc(path, id, secret):
headers = {
"client_id":id,
"client_secret":secret
}
with open(path, 'rb') as f:
files = {'file': f}
request = requests.post(f"{HOST}/pdf-services/api/documents/upload", files=files, headers=headers)
return request.json()
def compressPDF(doc, level, id, secret):
headers = {
"client_id":id,
"client_secret":secret,
"Content-Type":"application/json"
}
body = {
"documentId":doc,
"compressionLevel":level
}
request = requests.post(f"{HOST}/pdf-services/api/documents/modify/pdf-compress", json=body, headers=headers)
return request.json()
def linearizePDF(doc, id, secret):
headers = {
"client_id":id,
"client_secret":secret,
"Content-Type":"application/json"
}
body = {
"documentId":doc
}
request = requests.post(f"{HOST}/pdf-services/api/documents/optimize/pdf-linearize", json=body, headers=headers)
return request.json()
Note that the compressPDF
method takes a required level
argument that defines the level of compression. From the docs, we can see the supported values are LOW
, MEDIUM
, and HIGH
.
Now, two more utility methods – one that checks the task returned by the API operations above and one that downloads a result to the file system:
def checkTask(task, id, secret):
headers = {
"client_id":id,
"client_secret":secret,
"Content-Type":"application/json"
}
done = False
while done is False:
request = requests.get(f"{HOST}/pdf-services/api/tasks/{task}", headers=headers)
status = request.json()
if status["status"] == "COMPLETED":
done = True
# really only need resultDocumentId, will address later
return status
elif status["status"] == "FAILED":
print("Failure. Here is the last status:")
print(status)
sys.exit()
else:
print(f"Current status, {status['status']}, percentage: {status['progress']}")
sleep(5)
def downloadResult(doc, path, id, secret):
headers = {
"client_id":id,
"client_secret":secret
}
with open(path, "wb") as output:
bits = requests.get(f"{HOST}/pdf-services/api/documents/{doc}/download", stream=True, headers=headers).content
output.write(bits)
input = "../../inputfiles/input.pdf"
print(f"File size of input: {os.path.getsize(input)}")
doc = uploadDoc(input, CLIENT_ID, CLIENT_SECRET)
print(f"Uploaded doc to Foxit, id is {doc['documentId']}")
task = compressPDF(doc["documentId"], "HIGH", CLIENT_ID, CLIENT_SECRET)
print(f"Created task, id is {task['taskId']}")
result = checkTask(task["taskId"], CLIENT_ID, CLIENT_SECRET)
print("Done converting to PDF. Now doing linearize.")
task = linearizePDF(result["resultDocumentId"], CLIENT_ID, CLIENT_SECRET)
print(f"Created task, id is {task['taskId']}")
result = checkTask(task["taskId"], CLIENT_ID, CLIENT_SECRET)
print("Done with linearize task.")
output = "../../output/really_optimized.pdf"
downloadResult(result["resultDocumentId"], output , CLIENT_ID, CLIENT_SECRET)
print(f"Done and saved to: {output}.")
print(f"File size of output: {os.path.getsize(output)}")
This code matches the flow described above, with the exception of outputting the size as a handy way to see the result of the compression call. When run, the initial size is 355994 bytes and the final size is 16733. That's a great saving! You should, however, ensure the result matches the quality you desire and if not, consider reducing the level of compression. Linearize doesn't impact the file size, but as stated above will make it work nicer on the web.
For a complete listing, find the sample on our GitHub repo.
Next Steps
Obviously, you could do even more chaining based on the code above. For example, as part of your optimization flow, you could even split the PDF to return a 'sample' of a document that may be for sale. You could extract information to use for AI purposes and more. Dig more into our PDF Service APIs to get an idea and let us know what you build on our developer forums!
How to Extract Text from PDFs using Foxit’s REST APIs

Want to extract text from PDF files with just a few lines of Python? This guide shows how to use Foxit’s REST Extract API to pull text content from PDFs, ideal for search, automation, or AI workflows. From setting up credentials to searching for keywords across multiple files, this post walks through the full process with example code and GitHub demos.
How to Extract Text from PDFs using Foxit’s REST APIs
PDFs are an excellent way to store information—they combine text, images, and more in a perfectly laid-out, eye-catching design that fulfills every marketer’s wildest dreams. But sometimes you just need the text! There’s a variety of reasons you may want to convert a rich PDF document into plain text:
- For indexing in a search engine
- To search documents for keywords
- To pass to generative AI services for introspection
Let’s take a look at the Extract API to see just how easy this is.
Start Here: Obtain Free Credentials to Use the Foxit API
Before we go any further, head over to our developer portal and grab a set of free credentials. This will include a client ID and secret values – you’ll need both to make use of the API.
Rather watch the movie version? Check out the video below:
Foxit PDF API Workflow Overview with Python
The API follows the same format as the rest of our PDF Services in that you upload your input, kick off the job, check the job’s status, and download the result. As we’ve covered this a few times now on the blog (see my introductory post, we’ll skip over the details of uploading the document and loading in credentials. Here’s the Python code we’ve demonstrated before showing this in action:
CLIENT_ID = os.environ.get('CLIENT_ID')
CLIENT_SECRET = os.environ.get('CLIENT_SECRET')
HOST = os.environ.get('HOST')
def uploadDoc(path, id, secret):
headers = {
"client_id":id,
"client_secret":secret
}
with open(path, 'rb') as f:
files = {'file': (path, f)}
request = requests.post(f"{HOST}/pdf-services/api/documents/upload", files=files, headers=headers)
return request.json()
doc = uploadDoc("../../inputfiles/input.pdf", CLIENT_ID, CLIENT_SECRET)
print(f"Uploaded pdf to Foxit, id is {doc['documentId']}")
Now let's get into the meat of the Extract API. The API takes three arguments:
- The ID of the previously uploaded document.
- The type of information to extract—either TEXT, IMAGE, or PAGE. In theory, it should be pretty obvious what these do, but just in case: TEXT returns the text contents of the PDF. IMAGE gives you a ZIP file of images from the PDF. PAGE returns a new PDF containing just the page you requested.
- You can also pass in a page range, which can be a combo of specific pages and ranges. If you don’t include one, the entire PDF gets processed for extraction.
To make this simple to use, I've built a wrapper function that lets you pass these arguments:
def extractPDF(doc, type, id, secret, pageRange=None):
headers = {
"client_id":id,
"client_secret":secret,
"Content-Type":"application/json"
}
body = {
"documentId":doc,
"extractType":type
}
if pageRange:
body["pageRange"] = pageRange
request = requests.post(f"{HOST}/pdf-services/api/documents/modify/pdf-extract", json=body, headers=headers)
return request.json()
Literally, that's it. At this point, you get a task
object back that – like with our other APIs – can be checked for completion, and once it’s done, the results can be downloaded. Since we're working with text, though, let's simplify and just grab the text as a variable:
def getResult(doc, id, secret):
headers = {
"client_id":id,
"client_secret":secret
}
return requests.get(f"{HOST}/pdf-services/api/documents/{doc}/download", headers=headers).text
doc = uploadDoc("../../inputfiles/input.pdf", CLIENT_ID, CLIENT_SECRET)
print(f"Uploaded pdf to Foxit, id is {doc['documentId']}")
task = extractPDF(doc["documentId"], "TEXT", CLIENT_ID, CLIENT_SECRET)
print(f"Created task, id is {task['taskId']}")
result = checkTask(task["taskId"], CLIENT_ID, CLIENT_SECRET)
print(f"Final result: {result}")
text = getResult(result["resultDocumentId"], CLIENT_ID, CLIENT_SECRET)
print(text)
Searching PDFs for Keywords
# Get PDFs from our input directory
inputFiles = list(filter(lambda x: x.endswith('.pdf'), os.listdir('../../inputfiles')))
# Keyword to match on:
keyword = "Shakespeare"
for file in inputFiles:
doc = uploadDoc(f"../../inputfiles/{file}", CLIENT_ID, CLIENT_SECRET)
print(f"Uploaded pdf, {file}, to Foxit, id is {doc['documentId']}")
task = extractPDF(doc["documentId"], "TEXT", CLIENT_ID, CLIENT_SECRET)
result = checkTask(task["taskId"], CLIENT_ID, CLIENT_SECRET)
text = getResult(result["resultDocumentId"], CLIENT_ID, CLIENT_SECRET)
if keyword in text:
print(f"\033[32mThe pdf, {file}, matched on our keyword: {keyword}\033[0m")
else:
print(f"The pdf, {file}, did not match on our keyword: {keyword}")
print("")

What’s Next?
Introducing PDF APIs from Foxit

Get started with Foxit’s new PDF APIs—convert Word to PDF, generate documents, and embed files using simple, scalable REST APIs. Includes sample Python code and walkthrough.
Introducing PDF APIs from Foxit
At the end of June, Foxit introduced a brand-new suite of tools to help developers work with documents. These APIs cover a wide range of features, including:
- Convert between Office document formats and PDF files seamlessly
- Optimize, manipulate, and secure PDFs with advanced APIs
- Generate dynamic documents using Microsoft Word templates
- Extract text and images from PDFs with powerful tools
- Embed PDFs into web pages in a context-aware, controlled manner
- Integrate with eSign APIs for streamlined signature workflows
These APIs are simple to use, and best of all, follow the “don’t surprise me” principal of development. In this post, I’m going to demonstrate one simple example – converting a Word document to PDF – but you can rest assured that nearly all the APIs will follow incredibly similar patterns. I’ll be using Python for my examples here, but will link to a Node.js version of the same example. And given that we’re talking REST APIs here, any language is welcome to join the document party. Let’s dive in.
Credentials
Before we go any further, head over to our developer portal and grab a set of free credentials. This will include a client ID and secret values you’ll need to make use of the API.
Don’t want to read all of this? You can also follow along by video:
API Flow
As I mentioned above, most of the PDF Services APIs will follow a similar flow. This comes down to:
- Upload your input (like a Word document)
- Kick off a job (like converting to PDF)
- Check the job (hey, how ya doin?)
- Download the result
Or, in pretty graphical format –

The great thing is, once you’ve completed one integration (this post focuses on converting Word to PDF), switching to another is easy—and much of your existing code can be reused. A lazy developer is happy developer! Let’s get started.
Loading Credentials
My script begins by loading the credentials and API root host via the environment:
CLIENT_ID = os.environ.get('CLIENT_ID')
CLIENT_SECRET = os.environ.get('CLIENT_SECRET')
HOST = os.environ.get('HOST')
It’s never a good idea to hard-code credentials in your code. But if you do it this one time, I won’t tell. Honest.
Uploading Your Input
As I mentioned, in this example we’ll be making use of the Word to PDF API. Our input will be a Word document, which we’ll upload to Foxit using the upload API. This endpoint is fairly simple – aside from your credentials, all you need to provide is the binary data of the input file. Here’s the method I created to make this process easier:
def uploadDoc(path, id, secret):
headers = {
"client_id":id,
"client_secret":secret
}
with open(path, 'rb') as f:
files = {'file': (path, f)}
request = requests.post(f"{HOST}/pdf-services/api/documents/upload", files=files, headers=headers)
return request.json()
And here’s how it’s used:
doc = uploadDoc("../../inputfiles/input.docx", CLIENT_ID, CLIENT_SECRET)
print(f"Uploaded doc to Foxit, id is {doc['documentId']}")
The upload API only returns one value, a documentId
, which we can use in future calls.
Starting the Job
Each API operation is a job creator. By this I mean you call the endpoint and it begins your action. For Word to PDF, the only required input is the document ID from the previous call. We can build a nice little wrapper function like so:
def convertToPDF(doc, id, secret):
headers = {
"client_id":id,
"client_secret":secret,
"Content-Type":"application/json"
}
body = {
"documentId":doc
}
request = requests.post(f"{HOST}/pdf-services/api/documents/create/pdf-from-word", json=body, headers=headers)
return request.json()
And then call it like so:
task = convertToPDF(doc["documentId"], CLIENT_ID, CLIENT_SECRET)
print(f"Created task, id is {task['taskId']}")
The result of this call, if no errors were found, isa taskId
. We can use this to gauge how the job’s performing. Let’s do that now.
Job Checking
Ok, so the next part can be a bit tricky depending on your language of choice. We need to use the task status endpoint to determine how the job is performing. How often we do this, how quickly and so forth, will depend on your platform and needs. For our little sample script here, everything is running at once. I wrote a function that will check the status. If the job isn’t finished (whether successful or not), it pauses briefly before trying again. While this approach isn’t the most sophisticated, it should work well enough for basic testing:
def checkTask(task, id, secret):
headers = {
"client_id":id,
"client_secret":secret,
"Content-Type":"application/json"
}
done = False
while done is False:
request = requests.get(f"{HOST}/pdf-services/api/tasks/{task}", headers=headers)
status = request.json()
if status["status"] == "COMPLETED":
done = True
# really only need resultDocumentId, will address later
return status
elif status["status"] == "FAILED":
print("Failure. Here is the last status:")
print(status)
sys.exit()
else:
print(f"Current status, {status['status']}, percentage: {status['progress']}")
sleep(5)
As you can see, I’m using a while
loop that—at least in theory—will continue running until a success or failure response is returned, with a five-second pause between each call. You can adjust that interval as needed—test different values to see what works best for your use case. Typically, most API calls should complete in under ten seconds, so a five-second delay felt like a reasonable default.
Each call to the endpoint returns a task status result. Here’s an example:
{
'taskId': '685abc95a0d113558e4204d7',
'status': 'COMPLETED',
'progress': 100,
'resultDocumentId': '685abc952475582770d6917b'
}
The important part here is the status. But you could also use progress
to give some feedback to the code waiting for results. Here’s my code calling this:
result = checkTask(task["taskId"], CLIENT_ID, CLIENT_SECRET)
print(f"Final result: {result}")
Downloading Your Result
The last piece of the puzzle is simply saving the result. If you noticed above, the task returned a resultDocumentId
value. Taking that, and the [Download Document](NEED LINK) endpoint, we can build a utility to store the result like so:
def downloadResult(doc, path, id, secret):
headers = {
"client_id":id,
"client_secret":secret
}
with open(path, "wb") as output:
bits = requests.get(f"{HOST}/pdf-services/api/documents/{doc}/download", stream=True, headers=headers).content
output.write(bits)
And finally, call it:
downloadResult(result["resultDocumentId"], "../../output/input.pdf", CLIENT_ID, CLIENT_SECRET)
print("Done and saved to: ../../output/input.pdf")
And that’s it! While this script could certainly benefit from more robust error handling, it demonstrates the basic flow. As mentioned, most of our APIs follow this same logic.
Next Steps
Want the complete scripts? Get it on GitHub.
Want it in Node.js? Get it on GitHub.
Rather try this yourself? Sign up for a free developer account now. Need help? Head over to our developer forums and post your questions and comments.