Engineering

Build vs Buy: PDF Service

By Mang-Git Ng

Summary of pros and cons when considering building your own PDF service for generating and filling out PDFs.

You can’t get away from PDFs

Every industry has to deal with PDF forms, whether they are industry-specific documents like insurance Acord forms or more general government compliance forms. PDFs are the ubiquitous standard for representing digital versions of paper forms.

Although PDF is a digital file format, building software to manage it is not easy. To start, PDFs are terrible at transferring structured data, which is how most applications communicate today. PDFs were created to represent documents digitally for print: great for digital magazines and newspapers, but terrible if you need a computer to extract meaningful information from them. Software that works with PDFs often requires custom code to draw and render each new PDF. Consequently, updating a PDF requires engineering work, which is both time-consuming and costly.

Whether your software product is creating invoices, generating contracts or filling out standard PDF forms like W-4s and I-9s, managing PDFs is often a core product requirement but rarely a differentiator that helps your product stand out.

Traditional Methods of Creating PDFs

PDFs are great for consumption but terrible for data transmission. We are going to focus on PDFs used for transferring data between parties, such as standard employment forms, Acord insurance documents or government financial forms like those required by the IRS and SEC. There are a few ways to build your own PDF service to fill these PDFs.

Use Chromium in headless mode

This is probably the most common “roll your own” method of PDF creation. To create a PDF, you rely on Chrome (the web browser) to render the page and produce the PDF document. Then you download and save the document to your own file storage system.

The steps

  1. Set up a server to run Chrome in headless mode (or use a service).
  2. Manually create a new HTML template or replicate an existing PDF in HTML.
  3. Render the HTML template in Chromium.
  4. Save the HTML page as a PDF.
  5. Download and store the resulting PDF in your own file system.
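
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a production setup: it assumes a Chromium binary named `chromium` is on the PATH, and the invoice template and the function names (`render_html`, `chrome_pdf_command`, `html_to_pdf`) are hypothetical. The `--headless` and `--print-to-pdf` flags are standard Chromium command-line switches.

```python
import shutil
import subprocess
from pathlib import Path
from string import Template

# Hypothetical invoice template; a real one would mirror the target PDF's layout.
INVOICE_TEMPLATE = Template(
    "<html><body><h1>Invoice $number</h1><p>Total: $total</p></body></html>"
)


def render_html(template: Template, values: dict) -> str:
    """Step 2: fill the HTML template with data."""
    return template.substitute(values)


def chrome_pdf_command(html_path: Path, pdf_path: Path, binary: str = "chromium") -> list:
    """Steps 3-4: build the headless Chromium command that prints a page to PDF."""
    return [
        binary,
        "--headless",
        "--disable-gpu",
        f"--print-to-pdf={pdf_path}",
        html_path.resolve().as_uri(),  # Chromium expects a URL, so use file://
    ]


def html_to_pdf(html: str, workdir: Path, binary: str = "chromium") -> Path:
    """Write the HTML to disk, run Chromium, and return the stored PDF path (step 5)."""
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary} is not installed or not on PATH")
    html_path = workdir / "page.html"
    pdf_path = workdir / "page.pdf"
    html_path.write_text(html, encoding="utf-8")
    subprocess.run(chrome_pdf_command(html_path, pdf_path, binary), check=True, timeout=60)
    return pdf_path
```

Each call launches a full browser process, which is one reason this approach tends to be slow and compute-heavy.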

The Pros

  1. Many people do this and there are plenty of step-by-step tutorials.
  2. Cost effective (kind of, you still have server costs, development costs and a lot of maintenance costs).

The Cons

  1. Extremely slow - Have you ever had to wait 30s for a website to generate a PDF? That’s because the browser has to load and render the whole page before it can print it to PDF.
  2. Not flexible - For each new PDF, an engineer will need to be involved in creating and setting up the template.
  3. Cost - This is compute-intensive and requires expensive engineering resources.

Use a PDF library

There are PDF libraries available in nearly every language imaginable: Python, JavaScript, Ruby, Go, etc. The libraries are open-source and often a good option if you are not trying to generate PDFs at scale.

The steps

  1. Determine which library to use - many libraries only render PDFs (read) or create PDFs (write) but very few do both.
  2. Pull the library into your code repository.
  3. Pull in all dependencies.
  4. Make sure the library can run in your application environment.
    1. Local, cloud, on-prem.
    2. Docker, Kubernetes, Linux etc.
  5. Create some nice wrappers (helper functions, APIs) around the library.
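
As an illustration of step 5, here is a minimal sketch of a wrapper that keeps the chosen library behind a small interface, so it can be swapped or upgraded without touching application code. Every name here (`PdfBackend`, `PdfService`, `EchoBackend`, `fill_form`) is hypothetical and not taken from any particular library; a real adapter would call the library you picked inside `fill_form`.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


class PdfBackend(Protocol):
    """What the wrapper needs from whichever library is chosen (hypothetical interface)."""

    def fill_form(self, template: bytes, fields: Dict[str, str]) -> bytes: ...


@dataclass
class PdfService:
    """Thin wrapper so application code never imports the library directly."""

    backend: PdfBackend
    max_bytes: int = 10_000_000  # arbitrary guard against oversized templates

    def fill(self, template: bytes, fields: Dict[str, object]) -> bytes:
        # PDF files begin with the "%PDF-" header, so reject anything else early.
        if not template.startswith(b"%PDF-"):
            raise ValueError("template does not look like a PDF")
        if len(template) > self.max_bytes:
            raise ValueError("template is too large")
        # Coerce values to strings so callers can pass numbers or dates.
        clean = {name: str(value) for name, value in fields.items()}
        return self.backend.fill_form(template, clean)


class EchoBackend:
    """Stand-in backend for tests; a real adapter would call the chosen library here."""

    def fill_form(self, template: bytes, fields: Dict[str, str]) -> bytes:
        summary = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
        return template + b"\n% filled: " + summary.encode("utf-8")
```

Usage would look like `PdfService(backend=EchoBackend()).fill(template_bytes, {"name": "Ada"})`, with the stand-in backend replaced by an adapter for the real library.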

The Pros

  1. Faster - Often these libraries are much more performant than using Chromium.
  2. Flexible - You have more flexibility in how you want to integrate the library.

The Cons

  1. Completeness - Very few libraries have the full suite of features to manage PDFs.
  2. Maintenance - The libraries are often poorly maintained, and you will have to manage version upgrades.
  3. Scalability - PDF processing is clunky and slow, and libraries that are good enough at low volume often don’t scale under high load.

Use a PDF API service

There are plenty of PDF generation services out there. Many of them have accessible pay-as-you-go pricing, and creating a PDF can be as simple as making an API request.

The Steps

  1. Sign up for the service.
  2. Get developer keys.
  3. Create an HTML version of the PDF you would like to generate.
  4. Make an API request with the HTML template.
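
Step 4 might look something like the sketch below, using only the Python standard library. The endpoint URL, the `{"html": ...}` payload shape and the bearer-token header are assumptions made for illustration; each provider defines its own URL, authentication scheme and request body, so check their documentation.

```python
import json
import urllib.request

# Hypothetical endpoint; substitute the URL from your provider's documentation.
API_URL = "https://pdf-service.example.com/v1/generate"


def build_generate_request(html: str, api_key: str, url: str = API_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the HTML template."""
    body = json.dumps({"html": html}).encode("utf-8")  # assumed payload shape
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


def generate_pdf(html: str, api_key: str) -> bytes:
    """Send the request and return the response body, assumed to be the PDF bytes."""
    request = build_generate_request(html, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()
```

Separating request construction from sending keeps the integration easy to unit test without network access.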

The Pros

  1. It’s easy - usually the APIs are relatively easy to integrate.
  2. Cost effective - avoid upfront development costs.

The Cons

  1. Often limited to just PDF generation from HTML.
  2. Control - you give up control to the service provider.

Why Anvil’s PDF service is different

Anvil’s PDF service brings together the Pros listed above while minimizing the Cons. To start, Anvil’s PDF service is comprehensive, with three different ways to create PDFs over API.

  1. Fill out an existing PDF - simple and accessible.
  2. Generate a PDF with HTML - ultimate control.
  3. Generate a PDF with Markdown - nice balance between control and ease.

Anvil is also extremely easy to use. We provide a simple REST endpoint for you to generate and populate PDFs. Anvil also provides a web-based UI for turning existing PDFs into API-fillable templates. Additionally, there are pre-written client libraries for the most common programming languages, including Python, JavaScript, Java, Go and C#.

Anvil is scalable. Anvil does not use Chromium and has invested significant resources into performance and scalability. This means you can build once and forget it, with no long-term maintenance. We regularly power PDF creation for large technology companies and Fortune 500s.

Anvil is cost-effective. Anvil has an extremely accessible pay-as-you-go plan, and if you are only generating a few PDFs a month, you might not even need to pay.

Other reasons to consider Anvil

Anvil is the paperwork automation platform for builders. We created Anvil to help product teams build, deploy and scale software that automates paperwork, from data collection to document generation and e-signatures. Anvil is built to blend seamlessly with your product, from custom-styled UI components to robust API endpoints and webhooks. Let Anvil help you build table-stakes document automation features so you can focus on building value-added features that make your product stand out.