How to Generate PDFs with Puppeteer in Node.js (And Why You Might Want an Alternative)
A complete guide to PDF generation with Puppeteer — setup, configuration, common problems, and when to consider an API instead.
What is Puppeteer?
Puppeteer is a Node.js library that provides a high-level API to control headless Chrome. One of its most popular use cases is generating PDFs from HTML — you load a page, call page.pdf(), and get a PDF back.
It's simple to get started, but as you'll see, it comes with significant operational overhead.
Basic Puppeteer PDF generation
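const puppeteer = require('puppeteer');

async function generatePDF(html) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Wait until there are no open network connections, so images and stylesheets have loaded
    await page.setContent(html, { waitUntil: 'networkidle0' });
    return await page.pdf({
      format: 'A4',
      margin: { top: '20mm', right: '15mm', bottom: '20mm', left: '15mm' },
      printBackground: true, // include CSS background colors and images
    });
  } finally {
    // Close the browser even if rendering throws, so Chromium is not left running
    await browser.close();
  }
}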
This works for simple cases. Production systems, however, run into several problems.
The problems with Puppeteer
1. Memory consumption
Each Chromium instance consumes 100-300MB of RAM. If you're generating PDFs concurrently, you need a browser pool — and each browser in the pool needs its own memory allocation. A server generating 10 PDFs simultaneously needs 1-3GB just for the browsers.
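One way to keep memory bounded is to cap how many renders run at the same time. The following is a minimal sketch of such a cap, not a full browser pool; `createLimiter` is a hypothetical helper name and is not part of Puppeteer.

```javascript
// Minimal concurrency limiter: at most `limit` tasks run at once,
// the rest wait in a FIFO queue. Hypothetical helper, not a Puppeteer API.
function createLimiter(limit) {
  let active = 0;
  const queue = [];

  const next = () => {
    if (active >= limit || queue.length === 0) return;
    active++;
    const { task, resolve, reject } = queue.shift();
    Promise.resolve()
      .then(task) // also catches synchronous throws from `task`
      .then(resolve, reject)
      .finally(() => {
        active--;
        next(); // start the next queued task, if any
      });
  };

  // `task` is a function returning a value or a promise.
  return function run(task) {
    return new Promise((resolve, reject) => {
      queue.push({ task, resolve, reject });
      next();
    });
  };
}

// Example usage (assumes a generatePDF(html) function like the one above):
// const limit = createLimiter(3);
// const pdfs = await Promise.all(pages.map((html) => limit(() => generatePDF(html))));
```

With a limit of 3 and the 100-300MB figure above, peak browser memory would stay at roughly 300-900MB, at the cost of queued requests waiting longer.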
2. Cold start latency
Launching a new browser takes 1-5 seconds. In a serverless environment (Lambda, Vercel), you pay this cost on every cold start, and on every request if each invocation launches its own browser. Browser pooling helps on traditional servers, but adds complexity.
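On a long-running server, the simplest form of pooling is to launch one browser lazily and reuse it across requests. The sketch below shows the idea under that assumption; `createSharedBrowser` is a hypothetical helper, and `launch` is any function that returns a promise for a browser, such as `() => puppeteer.launch()`.

```javascript
// Lazily launch one browser and share it across requests.
// Hypothetical helper: `launch` is supplied by the caller.
function createSharedBrowser(launch) {
  let browserPromise = null;

  function get() {
    if (!browserPromise) {
      // Cache the promise (not the browser) so concurrent callers share one launch.
      browserPromise = Promise.resolve()
        .then(launch)
        .catch((err) => {
          browserPromise = null; // allow a retry after a failed launch
          throw err;
        });
    }
    return browserPromise;
  }

  async function close() {
    if (!browserPromise) return;
    const pending = browserPromise;
    browserPromise = null;
    const browser = await pending.catch(() => null);
    if (browser) await browser.close();
  }

  return { get, close };
}

// Example usage (assumes `puppeteer` is installed):
// const shared = createSharedBrowser(() => puppeteer.launch());
// const browser = await shared.get();
// const page = await browser.newPage();
// ... render the PDF, then call page.close() rather than browser.close()
```

Only the first request pays the launch cost. The trade-off is that a crashed browser now affects every request until it is relaunched, which is part of the complexity mentioned above.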
3. Docker image size
Chromium adds 400-600MB to your Docker image. This means slower deployments, higher storage costs, and longer cold starts. Many teams end up maintaining a separate "PDF worker" service just to isolate the Chromium dependency.
4. Font rendering inconsistencies
Chromium renders fonts differently depending on the OS, installed fonts, and anti-aliasing settings. A PDF that looks perfect on your Mac may have different font metrics on your Linux CI server.
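A common mitigation is to ship the font with the document instead of relying on whatever the host has installed. The sketch below builds a `<style>` tag that embeds a font as a base64 data URI; `fontFaceStyle` is a hypothetical helper, and it assumes `fontBytes` holds a WOFF2 file.

```javascript
// Build a <style> tag that embeds a font file as a base64 data URI.
// Hypothetical helper; assumes `fontBytes` is a Buffer containing a WOFF2 font.
function fontFaceStyle(family, fontBytes) {
  const base64 = Buffer.from(fontBytes).toString('base64');
  return [
    '<style>',
    '@font-face {',
    `  font-family: '${family}';`,
    `  src: url(data:font/woff2;base64,${base64}) format('woff2');`,
    '}',
    `body { font-family: '${family}', sans-serif; }`,
    '</style>',
  ].join('\n');
}

// Example usage: place the returned tag inside the <head> of the HTML
// that you pass to page.setContent().
```

This makes every environment use the same font file. Hinting and anti-aliasing can still differ between operating systems, so it narrows the problem rather than removing it.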
5. Zombie processes
If your Node.js process crashes or times out, the Chromium process it launched may not be cleaned up. Over time, these orphaned ("zombie") Chrome processes accumulate and can exhaust available memory. You need process management, health checks, and graceful shutdown handling.
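Part of that handling is making sure every render has a deadline and that the browser is closed whether the render succeeds, fails, or times out. The following is a minimal sketch of that pattern; `withTimeout` and `withBrowser` are hypothetical helper names, and `launch` stands in for something like `() => puppeteer.launch()`.

```javascript
// Reject if `work` does not settle within `ms` milliseconds.
function withTimeout(work, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Launch a browser, run `job(browser)` with a deadline, and always close
// the browser afterwards. Hypothetical helper, not a Puppeteer API.
async function withBrowser(launch, job, ms = 30000) {
  const browser = await launch();
  try {
    return await withTimeout(job(browser), ms);
  } finally {
    // Runs on success, on error, and on timeout.
    await browser.close();
  }
}

// Example usage (assumes `puppeteer` is installed):
// const pdf = await withBrowser(() => puppeteer.launch(), async (browser) => {
//   const page = await browser.newPage();
//   await page.setContent(html, { waitUntil: 'networkidle0' });
//   return page.pdf({ format: 'A4' });
// });
```

This covers errors and timeouts inside a live Node.js process. It does not help if the Node.js process itself is killed, which is why external process supervision and health checks are still needed.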
The alternative: API-based PDF generation
Instead of running Chromium yourself, you can send your HTML to an API and get a PDF back:
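const response = await fetch('https://pdfrelay.com/api/v1/convert', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk_live_...',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    source: 'html',
    content: html,
  }),
});
// Fail loudly on HTTP errors instead of treating an error body as a PDF
if (!response.ok) {
  throw new Error(`PDF conversion failed with status ${response.status}`);
}
const pdfBuffer = await response.arrayBuffer();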
No Chromium, no bloated Docker image, no memory management, no zombie processes. The HTML you already know how to write becomes a PDF in milliseconds, rendered by a native Rust engine rather than a 600MB browser binary.
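Since `response.arrayBuffer()` returns an `ArrayBuffer` rather than a Node.js `Buffer`, a small conversion step is usually needed before writing the file. The sketch below also checks for the `%PDF-` header that PDF files start with, as a cheap guard against saving an error payload; `toPdfBuffer` and `savePdf` are hypothetical helper names.

```javascript
const fs = require('node:fs/promises');

// Convert the ArrayBuffer from response.arrayBuffer() into a Node.js Buffer
// and verify that it starts with the PDF magic bytes "%PDF-".
function toPdfBuffer(arrayBuffer) {
  const buffer = Buffer.from(arrayBuffer);
  if (buffer.subarray(0, 5).toString('latin1') !== '%PDF-') {
    throw new Error('Response body does not look like a PDF');
  }
  return buffer;
}

// Hypothetical helper: validate the bytes, then write them to `path`.
async function savePdf(arrayBuffer, path) {
  await fs.writeFile(path, toPdfBuffer(arrayBuffer));
}

// Example usage:
// await savePdf(await response.arrayBuffer(), 'invoice.pdf');
```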
When to use Puppeteer
Puppeteer still makes sense when you need to:
- Screenshot dynamic JavaScript-heavy SPAs
- Scrape web pages that require JavaScript execution
- Run end-to-end browser tests
For PDF generation from HTML/CSS, an API is typically faster, cheaper, and more reliable. Your HTML and CSS skills are all you need: no proprietary markup, no coordinate systems, just the web standards you already know.