Stirling-PDF: Self-Host Your Own PDF Toolkit
PDF work is surprisingly common — merging reports, splitting invoices, compressing files before upload, converting scanned documents to searchable PDFs. Cloud services like ilovepdf.com and smallpdf.com handle this, but they process your files on their servers. For sensitive documents — financial records, legal files, medical paperwork — self-hosting a PDF toolkit keeps your files private.
Stirling-PDF is a locally-hosted web application that provides a comprehensive PDF toolkit: merge, split, compress, convert, rotate, watermark, OCR, and 50+ other operations. Everything runs in Docker on your own server; no files ever leave your network.

What Stirling-PDF Can Do
The core operations cover virtually every PDF task you'd encounter:
Document manipulation:
- Merge multiple PDFs into one
- Split PDFs by page range, every N pages, or into individual pages
- Rotate, reorder, or delete specific pages
- Extract pages or page ranges to new files
Conversion:
- PDF to Word/Excel/PowerPoint (via LibreOffice)
- Office documents, images, HTML, Markdown → PDF
- PDF to images (PNG, JPEG, TIFF)
- Images to PDF with configurable layout
Optimization:
- Compress PDFs (up to 90% size reduction with Ghostscript, depending on content)
- Remove metadata, annotations, or embedded files
- Linearize PDFs for fast web viewing
Enhancement:
- Add text or image watermarks with configurable opacity and position
- Add headers and footers with page numbers, date, custom text
- Add passwords and permission controls (print-only, no-copy, etc.)
OCR (optical character recognition):
- Make scanned PDFs searchable using Tesseract OCR
- Available in 100+ languages
- Creates text layers without altering the visual appearance
Advanced:
- PDF/A conversion for long-term archival
- Crop, resize, flatten form fields
- Repair corrupted PDFs
- Compare two PDFs visually (highlight differences)
- Redact text or regions permanently
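To make the "split every N pages" option from the list above concrete, here is a minimal Python sketch of how such page ranges can be computed. It only illustrates the semantics and is not Stirling-PDF's own implementation; the function name `split_every_n` is hypothetical.

```python
def split_every_n(total_pages: int, n: int) -> list[tuple[int, int]]:
    """Return inclusive, 1-based (first, last) page ranges of at most n pages each."""
    if total_pages < 0:
        raise ValueError("total_pages must not be negative")
    if n < 1:
        raise ValueError("n must be at least 1")
    # Each chunk starts n pages after the previous one; the last chunk may be shorter.
    return [
        (start, min(start + n - 1, total_pages))
        for start in range(1, total_pages + 1, n)
    ]
```

For a 10-page document split every 4 pages, this yields three ranges: pages 1-4, 5-8, and 9-10.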
Why Self-Host Instead of Using Cloud Tools
| Aspect | Cloud Services | Stirling-PDF |
|---|---|---|
| Privacy | Files processed on vendor servers | Files never leave your server |
| Cost | Free with limits / $10-20/month for pro | Free after hosting costs |
| File size limits | Often 20-100MB | Configurable, often 2GB+ |
| Usage limits | Limited operations/day | Unlimited |
| Availability | Depends on internet | Works on local network |
| Sensitive docs | Trust vendor's privacy policy | You control everything |
For personal use with non-sensitive documents, cloud tools are fine. For business documents such as contracts, financial reports, and HR files, Stirling-PDF keeps every file on infrastructure you control.
Installation
Stirling-PDF runs cleanly in Docker. The recommended deployment includes the full feature set (OCR, office document conversion):
# docker-compose.yml
services:
  stirling-pdf:
    image: frooodle/s-pdf:latest
    ports:
      - "8080:8080"
    volumes:
      - ./trainingData:/usr/share/tesseract-ocr/5/tessdata # OCR language data
      - ./extraConfigs:/configs # custom config
      - ./logs:/logs # optional logging
    environment:
      - DOCKER_ENABLE_SECURITY=false # set to true to enable login/auth
      - INSTALL_BOOK_AND_ADVANCED_HTML_OPS=false # set to true to add Calibre/WKHTMLTOPDF
      - LANGS=en_GB # locales/fonts for document conversion; OCR languages come from tessdata
With security enabled (recommended for public access):
    environment:
      - DOCKER_ENABLE_SECURITY=true
      - SECURITY_ENABLELOGIN=true
      - SECURITY_INITIALLOGIN_USERNAME=admin
      - SECURITY_INITIALLOGIN_PASSWORD=yourpassword
Start with:
docker compose up -d
Access at http://your-server-ip:8080.
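If you script the deployment, it can help to confirm that the web UI answers before going further. A minimal sketch using only Python's standard library; the helper name `is_up` is hypothetical, and it counts any 2xx or 3xx response as "up", which is an assumption rather than an official health check.

```python
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

For example, `is_up("http://your-server-ip:8080")` should return True once the container has finished starting.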
OCR Language Support
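Stirling-PDF uses Tesseract for OCR. By default, English is included. Additional languages require downloading their data files. Note that a bind mount such as ./trainingData replaces the container's tessdata directory, so the mounted folder must contain every language file you need, including eng.traineddata: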
# Run inside the container, or on the host with -P ./trainingData (the mounted volume)
# Download German language data:
wget -P /usr/share/tesseract-ocr/5/tessdata \
  https://github.com/tesseract-ocr/tessdata/raw/main/deu.traineddata
# French:
wget -P /usr/share/tesseract-ocr/5/tessdata \
  https://github.com/tesseract-ocr/tessdata/raw/main/fra.traineddata
Available languages are listed at the tessdata GitHub repository.
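When several languages are needed, the download URLs follow the same pattern as the wget commands above. A small Python sketch that builds them from language codes; the helper names `tessdata_url` and `download_languages` are hypothetical, and the base URL is taken from the commands above.

```python
import urllib.request
from pathlib import Path

# Same repository path as in the wget examples.
TESSDATA_BASE = "https://github.com/tesseract-ocr/tessdata/raw/main"


def tessdata_url(lang: str) -> str:
    """Build the download URL for a Tesseract language code such as 'deu'."""
    code = lang.strip()
    if not code or not all(ch.isalnum() or ch == "_" for ch in code):
        raise ValueError(f"unexpected language code: {lang!r}")
    return f"{TESSDATA_BASE}/{code}.traineddata"


def download_languages(langs: list[str], dest_dir: str) -> list[Path]:
    """Download each language file into dest_dir and return the saved paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for lang in langs:
        target = dest / f"{lang.strip()}.traineddata"
        urllib.request.urlretrieve(tessdata_url(lang), str(target))
        saved.append(target)
    return saved
```

Calling `download_languages(["deu", "fra"], "./trainingData")` on the host would place both files in the mounted volume from the compose file.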
Office Document Conversion
Converting between PDF and Office formats (Word, Excel, PowerPoint) requires LibreOffice, which is included in the main image but not the -ultra-lite variant:
# Use the full image for Office conversion support:
image: frooodle/s-pdf:latest
# Ultra-lite image (no office conversion, smaller download):
image: frooodle/s-pdf:latest-ultra-lite
Office conversions are slower than pure PDF operations — converting a 50-page Word document to PDF typically takes 5-15 seconds. Quality is generally good for text-heavy documents; complex layouts with precise positioning may shift slightly.
API Access
Stirling-PDF provides a REST API for automation. Every operation in the web UI has a corresponding API endpoint.
Example — compress a PDF via curl:
# fileInput is the upload field; input.pdf is a placeholder path
curl -X POST http://localhost:8080/api/v1/misc/compress-pdf \
  -F 'fileInput=@input.pdf' \
  -F 'optimizeLevel=3' \
  -o compressed.pdf
Example — merge PDFs:
# file1.pdf to file3.pdf are placeholder paths
curl -X POST http://localhost:8080/api/v1/general/merge-pdfs \
  -F 'fileInput=@file1.pdf' \
  -F 'fileInput=@file2.pdf' \
  -F 'fileInput=@file3.pdf' \
  -o merged.pdf
The full API documentation is available at http://your-server:8080/swagger-ui/index.html — Stirling-PDF ships with Swagger UI built in.
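The same calls can be driven from a script. Below is a minimal Python sketch using only the standard library. It assumes login is disabled and that the upload field is named `fileInput`, which is worth confirming in the Swagger UI for your version. The helper names `encode_multipart` and `merge_pdfs` are illustrative and not part of Stirling-PDF.

```python
import urllib.request
import uuid
from pathlib import Path


def encode_multipart(fields, files, boundary=""):
    """Encode text fields and (field, filename, bytes) files as multipart/form-data.

    Returns (body, content_type). A random boundary is generated if none is given.
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    for name, filename, data in files:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            "Content-Type: application/pdf\r\n\r\n"
        )
        parts.append(header.encode() + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def merge_pdfs(base_url, paths, out_path):
    """POST several PDFs to the merge endpoint and save the merged result."""
    # Assumption: each file is sent under the repeated field name "fileInput".
    files = [("fileInput", Path(p).name, Path(p).read_bytes()) for p in paths]
    body, content_type = encode_multipart({}, files)
    request = urllib.request.Request(
        f"{base_url}/api/v1/general/merge-pdfs",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as resp:
        Path(out_path).write_bytes(resp.read())
```

A call such as `merge_pdfs("http://localhost:8080", ["file1.pdf", "file2.pdf"], "merged.pdf")` would mirror the curl merge example. Hand-rolling the multipart body avoids a third-party dependency; a library like requests would shorten this considerably.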
Reverse Proxy Setup
For access over HTTPS with a custom domain:
Caddy:
pdf.yourdomain.com {
    reverse_proxy stirling-pdf:8080
}
Nginx:
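server {
    listen 443 ssl;
    server_name pdf.yourdomain.com;
    # Placeholder certificate paths; point these at your own files
    ssl_certificate /etc/ssl/certs/pdf.yourdomain.com.crt;
    ssl_certificate_key /etc/ssl/private/pdf.yourdomain.com.key;
    location / {
        proxy_pass http://stirling-pdf:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        # Large file uploads:
        client_max_body_size 500M;
    }
}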
Set client_max_body_size generously in nginx, which defaults to 1 MB, and do the same if you cap uploads in Caddy with the request_body directive's max_size option. PDF files, especially scanned documents, can be large.
Stirling-PDF vs. Alternatives
Gotenberg — API-only PDF service (no web UI). Better for automated/programmatic use cases integrated into applications. Not suitable as a user-facing tool.
LibreOffice headless — if you only need Office↔PDF conversion, running LibreOffice in headless mode is simpler. Stirling-PDF uses LibreOffice internally for this operation.
CloudConvert / ilovepdf — cloud services with generous free tiers. Appropriate for non-sensitive files and occasional use. Not appropriate for private documents.
PDF.js Express — client-side PDF viewer with some editing. Runs in the browser; limited manipulation capability.
Stirling-PDF wins when you need: web UI for non-technical users, comprehensive operations beyond conversion, privacy for sensitive files, or integration via API.
Resource Requirements
Stirling-PDF is relatively lightweight:
| Operation | CPU | RAM |
|---|---|---|
| Merge/split | Low | ~200MB |
| Compress (ghostscript) | Medium | ~300MB |
| OCR (Tesseract) | High | ~400MB |
| Office conversion (LibreOffice) | High | ~800MB |
For typical home or small business use, a 1-2 vCPU machine with 1-2GB RAM handles all operations without issue. Office conversions run one at a time and may queue under load.
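As a rough sanity check on sizing, the table's figures can be turned into a back-of-the-envelope estimate. The Python sketch below assumes the RAM figures add up when operations overlap and applies a 25% headroom factor; both are illustrative assumptions, not measurements, and the name `peak_ram_mb` is hypothetical.

```python
import math

# Approximate per-operation RAM figures from the table above, in MB.
APPROX_RAM_MB = {
    "merge_split": 200,
    "compress": 300,
    "ocr": 400,
    "office": 800,
}


def peak_ram_mb(concurrent_ops: list[str], headroom: float = 0.25) -> int:
    """Estimate RAM in MB for operations running at the same time, plus headroom."""
    base = sum(APPROX_RAM_MB[op] for op in concurrent_ops)
    return math.ceil(base * (1 + headroom))
```

Under these assumptions, an OCR job overlapping with an Office conversion comes to about 1500 MB, which fits the 1-2GB guidance above.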
Conclusion
Stirling-PDF is the most complete self-hosted PDF toolkit available. It covers operations that would otherwise require paid cloud subscriptions (PDF → Word conversion) or multiple specialized tools. The web UI is clean and approachable for non-technical users; the API enables automation.
If you work with PDFs regularly — especially sensitive ones — Stirling-PDF is worth the 10-minute Docker setup. Your documents stay on your server, you control access, and you get an unlimited PDF toolkit with no recurring fees.
