Stirling-PDF Advanced Features: Beyond Basic PDF Tools
Most people discover Stirling-PDF for basic operations: merge, split, compress, convert. These work excellently, but Stirling-PDF has depth beyond the basics. This guide covers the less-obvious but powerful features.
Recap: What Stirling-PDF Is
Stirling-PDF is a self-hosted web application providing 50+ PDF operations. It runs entirely locally — no files leave your server. Operations include: merge, split, compress, rotate, convert (Office docs, images, HTML), OCR, watermark, crop, extract pages, reorder, annotate, sign, and more.
Docker Setup
If you haven't deployed it yet:
services:
  stirling-pdf:
    image: frooodle/s-pdf:latest
    container_name: stirling-pdf
    restart: unless-stopped
    ports:
      - 8080:8080
    volumes:
      - ./config:/configs
      - ./training-data:/usr/share/tessdata # OCR language data
      - ./customFiles:/customFiles
    environment:
      DOCKER_ENABLE_SECURITY: "false"
      INSTALL_BOOK_AND_ADVANCED_HTML_CONVERSION: "false"
Set DOCKER_ENABLE_SECURITY: "true" to enable login authentication if the instance is network-accessible.
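After starting the stack, the web UI can take a little while to answer. The following is a minimal sketch of a readiness check, assuming the 8080:8080 port mapping from the compose file; `wait_for_http` is a hypothetical helper name, not something Stirling-PDF ships.

```shell
# Poll a URL until it answers without an HTTP error, or give up.
# wait_for_http is a hypothetical helper, not part of Stirling-PDF.
wait_for_http() {
  url=$1
  tries=${2:-30}   # number of attempts
  delay=${3:-2}    # seconds to wait between attempts
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f: treat HTTP errors (>= 400) as failure, -s: quiet,
    # -o /dev/null: discard the response body
    if curl -fs -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (URL is an assumption based on the port mapping above):
#   docker compose up -d
#   wait_for_http http://localhost:8080 && echo "Stirling-PDF is up"
```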
OCR: Making PDFs Searchable
OCR converts scanned images (PDFs that are essentially photos of documents) into searchable and selectable text.
Navigate to: Convert → PDF to PDF/OCR (or Operations → OCR)
Options:
- Language: Select the document's language (requires tessdata for that language — see below)
- OCR Type: "Add text layer" (keeps original image, adds hidden searchable text) vs "OCR only" (replaces pages with text)
- Deskew: Correct slight rotation in scanned documents
- Clean: Basic image cleanup before OCR
Installing additional languages:
OCR uses Tesseract. The Docker image includes English by default. For other languages:
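# Run inside the container, e.g. after: docker exec -it stirling-pdf sh
# (apt-get assumes a Debian-based image; packages are lost when the container is recreated)
apt-get update
apt-get install -y tesseract-ocr-deu # German
apt-get install -y tesseract-ocr-fra # French
apt-get install -y tesseract-ocr-spa # Spanish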
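Or mount tessdata files directly, using the ./training-data volume from the compose file:
# Download a language file from the tesseract-ocr/tessdata repository on GitHub
# straight into the mounted ./training-data/ directory
wget -P ./training-data/ https://github.com/tesseract-ocr/tessdata/raw/main/deu.traineddata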
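To fetch several languages in one go, the download step can be wrapped in a small loop. This is a sketch only: the URL pattern mirrors the wget example above, and `tessdata_url` and `fetch_tessdata` are hypothetical helper names.

```shell
# Build the download URL for one Tesseract language file.
# The pattern follows the wget example above.
tessdata_url() {
  printf 'https://github.com/tesseract-ocr/tessdata/raw/main/%s.traineddata\n' "$1"
}

# Download each requested language into the host directory that the
# compose file mounts at /usr/share/tessdata.
fetch_tessdata() {
  dest=$1
  shift
  mkdir -p "$dest" || return 1
  for lang in "$@"; do
    # -q: quiet, -O: write to exactly this file name
    wget -q -O "$dest/$lang.traineddata" "$(tessdata_url "$lang")" || return 1
  done
}

# Usage:
#   fetch_tessdata ./training-data eng deu fra spa
```

The usage line includes `eng` on purpose: a bind mount replaces the contents of the container's directory, so an empty ./training-data folder would hide the English data bundled with the image. A container restart may be needed before new languages show up in the UI.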
PDF/A Conversion for Long-Term Archiving
PDF/A is an ISO standard for long-term document preservation. It embeds all fonts, prohibits JavaScript and external references, and ensures the document renders identically regardless of software.
Navigate to: Convert → PDF to PDF/A
Use cases: archiving contracts, legal documents, medical records, government forms — anything you need to be readable in 20-30 years.
PDF/A-2B is the most common target format; PDF/A-3B is newer and supports attachments.
Flatten Forms and Annotations
PDFs with form fields can have their fields flattened — converting interactive fields to static text. This creates a non-editable version that looks like a filled form.
Navigate to: Operations → Flatten (Annotations/Form Fields)
Useful for: submitting completed forms via email where you want them non-editable, archiving filled forms, sending finalized documents.
Redact (Remove) Sensitive Content
Stirling-PDF supports redacting specific text from a PDF. Drawing a black box over text in a PDF viewer only hides it: the original text stays in the file and can still be selected or extracted. Proper redaction removes the underlying text data.
Navigate to: Operations → Remove Content (Redact)
Enter the text strings to redact. All instances are removed from the document. This is important for removing SSNs, account numbers, or other sensitive data before sharing documents.
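It is worth confirming that the redacted strings are really gone before sharing a file. The sketch below extracts the text layer with `pdftotext` (from poppler-utils, assumed to be installed) and reports any search string that still appears; `find_leaks` and `check_redacted_pdf` are hypothetical helper names.

```shell
# Print every search string that still occurs in the text read from stdin.
# Returns 0 when nothing was found, 1 when at least one string is present.
find_leaks() {
  text=$(cat)
  leaked=0
  for needle in "$@"; do
    # -F: match a fixed string, not a regex; -q: report via exit status only
    if printf '%s\n' "$text" | grep -Fq -- "$needle"; then
      printf 'still present: %s\n' "$needle"
      leaked=1
    fi
  done
  return "$leaked"
}

# Extract the text layer of a PDF and scan it for the given strings.
check_redacted_pdf() {
  pdf=$1
  shift
  pdftotext "$pdf" - | find_leaks "$@"
}

# Usage (file name and value are placeholders):
#   check_redacted_pdf redacted.pdf '123-45-6789' && echo "no matches found"
```

This only inspects extractable text. It does not look at document metadata or at text that exists purely as pixels in a scanned image.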
Split by Content
Beyond basic page splitting, Stirling-PDF can split a PDF:
- By page number ranges: Pages 1-5, 6-10, etc.
- By each page: Every page becomes a separate PDF
- By size: Split when accumulated pages exceed a file size
- By chapter/bookmark: Uses the document's bookmark structure
Navigate to: Organize → Split by Pages or Split PDF
Compress with Quality Control
PDF compression has several modes:
Navigate to: Transform → Compress PDF
Options:
- Low/Medium/High compression: Tradeoff between file size and quality
- DPI reduction: Downsample embedded images to a target resolution
- Chaining: Run compression before or after other operations as part of a sequence
For scanned PDFs with high-resolution images, compressing at 150 DPI is often sufficient for screen viewing and dramatically reduces file size.
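The size reduction follows from simple arithmetic: pixel count grows with the square of the DPI. As a rough illustration, assuming a US Letter page (8.5 x 11 inches), with `page_pixels` as a hypothetical helper:

```shell
# Pixel count of one US Letter page (8.5 x 11 in) at a given DPI.
# Integer arithmetic only; page_pixels is a hypothetical helper.
page_pixels() {
  dpi=$1
  width=$((dpi * 85 / 10))   # 8.5 inches wide
  height=$((dpi * 11))       # 11 inches tall
  echo $((width * height))
}

page_pixels 300   # 2550 x 3300 = 8415000
page_pixels 150   # 1275 x 1650 = 2103750
echo $(( $(page_pixels 300) / $(page_pixels 150) ))   # 4
```

Halving the DPI leaves a quarter of the pixels. The final file size will not shrink by exactly that factor, since it also depends on image content and the compression codec.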
API Access for Automation
Stirling-PDF has a full REST API. This enables automation:
# Merge multiple PDFs via API
curl -X POST \
  -F 'fileInput=@file1.pdf' \
  -F 'fileInput=@file2.pdf' \
  http://your-server:8080/api/v1/general/merge-pdfs \
  --output merged.pdf
# Compress a PDF
curl -X POST \
  -F 'fileInput=@input.pdf' \
  -F 'optimizeLevel=3' \
  http://your-server:8080/api/v1/misc/compress-pdf \
  --output compressed.pdf
# Run OCR on a scanned PDF
curl -X POST \
  -F 'fileInput=@scanned.pdf' \
  -F 'languages=eng' \
  http://your-server:8080/api/v1/misc/add-ocr-pdf \
  --output searchable.pdf
The Swagger documentation for all API endpoints is available at http://your-server:8080/swagger-ui/index.html.
Automation with n8n or workflows: Chain Stirling-PDF API calls in n8n or shell scripts to build document processing pipelines — automatically OCR incoming PDFs, compress files over a size threshold, or batch convert Office documents.
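As one possible shape for such a pipeline, here is a sketch that compresses every PDF in a folder above a size threshold. The endpoint and `optimizeLevel` come from the compress example above; the `fileInput` form field name is an assumption to confirm in the Swagger UI, and `STIRLING_URL`, `file_size`, `needs_compression`, and `compress_large_pdfs` are hypothetical names.

```shell
# Base URL of the instance; override via the environment.
STIRLING_URL=${STIRLING_URL:-http://your-server:8080}

# Size of a file in bytes (wc -c works on both Linux and macOS).
file_size() {
  wc -c < "$1" | tr -d ' '
}

# Succeeds when the file is strictly larger than the threshold in bytes.
needs_compression() {
  [ "$(file_size "$1")" -gt "$2" ]
}

# Compress each oversized PDF in a directory via the API.
compress_large_pdfs() {
  dir=$1
  threshold=$2
  for pdf in "$dir"/*.pdf; do
    [ -f "$pdf" ] || continue                        # glob matched nothing
    case $pdf in *.compressed.pdf) continue ;; esac  # skip earlier output
    needs_compression "$pdf" "$threshold" || continue
    out="${pdf%.pdf}.compressed.pdf"
    # fileInput is an assumed field name; verify it in the Swagger UI
    curl -fsS -X POST \
      -F "fileInput=@$pdf" \
      -F 'optimizeLevel=3' \
      "$STIRLING_URL/api/v1/misc/compress-pdf" \
      --output "$out" || echo "failed: $pdf" >&2
  done
}

# Usage: compress everything over 5 MB in ./inbox
#   compress_large_pdfs ./inbox 5000000
```

Writing to a separate `.compressed.pdf` file keeps the originals untouched, which makes it easy to compare sizes before deleting anything.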
Security Configuration
If your Stirling-PDF instance is accessible beyond your localhost:
environment:
  DOCKER_ENABLE_SECURITY: "true"
  SECURITY_ENABLELOGIN: "true"
  SECURITY_INITIALLOGIN_USERNAME: admin
  SECURITY_INITIALLOGIN_PASSWORD: change-this
With security enabled, users must log in. Admin accounts can create additional users.
For team use, restrict access by IP range at the reverse proxy level rather than relying solely on application-level auth.
Integrating with Paperless-NGX
If you use Paperless-NGX for document management, Stirling-PDF can preprocess documents before ingestion:
- Use Stirling-PDF to OCR scanned PDFs (Paperless will use the text layer for search)
- Compress large PDFs before adding to Paperless to reduce storage
- Merge related documents before they're filed
You can script this with Stirling-PDF's API and Paperless's watch folder.
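A minimal sketch of that idea follows: OCR a scan through the API, then move the result into the folder Paperless-NGX watches. The endpoint and `languages` field are taken from the OCR example earlier in this guide; the `fileInput` field name, `STIRLING_URL`, `CONSUME_DIR`, and both function names are assumptions for illustration.

```shell
STIRLING_URL=${STIRLING_URL:-http://your-server:8080}
# Hypothetical path: point this at your Paperless-NGX consume folder.
CONSUME_DIR=${CONSUME_DIR:-/path/to/paperless/consume}

# Where the processed copy of an input file should land.
consume_path() {
  printf '%s/%s\n' "$CONSUME_DIR" "$(basename "$1")"
}

# OCR one PDF and hand the result to Paperless-NGX.
ocr_into_paperless() {
  src=$1
  tmp=$(mktemp) || return 1
  # Write to a temp file first so a failed request never leaves a
  # broken file in the consume folder.
  # fileInput is an assumed field name; verify it in the Swagger UI.
  if curl -fsS -X POST \
       -F "fileInput=@$src" \
       -F 'languages=eng' \
       "$STIRLING_URL/api/v1/misc/add-ocr-pdf" \
       --output "$tmp"; then
    mv "$tmp" "$(consume_path "$src")"
  else
    rm -f "$tmp"
    echo "OCR failed: $src" >&2
    return 1
  fi
}

# Usage:
#   for f in ./scans/*.pdf; do ocr_into_paperless "$f"; done
```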
Custom Branding (For Organization/Team Use)
Stirling-PDF supports custom branding:
environment:
  APP_NAME: "Your Org PDF Tools"
  HOME_PAGE_DISPLAYED: "true"
Mount custom CSS or logo in ./customFiles/static/ to override the default appearance.
The project is at Frooodle/Stirling-PDF with active development. The API documentation and feature list expand regularly.
