TubeArchivist: Self-Hosted YouTube Archive and Media Server
YouTube videos disappear constantly: channels get banned, and creators delete content or simply stop maintaining their channels. If you've relied on a video tutorial, documentary, or series, you know the frustration of a dead link.
TubeArchivist is a self-hosted YouTube media server. It downloads videos from channels and playlists you choose, stores them locally, and gives you a clean web interface to browse and watch your archive — no ads, no algorithm, no disappearing content.
What TubeArchivist Does
TubeArchivist is not just a download manager. It's a full media server built around YouTube content:
- Channel subscriptions: Add channels and automatically archive new uploads on a schedule
- Playlist support: Download and track entire playlists as organized collections
- Full-text search: Searches video titles, descriptions, and auto-generated transcripts
- Metadata preservation: Stores thumbnails, descriptions, upload dates, and channel info
- SponsorBlock integration: Automatically marks or skips sponsored segments
- Progress tracking: Resume videos where you left off, track watch history
The backend uses yt-dlp for downloading and Elasticsearch for search indexing. The frontend is a React SPA with a clean, YouTube-inspired layout.
Hardware Requirements
TubeArchivist is resource-intensive because of Elasticsearch:
| Component | Minimum | Recommended |
|---|---|---|
| RAM | 4GB | 8GB+ |
| CPU | 2 cores | 4+ cores |
| Storage | Depends on library size | Plan for roughly 50–300GB per 100 videos |
As a rough guide, a 720p video averages ~500MB–1GB and a 1080p video 1–3GB, though actual size depends heavily on video length. If you're archiving hundreds of videos, plan storage accordingly. A TrueNAS array or mergerfs+SnapRAID setup pairs well with TubeArchivist.
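To turn those per-video figures into a disk budget, a small calculation helps. The sketch below simply multiplies the size ranges quoted above by a video count; the function name and the `SIZE_RANGES_GB` table are illustrative, and real sizes will vary with video length and codec.

```python
# Per-video size ranges in GB, taken from the rough figures quoted above.
SIZE_RANGES_GB = {
    "720p": (0.5, 1.0),
    "1080p": (1.0, 3.0),
}


def estimate_storage_gb(video_count: int, quality: str) -> tuple[float, float]:
    """Return a (low, high) storage estimate in GB for a number of videos."""
    if video_count < 0:
        raise ValueError("video_count must be non-negative")
    low, high = SIZE_RANGES_GB[quality]
    return video_count * low, video_count * high


for quality in SIZE_RANGES_GB:
    low, high = estimate_storage_gb(300, quality)
    print(f"300 videos at {quality}: {low:.0f}-{high:.0f} GB")
```

Under these assumptions, a 300-video archive lands somewhere between 150GB and 900GB depending on quality, which is worth knowing before you pick a disk.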
Installation with Docker Compose
TubeArchivist requires three containers: the app itself, Redis (for task queuing), and Elasticsearch (for search).
Create docker-compose.yml:
services:
  tubearchivist:
    image: bbilly1/tubearchivist:latest
    container_name: tubearchivist
    ports:
      - "8000:8000"
    environment:
      - ES_URL=http://archivist-es:9200
      - REDIS_HOST=archivist-redis
      - HOST_UID=1000
      - HOST_GID=1000
      - TA_USERNAME=admin
      - TA_PASSWORD=yourpassword
      - ELASTIC_PASSWORD=elasticpassword
      - TZ=America/Los_Angeles
    volumes:
      - /data/tubearchivist/media:/youtube
      - /data/tubearchivist/cache:/cache
    depends_on:
      - archivist-es
      - archivist-redis
    restart: unless-stopped

  archivist-redis:
    image: redis/redis-stack-server:latest
    container_name: archivist-redis
    volumes:
      - /data/tubearchivist/redis:/data
    restart: unless-stopped

  archivist-es:
    image: elasticsearch:8.13.0
    container_name: archivist-es
    environment:
      - ELASTIC_PASSWORD=elasticpassword
      - ES_JAVA_OPTS=-Xms512m -Xmx512m
      - xpack.security.enabled=true
      - discovery.type=single-node
      - path.repo=/usr/share/elasticsearch/data/snapshot
    volumes:
      - /data/tubearchivist/es:/usr/share/elasticsearch/data
    restart: unless-stopped
Create the directories and start:
mkdir -p /data/tubearchivist/{media,cache,redis,es}
docker compose up -d
The first startup takes a few minutes as Elasticsearch initializes. Access the UI at http://your-server:8000.
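Because the first start can take a while, it may be convenient to poll the UI rather than refresh the browser. The following is a minimal sketch using only the Python standard library; `http_probe` and `wait_until_ready` are hypothetical helper names, and `http://your-server:8000` is the same placeholder address used above.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status code.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: not up yet.
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False


# Example usage (placeholder host):
# ready = wait_until_ready(lambda: http_probe("http://your-server:8000"))
# print("TubeArchivist is up" if ready else "Not reachable; check the logs")
```

If the probe never succeeds, `docker compose logs` for the Elasticsearch container is usually the first place to look.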
Adding Channels and Playlists
Once logged in:
Add a channel: Click the + button and paste a YouTube channel URL. TubeArchivist will fetch channel metadata and optionally start downloading videos.
Set download settings: In Settings → Download, configure:
- Video format: `bestvideo[height<=1080]+bestaudio` for 1080p max
- Download schedule: Cron expression for automatic updates (e.g., `0 3 * * *` for 3am daily)
- Auto-delete: Remove videos older than N days if storage is limited
Queue a download: Go to a channel and click Download to queue all videos, or select specific ones.
Monitor progress: The Downloads tab shows active and queued tasks with progress indicators.
Download Quality Configuration
TubeArchivist passes format strings directly to yt-dlp. Common configurations:
# Best quality up to 1080p (recommended for storage efficiency)
bestvideo[height<=1080][ext=mp4]+bestaudio[ext=m4a]/best[height<=1080]
# 720p only (significantly smaller files)
bestvideo[height<=720][ext=mp4]+bestaudio[ext=m4a]/best[height<=720]
# Best available quality (4K if available, large files)
bestvideo+bestaudio/best
For most homelabs, 1080p is the sweet spot — watchable on any screen, reasonable storage consumption.
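The three selectors above follow one pattern: an optional height cap applied to the video stream, a preferred MP4 plus M4A pairing, and a fallback to the best single file. As an illustration only, here is a small helper that assembles them; `build_format` is a hypothetical name, not part of TubeArchivist or yt-dlp.

```python
from typing import Optional


def build_format(max_height: Optional[int] = None) -> str:
    """Build a yt-dlp format selector, optionally capped at a maximum height."""
    if max_height is None:
        # No cap: best video plus best audio, else the best single file.
        return "bestvideo+bestaudio/best"
    if max_height <= 0:
        raise ValueError("max_height must be positive")
    cap = f"[height<={max_height}]"
    # Prefer separate MP4 video and M4A audio streams; fall back to the best
    # pre-merged file under the same height cap.
    return f"bestvideo{cap}[ext=mp4]+bestaudio[ext=m4a]/best{cap}"


print(build_format(1080))
print(build_format(720))
print(build_format())
```

The part after the `/` matters: without a fallback, a video that has no matching MP4 and M4A streams would fail to download rather than settle for the next best file.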
Setting Up Automatic Updates
To keep your archive current without manual intervention:
- Go to Settings → Scheduling
- Set a Download schedule (when to check for and download new videos)
- Set an optional Rescan schedule (re-index existing files)
TubeArchivist will check each subscribed channel on schedule and queue new uploads automatically.
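To sanity-check a schedule before saving it, you can work out when it would next fire. The sketch below handles only the simplest case, a daily expression like the `0 3 * * *` example, and assumes standard five-field cron semantics; TubeArchivist's own schedule field may accept a different format, so check the hints on the settings page. `next_daily_run` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def next_daily_run(cron_expr: str, now: datetime) -> datetime:
    """Next run time for a five-field cron expression of the form 'M H * * *'."""
    fields = cron_expr.split()
    if len(fields) != 5 or fields[2:] != ["*", "*", "*"]:
        raise ValueError("only daily 'minute hour * * *' expressions are handled")
    minute, hour = int(fields[0]), int(fields[1])
    if not (0 <= minute <= 59 and 0 <= hour <= 23):
        raise ValueError("minute or hour out of range")
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # If today's slot has already passed (or is right now), use tomorrow's.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


print(next_daily_run("0 3 * * *", datetime(2024, 1, 1, 12, 0)))
```

Scheduling downloads for the small hours, as in this example, keeps the bandwidth and disk activity away from times when you are likely to be watching.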
SponsorBlock Integration
TubeArchivist can integrate with SponsorBlock to skip or mark sponsored segments:
- In Settings → Application, enable SponsorBlock
- Set which categories to skip: sponsor, selfpromo, interaction, intro, outro
- SponsorBlock data is fetched during download and stored with the video
In the player, marked segments are highlighted on the seek bar, and skipping is automatic if you've configured it.
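Conceptually, automatic skipping comes down to checking the current playback position against the stored segments. The following sketch illustrates that idea and is not TubeArchivist's actual player code; the `Segment` type and `resolve_position` function are hypothetical, and times are assumed to be in seconds.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    start: float  # seconds from the start of the video
    end: float
    category: str  # e.g. "sponsor", "selfpromo", "intro"


def resolve_position(position: float, segments: list[Segment], skip: set[str]) -> float:
    """Return where playback should continue from the given position."""
    # Walk segments in start order so back-to-back or overlapping ones chain.
    for seg in sorted(segments):
        if seg.category in skip and seg.start <= position < seg.end:
            position = seg.end
    return position


segments = [
    Segment(30, 90, "sponsor"),
    Segment(90, 100, "selfpromo"),
    Segment(200, 210, "intro"),
]
print(resolve_position(45, segments, {"sponsor", "selfpromo"}))
```

Categories left out of the `skip` set are untouched here, which mirrors the difference between marking a segment on the seek bar and actually jumping past it.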
Search and Transcripts
One of TubeArchivist's best features is full-text search across video content. It uses YouTube's auto-generated captions (when available) and stores them in Elasticsearch.
You can search for:
- Video titles and descriptions (always available)
- Spoken words in the video (requires captions)
- Channel names
This turns your archive into a searchable knowledge base — useful for technical tutorials where you remember what was said but not which video.
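For a sense of what a search across several text fields looks like at the Elasticsearch level, here is a sketch of a `multi_match` query body. The field names (`title`, `description`, `channel_name`, `subtitle_text`) are hypothetical stand-ins, since TubeArchivist's real index mappings are not covered here, and this is not necessarily how the application builds its own queries.

```python
import json


def build_search_body(text: str, size: int = 10) -> dict:
    """Build a multi_match query body for the Elasticsearch _search endpoint."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                # Hypothetical field names; "^3" boosts title matches.
                "fields": ["title^3", "description", "channel_name", "subtitle_text"],
            }
        },
    }


print(json.dumps(build_search_body("rebuild zfs pool"), indent=2))
```

Boosting the title field is a common choice so that a video named after the topic ranks above one that merely mentions it in passing.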
Exposing TubeArchivist Externally
For access outside your home network, put TubeArchivist behind a reverse proxy:
Caddy:
tubearchivist.yourdomain.com {
    reverse_proxy localhost:8000
}
Traefik:
labels:
  - "traefik.enable=true"
  - "traefik.http.routers.tubearchivist.rule=Host(`tubearchivist.yourdomain.com`)"
  - "traefik.http.routers.tubearchivist.entrypoints=websecure"
  - "traefik.http.services.tubearchivist.loadbalancer.server.port=8000"
Consider adding authentication (Authelia, Authentik, or Caddy basic auth) before exposing it publicly — TubeArchivist's built-in auth is minimal.
Storage Considerations
Videos accumulate quickly. A few strategies for managing storage:
- Auto-delete old videos: Set a maximum age or count per channel in download settings
- Per-channel quality settings: Archive important channels at 1080p, others at 720p
- External storage: Mount a NAS share or additional drives at `/data/tubearchivist/media`
- Compression: TubeArchivist downloads what YouTube has; you can transcode after download with a separate job, but this trades CPU for storage
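Before choosing an auto-delete age, it can help to see how much of the library a given cutoff would affect. The sketch below only lists files by modification time and deletes nothing; `files_older_than` is a hypothetical helper, and a file's modification time may not match the date the video was downloaded or published. Removing files by hand would leave the search index out of sync, so treat this as a report and let TubeArchivist do the actual deleting.

```python
import time
from pathlib import Path
from typing import Optional


def files_older_than(root: Path, max_age_days: float, now: Optional[float] = None) -> list[Path]:
    """List files under root whose modification time is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )


# Example usage against the media path from the compose file:
# old = files_older_than(Path("/data/tubearchivist/media"), max_age_days=180)
# total_gb = sum(p.stat().st_size for p in old) / 1e9
# print(f"{len(old)} files, {total_gb:.1f} GB older than 180 days")
```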
Alternatives
| Tool | Use Case |
|---|---|
| yt-dlp (CLI) | One-off downloads, scripting — no media server |
| Pinchflat | Lightweight self-hosted yt-dlp downloader that feeds an existing media server such as Plex or Jellyfin |
| Invidious | YouTube frontend proxy, no local storage |
| Grayjay | Multi-platform video aggregator |
TubeArchivist is the right choice when you want an ongoing, organized archive with a media server interface — not just ad-hoc downloading.
Summary
TubeArchivist fills a real need: a permanent, searchable local copy of YouTube content you care about. The setup is more involved than simpler self-hosted apps (Elasticsearch isn't lightweight), but the result is a proper media server for video content that won't disappear.
Start with a few channels you rely on for technical reference, confirm the setup works, then expand from there.
