Backup strategies for self-hosted services in a privacy-first era
A practical deep dive into backup strategies for self-hosted services — real examples, comparisons, and setup guides.
If you’ve read the latest on UK surveillance, you already know privacy isn’t a given. In a world where AI stacks get centralized by big players and “privacy by design” often feels aspirational, your self-hosted services are a lifeboat for your data, and a single point of failure if you don’t protect them properly. A backup strategy isn’t a luxury; it’s what keeps your home lab usable when the power goes out, a disk dies, or ransomware hits. This piece is practical, anchored in today’s news context, and aimed at folks who run Nextcloud, Home Assistant, media servers, and lightweight VMs on a Raspberry Pi or a compact NAS.
Why this matters now
Two threads from the news cycle matter for backup thinking:
- Privacy and surveillance concerns: The “Surveillance is not safety” callouts emphasize that data governance is fragile. If you’re storing personal data on self-hosted services, you’re protecting more than uptime; you’re protecting privacy itself. Backups are extra copies of that data, so they need encryption and access control of their own. Done right, they let you restore a clean, known state after a breach or failure without creating a new leak.
- AI architectures and data flows: Modern AI capabilities (local caches, connected services, and in some cases AI-assisted automation) push you to rethink data footprints. Backup strategy becomes not just about data loss prevention but about maintaining control over data lineage, encryption keys, and configuration states that affect how your stack behaves when you restore.
Bottom line: your backup plan should be as privacy-conscious and resilient as your services. It should cover data, configs, secrets, and the ability to restore quickly, even if the primary environment is compromised or the cloud you rely on changes its terms.
1) What a solid backup plan looks like
I’ll break this into practical pieces you can implement in weeks, not months. It boils down to three core ideas:
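- 3-2-1-0 (and why 0 matters): Keep at least three copies of each critical data item, on two different media, with one copy off-site or air-gapped. Add 0: periodic offline, immutable backups to guard against ransomware and corruption. (You’ll also see this written as 3-2-1-1-0, where the extra 1 is the offline or immutable copy and the 0 stands for zero errors when you verify a restore.)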
- Data classification and scope: Don’t back up every log forever. Prioritize databases, config and secrets, user data, media/assets, and logs that are hard to reconstruct. Don’t forget infrastructure-as-code and secrets.
- Verification and restore testing: Backups aren’t a checkbox. You must verify integrity and perform restores regularly to prove they actually work.
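To make the first rule concrete, here is a minimal sketch of a 3-2-1-0 self-check in bash. The `check_3210` function and its `name:medium:location:immutable` argument format are made up for illustration; adapt them to however you track your copies.

```bash
#!/usr/bin/env bash
# Sketch: check a list of backup copies against the 3-2-1-0 rule.
# Each argument is "name:medium:location:immutable" (a hypothetical format),
# e.g. "b2:object:offsite:no".
check_3210() {
  local copies=0 media_count=0 offsite=0 immutable=0
  local media_seen=" " entry medium location imm
  for entry in "$@"; do
    # Split the entry on ":"; the name field is not needed for counting.
    IFS=: read -r _ medium location imm <<<"$entry"
    copies=$((copies + 1))
    # Count each distinct medium once.
    case "$media_seen" in
      *" $medium "*) ;;
      *) media_seen+="$medium "; media_count=$((media_count + 1)) ;;
    esac
    if [[ $location == offsite ]]; then offsite=$((offsite + 1)); fi
    if [[ $imm == yes ]]; then immutable=$((immutable + 1)); fi
  done

  local status=0
  if (( copies < 3 )); then echo "FAIL: fewer than 3 copies"; status=1; fi
  if (( media_count < 2 )); then echo "FAIL: fewer than 2 media types"; status=1; fi
  if (( offsite < 1 )); then echo "FAIL: no off-site copy"; status=1; fi
  if (( immutable < 1 )); then echo "FAIL: no offline/immutable copy"; status=1; fi
  if (( status == 0 )); then echo "OK"; fi
  return "$status"
}
```

For example, `check_3210 nas:hdd:local:no b2:object:offsite:no usb:hdd:local:yes` prints `OK`, while a single NAS copy fails on every count.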
2) What to back up (and in what order)
A typical self-hosted stack might include:
- Data plane
  - Databases (PostgreSQL, MySQL/MariaDB, SQLite in some apps)
  - User data and media (Nextcloud disks, Plex / Jellyfin libraries, media caches)
  - Application data and datastore volumes (Docker volumes, Kubernetes persistent volumes)
- Config plane
  - Infrastructure as code (Git repos with Ansible, Terraform, Kubernetes manifests)
  - Secrets (env files, Vault/Sops keys, Kubernetes Secrets)
  - TLS/SSL certs (LE/ACME, PKI materials)
- Runtime and state
  - Application state (home automation histories, calendar data, task state)
  - Logs and analytics you want to preserve for auditing
- Optional but smart
  - Email and message stores (Dovecot, Mailpile, Mailcow)
  - Cronjobs and backup scripts themselves
3) Storage targets and architectural patterns
- Local + off-site
  - Keep primary backups on a fast NAS or USB SSD for quick restores, plus an off-site copy (cloud or off-site hardware) for resiliency.
- Air-gapped/offline
  - Periodically rotate a physical drive that is disconnected from the network. This is your “zero” copy to survive ransomware and supply-chain risks.
- Immutable/cloud
  - Use cloud storage with object-lock / WORM features (e.g., S3 Object Lock, Wasabi immutability) for data that must survive accidental deletion or corruption.
- Snapshots where relevant
  - If you’re running on ZFS/Btrfs or similar, use filesystem snapshots as fast, recoverable state points. They’re not a substitute for the 3-2-1 rule, but they speed recovery of misconfigurations and accidental modifications.
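If your data sits on ZFS, the snapshot step can be as small as the sketch below. `snapshot_dataset`, the `backup-` prefix, and the `tank/nextcloud` dataset name are assumptions for illustration; `zfs snapshot` itself is the standard command. The function defaults to a dry run so you can see what would happen before anything changes.

```bash
#!/usr/bin/env bash
# Sketch: take a dated ZFS snapshot before a risky change.
# DRY_RUN=1 (the default here) prints the command instead of running it.
snapshot_dataset() {
  local dataset="$1"
  # Optional second argument overrides the timestamp (handy for testing).
  local stamp="${2:-$(date +%F-%H%M)}"
  local cmd=(zfs snapshot "${dataset}@backup-${stamp}")
  if [[ "${DRY_RUN:-1}" == 1 ]]; then
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}
```

Calling `snapshot_dataset tank/nextcloud` prints the command it would run; set `DRY_RUN=0` to actually take the snapshot. Remember that the snapshot lives on the same pool as the data, so it still needs to be paired with a copy somewhere else.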
4) A practical plan you can implement this month
A concrete, actionable plan with a realistic stack (Nextcloud + MariaDB, Home Assistant, and a small VM lab) looks like this:
- Inventory and map
  - List all services and which data directories/databases they own.
  - Decide which assets require long-term retention (e.g., Nextcloud files, app config, DB).
  - Decide backup frequency per asset: critical (hourly), important (daily), archival (weekly/monthly).
- Choose targets
  - Primary backup: local NAS or USB HDDs (RAID is not a substitute for backups, but it helps uptime).
  - Off-site: cloud bucket (S3-ish), or an off-site NAS you can reach via VPN.
  - Offline: rotate a write-once drive monthly.
- Implement mechanics
  - Database backups: point-in-time or full backups depending on RPO. Frequency high for busy apps, daily otherwise.
  - File backups: incremental, with dedup and compression.
  - Secrets/config: encrypted backups, plus a separate secure location for keys.
- Verification and restore drills
  - Monthly test restore of a full service, including database restore and application startup in a sandbox.
  - Verification: checksums and a quick smoke test (login, basic read of data, file integrity).
- Monitoring and alerting
  - Alert on backup failures, space exhaustion, drift between intended vs. actual backups, and verification failures.
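The checksum and alerting pieces of that plan can be sketched in a few lines of bash. This is one possible shape, not a finished tool: the function names are hypothetical, and it assumes GNU coreutils and findutils (`sha256sum`, `stat -c`, `sort -z`, `xargs -r`) as found on most Linux systems. Keep the manifest file outside the tree it describes.

```bash
#!/usr/bin/env bash
# Sketch: checksum-based verification plus a staleness check for alerting.

# Write SHA-256 checksums for every file under DIR to MANIFEST.
# Paths are stored relative to DIR so the manifest also works on a restore.
make_manifest() {
  local dir="$1" manifest="$2"
  ( cd "$dir" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$manifest"
}

# Re-check DIR against MANIFEST; returns non-zero on any mismatch.
verify_manifest() {
  local dir="$1" manifest="$2"
  # The redirect is opened before the cd, so a relative manifest path works.
  ( cd "$dir" && sha256sum --quiet -c - ) < "$manifest"
}

# Succeed only if FILE was modified within the last MAX_HOURS hours.
# Useful as the condition for an "alert if the last backup is too old" job.
is_fresh() {
  local file="$1" max_hours="$2" now mtime
  now=$(date +%s)
  mtime=$(stat -c %Y "$file") || return 1
  (( now - mtime <= max_hours * 3600 ))
}
```

A restore drill would run `make_manifest` on the live data, restore into a sandbox, then run `verify_manifest` against the restored tree. For alerting, something like `is_fresh /mnt/backup/last-success 26 || notify` covers a daily job with some slack; the marker file path and the `notify` command are placeholders for whatever you use.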
5) A concrete example you can copy
Below is a practical example you can adapt. It dumps a PostgreSQL database and backs up several file trees with restic, which keeps incremental, encrypted backups in the repository you point it at.
Code: backup.sh (bash)
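```bash
#!/usr/bin/env bash
set -euo pipefail
umask 077  # keep the plaintext dump private
# CONFIG
BASE="/mnt/backup"
DATE=$(date +%F-%H-%M)
PGBU="/tmp/pg_all.sql"
RESTIC_REPO="${RESTIC_REPO:-s3:https://s3.amazonaws.com/mybucket/restic}"
# Fail fast instead of falling back to a guessable default password.
RESTIC_PASSWORD="${RESTIC_PASSWORD:?set RESTIC_PASSWORD in the environment}"
DOCKER_VOLUMES="/srv/nextcloud /srv/homeassistant /srv/media"
PGUSER="${PGUSER:-postgres}"
PGPASS="${PGPASS:-}"
PGHOST="${PGHOST:-localhost}"
# GPG_REC (definition truncated)

# Assumed remaining steps, following the description above: dump, then back up.
export RESTIC_REPOSITORY="$RESTIC_REPO" RESTIC_PASSWORD PGPASSWORD="$PGPASS"
trap 'rm -f "$PGBU"' EXIT  # remove the plaintext dump when the script exits
pg_dumpall -h "$PGHOST" -U "$PGUSER" > "$PGBU"
# DOCKER_VOLUMES is left unquoted on purpose so it splits into separate paths.
restic backup --tag "$DATE" "$PGBU" $DOCKER_VOLUMES
restic check  # verify repository integrity after the run
```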