EZ-Homelab/config-templates/dokuwiki/data/pages/architecture/backup.txt
kelinfoxy bcd20102ae Wiki v1.0
Added a wiki
2026-01-20 19:32:57 -05:00


====== Backup Strategy ======
EZ-Homelab implements a backup strategy designed for data protection, disaster recovery, and business continuity.
===== Backup Principles =====
**3-2-1 Rule:**
* **3 Copies**: Original + 2 backups
* **2 Media Types**: Different storage technologies
* **1 Offsite**: Geographic separation
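The rule can be checked mechanically against an inventory of copies. The following is a minimal sketch; the `BackupCopy` class and the example inventory are hypothetical and only illustrate the three conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str      # hypothetical label, e.g. "external-drive"
    media: str     # storage technology, e.g. "ssd", "hdd", "cloud"
    offsite: bool  # True if the copy lives at a different location


def satisfies_3_2_1(copies):
    """Return True if the copies meet the 3-2-1 rule."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Hypothetical inventory: the original plus two backups.
copies = [
    BackupCopy("original", "ssd", False),
    BackupCopy("external-drive", "hdd", False),
    BackupCopy("cloud-bucket", "cloud", True),
]
print(satisfies_3_2_1(copies))
```

Dropping the cloud copy from this inventory fails the check twice over: only two copies remain and none is offsite.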
**Recovery Objectives:**
* **RTO (Recovery Time Objective)**: Time to restore service
* **RPO (Recovery Point Objective)**: Maximum data loss acceptable
* **RTO Target**: < 1 hour for critical services, < 4 hours for important services
* **RPO Target**: < 1 hour for critical data
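An RPO target translates into a simple age check on the newest backup. This sketch assumes timestamps are available as `datetime` values; the function name `rpo_met` is illustrative.

```python
from datetime import datetime, timedelta

RPO_TARGET = timedelta(hours=1)  # target for critical data, from the list above


def rpo_met(last_backup, now, rpo=RPO_TARGET):
    """True if the newest backup is younger than the RPO target."""
    return now - last_backup < rpo


now = datetime(2026, 1, 20, 12, 0)
print(rpo_met(datetime(2026, 1, 20, 11, 30), now))  # backup is 30 minutes old
print(rpo_met(datetime(2026, 1, 20, 9, 0), now))    # backup is 3 hours old
```

In the worst case the newest backup is one full interval old, so only data backed up at least hourly can meet a 1 hour RPO.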
**Backup Types:**
* **Full**: Complete system backup
* **Incremental**: Changes since last backup
* **Differential**: Changes since last full backup
* **Snapshot**: Point-in-time copy
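The difference between incremental and differential backups is only the reference point used to decide which files changed. The sketch below selects files by modification time; real tools track changes differently (for example by content hash), so treat this as an illustration of the definitions rather than of any tool's behavior.

```python
def files_to_back_up(mtimes, last_full, last_backup, mode):
    """Select files by modification time.

    mtimes: dict of path -> modification timestamp (any comparable number)
    last_full: timestamp of the last full backup
    last_backup: timestamp of the most recent backup of any type
    mode: "full", "incremental", or "differential"
    """
    if mode == "full":
        return sorted(mtimes)
    if mode == "incremental":
        cutoff = last_backup
    elif mode == "differential":
        cutoff = last_full
    else:
        raise ValueError(f"unknown mode: {mode}")
    return sorted(path for path, t in mtimes.items() if t > cutoff)


# Hypothetical timeline: full backup at t=10, latest incremental at t=20.
mtimes = {"a.txt": 5, "b.txt": 15, "c.txt": 25}
print(files_to_back_up(mtimes, 10, 20, "incremental"))
print(files_to_back_up(mtimes, 10, 20, "differential"))
```

The incremental run picks up only `c.txt`, while the differential run also re-copies `b.txt` because it changed after the last full backup.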
===== Backup Architecture =====
**Primary Backup Solution (Backrest):**
```yaml
services:
  backrest:
    image: garethgeorge/backrest:latest
    volumes:
      - ./config/backrest:/config
      - /mnt/backups:/backups
      - /opt/stacks:/opt/stacks:ro
      - /mnt:/mnt:ro
    environment:
      - BACKREST_CONFIG=/config/config.yml
      - BACKREST_SCHEDULE=0 2 * * *
```
**Alternative Solution (Duplicati):**
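```yaml
services:
  duplicati:
    image: lscr.io/linuxserver/duplicati:latest
    volumes:
      - duplicati-config:/config
      - duplicati-source:/source:ro
      - duplicati-backup:/backup

# Named volumes must be declared at the top level
volumes:
  duplicati-config:
  duplicati-source:
  duplicati-backup:
```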
===== Backup Categories =====
**Configuration Backups:**
* **Source**: `/opt/stacks/*/`
* **Frequency**: Daily
* **Retention**: 30 days
* **Type**: Incremental
* **Critical**: Yes (service definitions)
**User Data Backups:**
* **Source**: `/mnt/media/`, `/mnt/nextcloud/`
* **Frequency**: Daily
* **Retention**: 90 days
* **Type**: Incremental
* **Critical**: Yes (user files)
**Database Backups:**
* **Source**: Named Docker volumes
* **Frequency**: Hourly
* **Retention**: 7 days
* **Type**: Snapshot
* **Critical**: Yes (application data)
**SSL Certificate Backups:**
* **Source**: `/opt/stacks/core/traefik/acme.json`
* **Frequency**: After renewal
* **Retention**: 1 year
* **Type**: Full
* **Critical**: Yes (HTTPS access)
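The retention periods above can be expressed as a lookup table plus a cutoff date. This is a sketch only: the category keys are made-up identifiers, and "1 year" is approximated as 365 days.

```python
from datetime import date, timedelta

# Retention per category, in days (365 is an assumed stand-in for "1 year").
RETENTION_DAYS = {
    "configuration": 30,
    "user-data": 90,
    "database": 7,
    "ssl-certificates": 365,
}


def expired(category, backup_dates, today):
    """Return the backup dates older than the category's retention period."""
    cutoff = today - timedelta(days=RETENTION_DAYS[category])
    return [d for d in backup_dates if d < cutoff]


today = date(2026, 1, 20)
dates = [date(2026, 1, 19), date(2026, 1, 10), date(2025, 12, 1)]
print(expired("database", dates, today))
print(expired("configuration", dates, today))
```

With the same backup dates, the 7 day database policy expires two backups while the 30 day configuration policy expires only the oldest.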
===== Backup Configuration =====
**Backrest Configuration:**
```yaml
version: 1
schedule: "0 2 * * *"  # Daily at 2 AM
repositories:
  - path: "/backups/local"
    retention: "30d"
  - path: "/backups/remote"
    retention: "90d"
backups:
  - name: "stacks-config"
    paths:
      - "/opt/stacks"
    exclude:
      - "*.log"
      - "*/cache/*"
  - name: "user-data"
    paths:
      - "/mnt/media"
      - "/mnt/nextcloud"
    exclude:
      - "*/temp/*"
      - "*/cache/*"
```
**Duplicati Configuration:**
* **Source**: Local directories
* **Destination**: Local/network/cloud storage
* **Encryption**: AES-256
* **Compression**: ZIP
* **Deduplication**: Block-level
===== Storage Destinations =====
**Local Storage:**
* **Path**: `/mnt/backups/`
* **Type**: External HDD/SSD
* **Encryption**: Filesystem level
* **Access**: Direct mount
**Network Storage:**
* **Protocol**: NFS/SMB/CIFS
* **Location**: NAS device
* **Redundancy**: RAID protection
* **Security**: VPN access
**Cloud Storage:**
* **Providers**: AWS S3, Backblaze B2, Google Cloud
* **Encryption**: Client-side
* **Cost**: Pay for storage used
* **Access**: Internet connection
**Offsite Storage:**
* **Location**: Different geographic location
* **Transport**: Encrypted drives
* **Frequency**: Weekly rotation
* **Security**: Physical security
===== Encryption & Security =====
**Encryption Methods:**
* **Symmetric**: AES-256-GCM
* **Asymmetric**: RSA key pairs
* **Key Management**: Secure key storage
* **Key Rotation**: Regular key updates
**Security Measures:**
* **Access Control**: Restricted backup access
* **Network Security**: VPN for remote backups
* **Integrity Checks**: SHA-256 verification
* **Audit Logging**: Backup operation logs
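A SHA-256 integrity check compares a file's current digest with one recorded when the backup was written. The sketch below uses Python's standard `hashlib`; the function names are illustrative, and where the expected digest is stored is left open.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """True if the file's SHA-256 digest matches the recorded value."""
    return sha256_of(path) == expected_hex.lower()
```

A typical use is to record `sha256_of(archive)` right after a backup completes and call `verify(archive, recorded)` before relying on that archive for a restore.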
===== Automation & Scheduling =====
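**Cron Schedules** (illustrative: Backrest normally schedules its plans from the web UI, so treat the `backrest` subcommands below as placeholders for your own backup command):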
```bash
# Daily backups at 2 AM
0 2 * * * /usr/local/bin/backrest backup
# Weekly full backup on Sunday
0 3 * * 0 /usr/local/bin/backrest backup --full
# Monthly archive
0 4 1 * * /usr/local/bin/backrest archive
```
**Monitoring:**
* **Success/Failure**: Email notifications
* **Size Tracking**: Storage usage monitoring
* **Performance**: Backup duration tracking
* **Health Checks**: Integrity verification
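Success rate and duration tracking reduce to a small summary over recorded runs. This sketch assumes each run is logged as a `(succeeded, duration_seconds)` tuple, which is a hypothetical format rather than anything Backrest or Duplicati emits.

```python
from statistics import mean


def summarize(runs):
    """Summarize backup runs given as (succeeded, duration_seconds) tuples."""
    if not runs:
        return {"success_rate": None, "mean_duration": None}
    ok_durations = [duration for succeeded, duration in runs if succeeded]
    return {
        "success_rate": len(ok_durations) / len(runs),
        # Failed runs are excluded so early aborts do not skew the average.
        "mean_duration": mean(ok_durations) if ok_durations else None,
    }


runs = [(True, 120.0), (True, 180.0), (False, 15.0), (True, 150.0)]
print(summarize(runs))
```

A notification hook could then alert when `success_rate` drops below a chosen threshold or `mean_duration` grows well beyond its usual value.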
===== Recovery Procedures =====
**File-Level Recovery:**
```bash
# List snapshots
restic snapshots
# Restore specific file (--include selects what to restore; --path only filters snapshots)
restic restore latest --target /tmp/restore --include /opt/stacks/config.yml
# Restore to original location
restic restore latest --target / --include /opt/stacks
```
**Volume Recovery:**
```bash
# Stop service
docker compose down
# Restore volume (assumes the archive was created relative to the volume root)
docker run --rm -v restored-volume:/data -v /backups:/backup:ro busybox tar xzf /backup/volume.tar.gz -C /data
# Restart service
docker compose up -d
```
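Whether a volume archive unpacks into the right place depends on how its member paths were recorded. The sketch below shows one consistent convention using Python's standard `tarfile`: members are stored relative to the volume root, so the archive can be extracted into any target directory. The function names are illustrative.

```python
import tarfile
from pathlib import Path


def archive_dir(src, archive):
    """Create a gzip tarball whose members are relative to src."""
    with tarfile.open(archive, "w:gz") as tar:
        # arcname="." stores "./file" instead of the absolute source path.
        tar.add(src, arcname=".")


def restore_dir(archive, dest):
    """Extract a tarball created by archive_dir into dest."""
    Path(dest).mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        # Only extract archives you created yourself; tar members are trusted here.
        tar.extractall(dest)
```

Keeping the archive outside the directory being archived avoids the tarball trying to include itself.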
**System Recovery:**
1. **Boot from installation media**
2. **Restore base system**
3. **Install Docker**
4. **Restore configurations**
5. **Restore user data**
6. **Verify services**
===== Testing & Validation =====
**Regular Testing:**
* **Monthly**: File restoration tests
* **Quarterly**: Volume recovery tests
* **Annually**: Full system recovery
* **After Changes**: Configuration updates
**Validation Checks:**
```bash
# Verify backup integrity
restic check
# List backup contents
restic ls latest
# Compare entry counts (approximate: restic ls also lists directories)
find /original | wc -l
restic ls latest | wc -l
```
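Matching counts do not show which files are missing. After a test restore to a scratch directory, comparing relative paths gives a more specific answer. This is a minimal sketch with illustrative function names; it compares names only, not contents.

```python
from pathlib import Path


def relative_files(root):
    """Return the set of regular-file paths under root, relative to root."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def missing_after_restore(original, restored):
    """List files present in the original tree but absent from the restore."""
    return sorted(relative_files(original) - relative_files(restored))
```

An empty list from `missing_after_restore("/original", "/tmp/restore/original")` means every source file has a counterpart in the restored tree.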
**Performance Monitoring:**
* **Backup Duration**: Track completion times
* **Success Rate**: Monitor failure rates
* **Storage Growth**: Track backup size trends
* **Recovery Time**: Measure restoration speed
===== Disaster Recovery =====
**Disaster Scenarios:**
* **Hardware Failure**: Drive/server replacement
* **Data Corruption**: File system damage
* **Cyber Attack**: Ransomware recovery
* **Site Disaster**: Complete site loss
**Recovery Strategies:**
* **Cold Standby**: Pre-configured backup server
* **Cloud Recovery**: Infrastructure as Code
* **Data Center**: Professional recovery services
* **Insurance**: Cyber liability coverage
**Business Continuity:**
* **Critical Services**: < 1 hour RTO
* **Important Services**: < 4 hours RTO
* **Standard Services**: < 24 hours RTO
* **Acceptable Data Loss**: < 1 hour RPO
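The RTO tiers above imply a restore order: services with tighter targets come back first. The sketch below encodes that ordering; the mapping of service names to tiers is hypothetical and should be replaced with your own classification.

```python
# RTO per tier in hours, from the list above.
RTO_HOURS = {"critical": 1, "important": 4, "standard": 24}


def restore_order(services):
    """Sort service names so that tighter RTO tiers are restored first.

    services: dict of service name -> tier name
    """
    return sorted(services, key=lambda name: (RTO_HOURS[services[name]], name))


# Hypothetical classification of three services.
services = {"jellyfin": "standard", "traefik": "critical", "nextcloud": "important"}
print(restore_order(services))
```

Ties within a tier are broken alphabetically here; a real runbook would more likely order them by dependency.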
===== Cost Optimization =====
**Storage Costs:**
* **Local**: Low initial cost, high maintenance
* **Network**: Medium cost, shared resources
* **Cloud**: Pay-as-you-go, scalable
* **Offsite**: Security vs accessibility trade-off
**Optimization Strategies:**
* **Compression**: Reduce storage requirements
* **Deduplication**: Eliminate redundant data
* **Tiering**: Move old data to cheaper storage
* **Retention Policies**: Delete unnecessary backups
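Block-level deduplication saves space by storing each distinct block once and keeping a list of block hashes per file. The sketch below uses tiny fixed-size blocks for readability; real tools use much larger blocks, and some use content-defined chunking instead of fixed boundaries.

```python
import hashlib


def dedup_store(data, store, block_size=4):
    """Split data into fixed-size blocks and store each unique block once.

    Returns the list of block hashes (the "recipe") needed to rebuild data.
    """
    recipe = []
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        key = hashlib.sha256(block).hexdigest()
        store.setdefault(key, block)  # keep only the first copy of each block
        recipe.append(key)
    return recipe


def rebuild(recipe, store):
    """Reassemble the original bytes from a recipe and the block store."""
    return b"".join(store[key] for key in recipe)


store = {}
data = b"AAAABBBBAAAACCCC"
recipe = dedup_store(data, store)
print(len(recipe), len(store))
```

The 16 byte input splits into four blocks, but the repeated `AAAA` block is stored once, so the store holds three blocks.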
===== Compliance & Auditing =====
**Regulatory Requirements:**
* **Data Retention**: Industry-specific rules
* **Encryption Standards**: FIPS compliance
* **Access Logging**: Audit trail requirements
* **Testing Frequency**: Regulatory testing schedules
**Audit Procedures:**
* **Backup Logs**: Operation history
* **Access Logs**: Who accessed backups
* **Change Logs**: Configuration changes
* **Test Results**: Recovery test documentation
**Documentation:**
* **Procedures**: Step-by-step recovery guides
* **Contacts**: Emergency contact information
* **Dependencies**: Required resources and access
* **Testing**: Regular test schedules and results
This backup strategy is designed to keep your homelab data protected and recoverable across the failure scenarios described above.
**Next:** Explore [[services:start|Service Management]] or learn about [[development:start|Contributing]].