Wiki v1.0

Added a wiki
kelinfoxy
2026-01-20 19:32:57 -05:00
parent 16b7e1f1a7
commit bcd20102ae
31 changed files with 9283 additions and 0 deletions

@@ -0,0 +1,293 @@
====== Backup Strategy ======
The AI-Homelab implements a comprehensive backup strategy designed for data protection, disaster recovery, and business continuity.
===== Backup Principles =====
**3-2-1 Rule:**
* **3 Copies**: Original + 2 backups
* **2 Media Types**: Different storage technologies
* **1 Offsite**: Geographic separation
**Recovery Objectives:**
* **RTO (Recovery Time Objective)**: Time to restore service
* **RPO (Recovery Point Objective)**: Maximum data loss acceptable
* **RTO Target**: < 1 hour for critical services, < 4 hours for important services
* **RPO Target**: < 1 hour for critical data
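**RPO Check (sketch):**
The RPO target above can be checked mechanically by comparing the age of the newest backup with the allowed window. The following is a minimal sketch, assuming the caller already knows the timestamp of the last successful backup; the function names `backup_age_seconds` and `rpo_met` are hypothetical and not part of any backup tool.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are illustrative assumptions.
# All arguments are Unix timestamps or plain seconds.

# backup_age_seconds LAST_BACKUP_EPOCH NOW_EPOCH -> prints the age in seconds
backup_age_seconds() {
  local last_backup=$1 now=$2
  echo $(( now - last_backup ))
}

# rpo_met LAST_BACKUP_EPOCH NOW_EPOCH RPO_SECONDS
# Exit status 0 when the newest backup is younger than the RPO window.
rpo_met() {
  local age
  age=$(backup_age_seconds "$1" "$2")
  (( age < $3 ))
}
```

A caller might run `rpo_met "$last_backup" "$(date +%s)" 3600 || echo "RPO exceeded"` to flag a missed one-hour target.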
**Backup Types:**
* **Full**: Complete system backup
* **Incremental**: Changes since last backup
* **Differential**: Changes since last full backup
* **Snapshot**: Point-in-time copy
===== Backup Architecture =====
**Primary Backup Solution (Backrest):**
```yaml
services:
  backrest:
    image: garethgeorge/backrest:latest
    volumes:
      - ./config/backrest:/config
      - /mnt/backups:/backups
      - /opt/stacks:/opt/stacks:ro
      - /mnt:/mnt:ro
    environment:
      - BACKREST_CONFIG=/config/config.yml
      - BACKREST_SCHEDULE=0 2 * * *
```
**Alternative Solution (Duplicati):**
```yaml
services:
  duplicati:
    image: lscr.io/linuxserver/duplicati:latest
    volumes:
      - duplicati-config:/config
      - duplicati-source:/source:ro
      - duplicati-backup:/backup
```
===== Backup Categories =====
**Configuration Backups:**
* **Source**: `/opt/stacks/*/`
* **Frequency**: Daily
* **Retention**: 30 days
* **Type**: Incremental
* **Critical**: Yes (service definitions)
**User Data Backups:**
* **Source**: `/mnt/media/`, `/mnt/nextcloud/`
* **Frequency**: Daily
* **Retention**: 90 days
* **Type**: Incremental
* **Critical**: Yes (user files)
**Database Backups:**
* **Source**: Named Docker volumes
* **Frequency**: Hourly
* **Retention**: 7 days
* **Type**: Snapshot
* **Critical**: Yes (application data)
**SSL Certificate Backups:**
* **Source**: `/opt/stacks/core/traefik/acme.json`
* **Frequency**: After renewal
* **Retention**: 1 year
* **Type**: Full
* **Critical**: Yes (HTTPS access)
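**Retention Lookup (sketch):**
The retention periods listed for each category can be expressed as a small lookup, which makes it easy to decide whether a given backup is old enough to prune. This is only a sketch: the category keys and the function names `retention_days` and `is_expired` are hypothetical, and "1 year" is assumed to mean 365 days.

```bash
#!/usr/bin/env bash
# Sketch only: category keys and function names are illustrative assumptions.

# retention_days CATEGORY -> prints the retention period in days
retention_days() {
  case "$1" in
    config)      echo 30 ;;
    user-data)   echo 90 ;;
    database)    echo 7 ;;
    certificate) echo 365 ;;   # "1 year", assumed to be 365 days
    *)           echo "unknown category: $1" >&2; return 1 ;;
  esac
}

# is_expired CATEGORY BACKUP_EPOCH NOW_EPOCH
# Exit status 0 when the backup is older than its category's retention.
is_expired() {
  local days
  days=$(retention_days "$1") || return 2
  (( $3 - $2 > days * 86400 ))
}
```

For example, `is_expired database "$backup_time" "$(date +%s)"` succeeds once a database snapshot is more than 7 days old.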
===== Backup Configuration =====
**Backrest Configuration:**
```yaml
version: 1
schedule: "0 2 * * *" # Daily at 2 AM
repositories:
  - path: "/backups/local"
    retention: "30d"
  - path: "/backups/remote"
    retention: "90d"
backups:
  - name: "stacks-config"
    paths:
      - "/opt/stacks"
    exclude:
      - "*.log"
      - "*/cache/*"
  - name: "user-data"
    paths:
      - "/mnt/media"
      - "/mnt/nextcloud"
    exclude:
      - "*/temp/*"
      - "*/cache/*"
```
**Duplicati Configuration:**
* **Source**: Local directories
* **Destination**: Local/network/cloud storage
* **Encryption**: AES-256
* **Compression**: ZIP
* **Deduplication**: Block-level
===== Storage Destinations =====
**Local Storage:**
* **Path**: `/mnt/backups/`
* **Type**: External HDD/SSD
* **Encryption**: Filesystem level
* **Access**: Direct mount
**Network Storage:**
* **Protocol**: NFS/SMB/CIFS
* **Location**: NAS device
* **Redundancy**: RAID protection
* **Security**: VPN access
**Cloud Storage:**
* **Providers**: AWS S3, Backblaze B2, Google Cloud
* **Encryption**: Client-side
* **Cost**: Pay for storage used
* **Access**: Internet connection
**Offsite Storage:**
* **Location**: Different geographic location
* **Transport**: Encrypted drives
* **Frequency**: Weekly rotation
* **Security**: Physical security
===== Encryption & Security =====
**Encryption Methods:**
* **Symmetric**: AES-256 (restic pairs AES-256-CTR with Poly1305-AES authentication)
* **Asymmetric**: RSA key pairs
* **Key Management**: Secure key storage
* **Key Rotation**: Regular key updates
**Security Measures:**
* **Access Control**: Restricted backup access
* **Network Security**: VPN for remote backups
* **Integrity Checks**: SHA-256 verification
* **Audit Logging**: Backup operation logs
===== Automation & Scheduling =====
**Cron Schedules:**
```bash
# Daily backups at 2 AM
0 2 * * * /usr/local/bin/backrest backup
# Weekly full backup on Sunday
0 3 * * 0 /usr/local/bin/backrest backup --full
# Monthly archive
0 4 1 * * /usr/local/bin/backrest archive
```
**Monitoring:**
* **Success/Failure**: Email notifications
* **Size Tracking**: Storage usage monitoring
* **Performance**: Backup duration tracking
* **Health Checks**: Integrity verification
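**Tracking Wrapper (sketch):**
Success/failure reporting and duration tracking can both be handled by wrapping the backup command. The sketch below assumes nothing about the backup tool itself; `run_tracked` is a hypothetical helper that runs whatever command it is given and prints one status line that a notification step could pick up.

```bash
#!/usr/bin/env bash
# Sketch only: run_tracked is an illustrative helper, not part of any tool.

# run_tracked NAME COMMAND [ARGS...]
# Runs the command, then prints
#   "NAME status=<ok|failed> exit=<code> duration=<seconds>s"
# and returns the command's own exit code.
run_tracked() {
  local name=$1
  shift
  local start end code=0
  start=$(date +%s)
  "$@" || code=$?
  end=$(date +%s)
  if (( code == 0 )); then
    echo "$name status=ok exit=0 duration=$(( end - start ))s"
  else
    echo "$name status=failed exit=$code duration=$(( end - start ))s"
  fi
  return "$code"
}
```

A cron entry could call `run_tracked nightly <backup command>` and append the status line to a log that the monitoring stack reads.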
===== Recovery Procedures =====
**File-Level Recovery:**
```bash
# List snapshots
restic snapshots
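# Restore a specific file (--include selects paths inside the snapshot;
# --path only filters which snapshot counts as "latest")
restic restore latest --target /tmp/restore --include /opt/stacks/config.yml
# Restore to original location
restic restore latest --target / --include /opt/stacks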
```
**Volume Recovery:**
```bash
# Stop service
docker compose down
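# Restore volume (-C / assumes the archive stores paths as data/...;
# use -C /data if the archive was created relative to the volume root)
docker run --rm -v restored-volume:/data -v /mnt/backups:/backup busybox tar xzf /backup/volume.tar.gz -C /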
# Restart service
docker compose up -d
```
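**Archive Round Trip (sketch):**
A volume restore only works if the extraction directory matches the way the archive was created. The sketch below shows one consistent convention, storing paths relative to the volume root so the archive can be unpacked into any target directory. The function names `archive_dir` and `restore_dir` are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: function names are illustrative assumptions.

# archive_dir SRC_DIR ARCHIVE
# Stores the contents of SRC_DIR with paths relative to SRC_DIR.
archive_dir() {
  tar czf "$2" -C "$1" .
}

# restore_dir ARCHIVE DEST_DIR
# Unpacks the archive into DEST_DIR, creating it if needed.
restore_dir() {
  mkdir -p "$2" && tar xzf "$1" -C "$2"
}
```

Inside a helper container the same pair of `tar` invocations would run against the mounted volume path (for example `/data`) instead of a host directory.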
**System Recovery:**
1. **Boot from installation media**
2. **Restore base system**
3. **Install Docker**
4. **Restore configurations**
5. **Restore user data**
6. **Verify services**
===== Testing & Validation =====
**Regular Testing:**
* **Monthly**: File restoration tests
* **Quarterly**: Volume recovery tests
* **Annually**: Full system recovery
* **After Changes**: Configuration updates
**Validation Checks:**
```bash
# Verify backup integrity
restic check
# List backup contents
restic ls latest
# Compare entry counts (approximate: restic ls also lists directories and prints a header line)
find /original | wc -l
restic ls latest | wc -l
```
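**Checksum Manifest (sketch):**
Entry counts only show that files exist, not that their contents survived. One way to validate a test restore is to record SHA-256 checksums of the source and verify the restored copy against them. This sketch assumes GNU coreutils and findutils; `write_manifest` and `verify_manifest` are hypothetical helper names, and the manifest should live outside the directory being checksummed.

```bash
#!/usr/bin/env bash
# Sketch only: assumes GNU sha256sum, sort -z and xargs -r are available.

# write_manifest DIR MANIFEST
# Records a SHA-256 checksum for every regular file below DIR.
write_manifest() {
  ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$2"
}

# verify_manifest DIR MANIFEST
# Exit status 0 only if every file listed in MANIFEST matches inside DIR.
verify_manifest() {
  ( cd "$1" && sha256sum --check --quiet - ) < "$2"
}
```

After restoring to a scratch location, `verify_manifest /tmp/restore/opt/stacks stacks.sha256` would report any file whose contents differ from the original.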
**Performance Monitoring:**
* **Backup Duration**: Track completion times
* **Success Rate**: Monitor failure rates
* **Storage Growth**: Track backup size trends
* **Recovery Time**: Measure restoration speed
===== Disaster Recovery =====
**Disaster Scenarios:**
* **Hardware Failure**: Drive/server replacement
* **Data Corruption**: File system damage
* **Cyber Attack**: Ransomware recovery
* **Site Disaster**: Complete site loss
**Recovery Strategies:**
* **Cold Standby**: Pre-configured backup server
* **Cloud Recovery**: Infrastructure as Code
* **Data Center**: Professional recovery services
* **Insurance**: Cyber liability coverage
**Business Continuity:**
* **Critical Services**: < 1 hour RTO
* **Important Services**: < 4 hours RTO
* **Standard Services**: < 24 hours RTO
* **Acceptable Data Loss**: < 1 hour RPO
===== Cost Optimization =====
**Storage Costs:**
* **Local**: Low initial cost, high maintenance
* **Network**: Medium cost, shared resources
* **Cloud**: Pay-as-you-go, scalable
* **Offsite**: Security vs accessibility trade-off
**Optimization Strategies:**
* **Compression**: Reduce storage requirements
* **Deduplication**: Eliminate redundant data
* **Tiering**: Move old data to cheaper storage
* **Retention Policies**: Delete unnecessary backups
===== Compliance & Auditing =====
**Regulatory Requirements:**
* **Data Retention**: Industry-specific rules
* **Encryption Standards**: FIPS compliance
* **Access Logging**: Audit trail requirements
* **Testing Frequency**: Regulatory testing schedules
**Audit Procedures:**
* **Backup Logs**: Operation history
* **Access Logs**: Who accessed backups
* **Change Logs**: Configuration changes
* **Test Results**: Recovery test documentation
**Documentation:**
* **Procedures**: Step-by-step recovery guides
* **Contacts**: Emergency contact information
* **Dependencies**: Required resources and access
* **Testing**: Regular test schedules and results
This backup strategy keeps your homelab data protected and recoverable across the disaster scenarios described above.
**Next:** Explore [[services:start|Service Management]] or learn about [[development:start|Contributing]].

@@ -0,0 +1,329 @@
====== Network Architecture ======
The AI-Homelab uses a sophisticated network architecture designed for security, performance, and scalability.
===== Network Topology =====
```
Internet
    ↓
[Router/Firewall]
    ├── Port 80 (HTTP) → Traefik (redirects to HTTPS)
    ├── Port 443 (HTTPS) → Traefik (SSL Termination)
    └── Port 22 (SSH) → Server (Management)
    ↓
[DuckDNS] Dynamic DNS
    ↓
[Traefik] Reverse Proxy
    ├── Authelia SSO Middleware
    ├── Service Routing
    └── SSL Termination
    ↓
[Docker Networks]
    ├── traefik-network (Web Services)
    ├── homelab-network (Internal)
    ├── media-network (Media Services)
    └── service-specific networks
```
===== Docker Networks =====
**traefik-network (Primary):**
* **Purpose**: All web-accessible services
* **Driver**: Bridge
* **IP Range**: 172.20.0.0/16
* **External Access**: Yes (via Traefik)
**homelab-network (Internal):**
* **Purpose**: Internal service communication
* **Driver**: Bridge
* **IP Range**: 172.21.0.0/16
* **External Access**: No
**media-network:**
* **Purpose**: Media service isolation
* **Driver**: Bridge
* **IP Range**: 172.22.0.0/16
* **External Access**: Via Traefik
**dockerproxy-network:**
* **Purpose**: Docker socket proxy
* **Driver**: Bridge
* **Security**: Restricted access
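**Subnet Membership Check (sketch):**
When debugging which network a container landed on, it helps to test whether an address falls inside one of the ranges listed above. The following is a minimal sketch in pure bash arithmetic; `ip_to_int` and `in_cidr` are hypothetical helper names, and the code assumes well-formed dotted-quad IPv4 input without leading zeros.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are illustrative; IPv4 only, no input validation.

# ip_to_int A.B.C.D -> prints the address as a 32-bit integer
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# in_cidr IP NETWORK/PREFIX
# Exit status 0 when IP lies inside the given network.
in_cidr() {
  local ip net prefix mask
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  prefix=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  (( (ip & mask) == (net & mask) ))
}
```

For instance, `in_cidr 172.20.5.4 172.20.0.0/16` succeeds for an address on traefik-network, while an address from homelab-network fails the same test.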
===== Traefik Routing =====
**Entry Points:**
```yaml
entryPoints:
  web:
    address: ":80"
    http:
      redirections:
        entryPoint:
          to: websecure
          scheme: https
  websecure:
    address: ":443"
    http:
      tls:
        certResolver: letsencrypt
```
**Router Configuration:**
```yaml
http:
  routers:
    service-router:
      rule: "Host(`service.yourdomain.duckdns.org`)"
      entryPoints:
        - websecure
      service: service-name
      tls:
        certResolver: letsencrypt
      middlewares:
        - authelia@docker
```
**Service Discovery:**
```yaml
http:
  services:
    service-name:
      loadBalancer:
        servers:
          - url: "http://container-name:port"
```
===== SSL/TLS Configuration =====
**Certificate Resolver:**
```yaml
certificatesResolvers:
  letsencrypt:
    acme:
      email: your-email@example.com
      storage: /acme.json
      dnsChallenge:
        provider: duckdns
        delayBeforeCheck: 30
```
**Wildcard Certificate:**
* **Domain**: `*.yourdomain.duckdns.org`
* **Provider**: Let's Encrypt
* **Challenge**: DNS-01 (DuckDNS)
* **Validity**: 90 days
* **Renewal**: Automatic
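**Expiry Check (sketch):**
Because the certificate is only valid for 90 days, it can be useful to confirm that automatic renewal is keeping up. The sketch below converts the expiry date printed by `openssl x509 -noout -enddate` (with the `notAfter=` prefix removed) into days remaining. It assumes GNU `date`; the helper names `days_left` and `renewal_due` and the 30-day threshold are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: assumes GNU date; helper names and threshold are illustrative.

# days_left NOT_AFTER NOW_EPOCH
# NOT_AFTER looks like "Jan 31 00:00:00 2026 GMT".
# Prints the whole days between NOW_EPOCH and the expiry date.
days_left() {
  local expiry
  expiry=$(date -u -d "$1" +%s) || return 1
  echo $(( (expiry - $2) / 86400 ))
}

# renewal_due NOT_AFTER NOW_EPOCH
# Exit status 0 once fewer than 30 days remain.
renewal_due() {
  local left
  left=$(days_left "$1" "$2") || return 2
  (( left < 30 ))
}
```

If `renewal_due` keeps succeeding for several days in a row, renewal has probably stalled and the Traefik logs are worth a look.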
**Security Headers:**
```yaml
middlewares:
  security-headers:
    headers:
      stsSeconds: 31536000
      stsIncludeSubdomains: true
      stsPreload: true
      forceSTSHeader: true
      contentTypeNosniff: true
      browserXssFilter: true
      referrerPolicy: "strict-origin-when-cross-origin"
      permissionsPolicy: "geolocation=(), microphone=(), camera=()"
```
===== Authelia Integration =====
**SSO Middleware:**
```yaml
middlewares:
  authelia:
    forwardAuth:
      address: "http://authelia:9091/api/verify?rd=https://auth.yourdomain.duckdns.org/"
      trustForwardHeader: true
      authResponseHeaders:
        - "Remote-User"
        - "Remote-Groups"
        - "Remote-Name"
        - "Remote-Email"
```
**Access Control Rules:**
```yaml
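access_control:
  default_policy: deny
  rules:
    # Rules are evaluated in order and the first match wins,
    # so the specific bypass rules must precede the wildcard.
    - domain: "jellyfin.yourdomain.duckdns.org"
      policy: bypass
    - domain: "plex.yourdomain.duckdns.org"
      policy: bypass
    - domain: "*.yourdomain.duckdns.org"
      policy: two_factor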
```
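**Rule Ordering (sketch):**
Authelia evaluates access control rules sequentially and applies the first one that matches, so a wildcard rule placed first shadows any more specific rule after it. The sketch below imitates that behaviour with shell glob matching. It is an approximation for illustration only: `first_match_policy` is a hypothetical helper, rules are written as `pattern=policy` strings, and the domains in the usage note are placeholders.

```bash
#!/usr/bin/env bash
# Sketch only: approximates first-match evaluation with shell globs.

# first_match_policy DOMAIN RULE...
# Each RULE is "pattern=policy". Prints the policy of the first rule whose
# pattern matches DOMAIN, or "deny" (the default policy) if none match.
first_match_policy() {
  local domain=$1 rule pattern
  shift
  for rule in "$@"; do
    pattern=${rule%%=*}
    # The right-hand side is left unquoted so that * acts as a wildcard.
    if [[ $domain == $pattern ]]; then
      echo "${rule#*=}"
      return 0
    fi
  done
  echo "deny"
}
```

Calling it with `'*.example.org=two_factor'` ahead of `'jellyfin.example.org=bypass'` yields `two_factor` for the Jellyfin host, while swapping the two rules yields `bypass`.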
===== VPN Integration =====
**Gluetun Network Mode:**
```yaml
services:
  qbittorrent:
    network_mode: "service:gluetun"
    depends_on:
      - gluetun
```
**Port Mapping:**
```yaml
gluetun:
  ports:
    - "8080:8080" # qBittorrent Web UI
    - "6881:6881" # Torrent port
    - "6881:6881/udp"
```
**VPN Routing:**
* **Provider**: Surfshark (configurable)
* **Protocol**: WireGuard/OpenVPN
* **Kill Switch**: Prevents IP leaks
* **Port Forwarding**: Provider-dependent (Surfshark does not offer it)
===== Firewall Configuration =====
**UFW Rules (Automatic):**
```bash
# Default policies: deny incoming, allow outgoing
sudo ufw default deny incoming
sudo ufw default allow outgoing
# Allow SSH (before enabling, to avoid locking yourself out)
sudo ufw allow ssh
# Allow HTTP/HTTPS
sudo ufw allow 80
sudo ufw allow 443
# Enable firewall
sudo ufw enable
```
**Docker Security:**
* **No privileged containers**
* **Non-root user execution**
* **Minimal port exposure**
* **Network isolation**
===== External Service Proxying =====
**Traefik File Provider:**
```yaml
http:
  routers:
    external-service:
      rule: "Host(`external.yourdomain.duckdns.org`)"
      service: external-service
      middlewares:
        - authelia@docker
  services:
    external-service:
      loadBalancer:
        servers:
          - url: "http://192.168.1.100:8123"
```
**Use Cases:**
* **Home Assistant** on Raspberry Pi
* **NAS devices** (TrueNAS, Unraid)
* **Network printers** and IoT devices
* **Legacy applications**
===== DNS Configuration =====
**DuckDNS Setup:**
* **Update Interval**: Every 5 minutes
* **API Token**: Stored in `.env`
* **Domains**: yourdomain.duckdns.org
* **Wildcard**: *.yourdomain.duckdns.org
**Pi-hole Integration:**
* **Upstream DNS**: Quad9, Cloudflare
* **Ad Blocking**: Enabled
* **Local DNS**: Service discovery
* **DHCP**: Optional
===== Network Troubleshooting =====
**Connectivity Issues:**
```bash
# Check network connectivity
ping -c 4 8.8.8.8
# Test DNS resolution
nslookup yourdomain.duckdns.org
# Check port forwarding
curl -I http://your-external-ip
```
**Docker Network Issues:**
```bash
# List networks
docker network ls
# Inspect network
docker network inspect traefik-network
# Check container connectivity
docker exec container-name ping -c 4 traefik
```
**SSL Certificate Problems:**
```bash
# Check certificate
echo | openssl s_client -connect yourdomain.duckdns.org:443 -servername service.yourdomain.duckdns.org 2>/dev/null | openssl x509 -noout -subject -dates
# View Traefik logs
docker logs traefik 2>&1 | grep -i certificate
```
**Authelia Issues:**
```bash
# Check Authelia logs
docker logs authelia
# Test authentication
curl -k https://auth.yourdomain.duckdns.org/api/state
```
===== Performance Optimization =====
**Connection Pooling:**
* **Keep-Alive**: Persistent connections
* **Connection Reuse**: Reduce overhead
* **Load Balancing**: Distribute traffic
**Caching:**
* **Browser Caching**: Static assets
* **Reverse Proxy**: Dynamic content
* **DNS Caching**: Pi-hole
**Compression:**
* **Gzip**: Text compression
* **Brotli**: Advanced compression
* **Media**: No compression (already compressed)
===== Monitoring =====
**Network Monitoring:**
* **Traefik Dashboard**: Routing metrics
* **Authelia Logs**: Authentication events
* **Pi-hole Stats**: DNS queries
* **Uptime Kuma**: Service availability
**Traffic Analysis:**
* **Request Logs**: Access patterns
* **Error Rates**: Service health
* **Response Times**: Performance metrics
* **Bandwidth Usage**: Network utilization
This network architecture provides secure, efficient, and scalable connectivity for all homelab services.
**Next:** Learn about [[architecture:security|Security Architecture]] or [[architecture:storage|Storage Strategy]].

@@ -0,0 +1,298 @@
====== System Architecture ======
The AI-Homelab is built on a production-ready, scalable architecture designed for reliability, security, and ease of management.
===== Core Principles =====
**Infrastructure as Code:**
* All services defined in Docker Compose files
* Configuration managed through YAML files
* Version control with Git
* Reproducible deployments
**Security First:**
* SSO protection for admin interfaces
* Automatic HTTPS with Let's Encrypt
* VPN routing for downloads
* Network isolation and segmentation
**Scalability:**
* Resource limits prevent exhaustion
* Lazy loading reduces resource usage
* Modular service architecture
* Easy addition of new services
**Observability:**
* Comprehensive logging
* Metrics collection
* Health monitoring
* Alerting capabilities
===== Network Architecture =====
```
Internet
    ↓
[Router] Port Forwarding (80, 443)
    ↓
[DuckDNS] Dynamic DNS Updates
    ↓
[Traefik] Reverse Proxy + SSL Termination
    ↓
[Authelia] SSO Authentication
    ↓
[Docker Services] Isolated Containers
```
**Network Layers:**
**External Access:**
* **DuckDNS**: Dynamic DNS service
* **Port Forwarding**: 80/443 to Traefik
* **SSL Termination**: Wildcard certificate
**Reverse Proxy:**
* **Traefik**: Routes traffic to services
* **Authelia**: SSO middleware
* **Load Balancing**: Service discovery
**Service Networks:**
* **traefik-network**: All web services
* **homelab-network**: Internal communication
* **media-network**: Media services
* **isolated networks**: Security segmentation
===== Service Architecture =====
**Core Stack (Essential Infrastructure):**
* **DuckDNS**: DNS updates every 5 minutes
* **Traefik**: HTTP routing and SSL
* **Authelia**: Authentication and authorization
* **Gluetun**: VPN client for downloads
* **Sablier**: Lazy loading service
**Infrastructure Stack:**
* **Dockge**: Primary management interface
* **Pi-hole**: Network-wide DNS and ad blocking
* **Dozzle**: Live Docker log viewer
* **Glances**: System resource monitoring
**Service Categories:**
**Media Services:**
* **Jellyfin/Plex**: Media servers with transcoding
* **qBittorrent**: Torrent client (VPN routed)
* **Sonarr/Radarr**: Download automation
* **Prowlarr**: Indexer management
**Productivity Services:**
* **Nextcloud**: File synchronization
* **Gitea**: Git service and CI/CD
* **BookStack**: Documentation platform
* **WordPress**: Blogging platform
**Monitoring & Observability:**
* **Grafana**: Dashboard and visualization
* **Prometheus**: Metrics collection
* **Uptime Kuma**: Status monitoring
* **Loki**: Log aggregation
===== Storage Architecture =====
**Configuration Storage:**
```
/opt/stacks/
├── core/ # Core infrastructure
├── infrastructure/ # Management tools
├── media/ # Media services
├── productivity/ # Office tools
└── monitoring/ # Observability
```
**Data Storage Strategy:**
**Small Data (< 50GB):**
* **Location**: `/opt/stacks/stack-name/config/`
* **Type**: Bind mounts
* **Backup**: Included in configuration backups
**Large Data (> 50GB):**
* **Location**: `/mnt/media/`, `/mnt/downloads/`, `/mnt/backups/`
* **Type**: External mounts
* **Backup**: Separate backup strategies
**Database Storage:**
* **Type**: Named Docker volumes
* **Location**: Docker managed
* **Backup**: Volume snapshots
===== Security Architecture =====
**Authentication & Authorization:**
**Authelia SSO:**
* **Protocol**: OpenID Connect 1.0, plus forward-auth integration with Traefik
* **Storage**: File-based user database
* **2FA**: TOTP, WebAuthn support
* **Policies**: Domain-based access control
**Service Authentication:**
* **Admin Services**: Authelia protected
* **Media Services**: Bypass for app compatibility
* **APIs**: Token-based authentication
**Network Security:**
* **Firewall**: UFW with minimal ports
* **SSL/TLS**: HTTPS for all external traffic, terminated at Traefik
* **VPN**: Download traffic protection
* **Isolation**: Docker network segmentation
===== Deployment Architecture =====
**Two-Phase Deployment:**
**Phase 1: Setup**
```bash
sudo ./scripts/setup-homelab.sh
```
* System preparation
* Docker installation
* Authelia configuration
* Infrastructure setup
**Phase 2: Deployment**
```bash
sudo ./scripts/deploy-homelab.sh
```
* Core stack deployment
* SSL certificate generation
* Infrastructure services
* Health verification
**Service Deployment:**
* **Dockge**: Web-based stack management
* **Manual**: Docker Compose commands
* **Automated**: CI/CD pipelines
===== Resource Management =====
**Resource Limits:**
```yaml
deploy:
  resources:
    limits:
      cpus: '2.0'
      memory: 4G
    reservations:
      cpus: '0.5'
      memory: 1G
```
**Resource Allocation Strategy:**
* **Core Services**: Minimal resources (0.1-0.5 CPU, 64MB-256MB RAM)
* **Web Services**: Moderate resources (1-2 CPU, 1-4GB RAM)
* **Media Services**: High resources (2-4 CPU, 4-8GB RAM)
* **Background Services**: Variable based on workload
**Lazy Loading:**
* **Sablier**: On-demand service startup
* **Resource Savings**: 50-80% reduction in idle usage
* **Automatic Scaling**: Services start when accessed
===== Monitoring Architecture =====
**Metrics Collection:**
* **Prometheus**: Time-series metrics
* **Node Exporter**: System metrics
* **cAdvisor**: Container metrics
* **Custom Exporters**: Service-specific metrics
**Logging:**
* **Dozzle**: Real-time log viewer
* **Loki**: Log aggregation
* **Promtail**: Log shipping
* **Structured Logging**: JSON format
**Alerting:**
* **Uptime Kuma**: Service availability
* **Grafana**: Threshold-based alerts
* **Email/SMS**: Notification channels
===== Backup Architecture =====
**Backup Strategy:**
* **Backrest**: Primary backup solution (Restic)
* **Duplicati**: Alternative encrypted backups
* **Automated**: Scheduled backups
* **Encrypted**: AES-256 encryption
**Backup Types:**
* **Configuration**: `/opt/stacks/` directories
* **User Data**: Service volumes and mounts
* **SSL Certificates**: `/opt/stacks/core/traefik/acme.json`
* **Databases**: Volume snapshots
**Recovery:**
* **Point-in-time**: Versioned backups
* **Bare metal**: Complete system recovery
* **Service-level**: Individual service restoration
===== High Availability =====
**Redundancy:**
* **Load Balancing**: Traefik distributes traffic
* **Health Checks**: Automatic service monitoring
* **Failover**: Automatic service restart
* **Backups**: Multiple backup locations
**Scalability:**
* **Horizontal**: Multiple service instances
* **Vertical**: Resource scaling
* **Storage**: Distributed storage options
* **Network**: High-bandwidth connections
===== Development Architecture =====
**AI Integration:**
* **GitHub Copilot**: Intelligent assistance
* **Copilot Instructions**: Context-aware guidance
* **Automated Configuration**: AI-generated compose files
* **Documentation**: AI-maintained wiki
**Version Control:**
* **Git**: Source code management
* **Branches**: Feature development
* **Tags**: Release versioning
* **CI/CD**: Automated testing and deployment
===== Performance Optimization =====
**Caching:**
* **Browser Caching**: Static asset optimization
* **Database Caching**: Query result caching
* **CDN**: Content delivery networks
* **Reverse Proxy**: Response caching through a Traefik plugin or a dedicated cache (open-source Traefik has no built-in HTTP cache)
**Optimization Techniques:**
* **Compression**: Gzip/Brotli compression
* **Minification**: Asset optimization
* **Lazy Loading**: On-demand resource loading
* **Connection Pooling**: Database optimization
===== Compliance & Governance =====
**Security Standards:**
* **SSL/TLS**: Industry standard encryption
* **Access Control**: Least privilege principle
* **Audit Logging**: Comprehensive activity logs
* **Regular Updates**: Security patch management
**Data Protection:**
* **Encryption**: Data at rest and in transit
* **Backup Encryption**: Secure offsite storage
* **Privacy**: Minimal data collection
* **Retention**: Configurable data lifecycle
This architecture provides a solid foundation for a production-ready homelab that can scale with your needs while maintaining security and reliability.
**Next:** Learn about [[architecture:networking|Network Architecture]] or explore [[services:start|Available Services]].

@@ -0,0 +1,299 @@
====== Security Architecture ======
The AI-Homelab implements a comprehensive security model based on defense in depth, zero trust principles, and industry best practices.
===== Security Principles =====
**Defense in Depth:**
* **Multiple Layers**: Network, application, and data security
* **Fail-Safe Defaults**: Secure by default, explicit opt-out
* **Least Privilege**: Minimal required permissions
* **Continuous Monitoring**: Real-time threat detection
**Zero Trust:**
* **Never Trust**: Verify every access request
* **Assume Breach**: Design for compromised systems
* **Micro-Segmentation**: Isolate services and data
* **Continuous Verification**: Ongoing authentication
**Compliance:**
* **Data Protection**: Encryption at rest and in transit
* **Access Control**: Role-based and attribute-based access
* **Audit Logging**: Comprehensive activity tracking
* **Regular Updates**: Security patch management
===== Authentication & Authorization =====
**Authelia SSO System:**
**Architecture:**
* **Protocol**: OpenID Connect 1.0, plus forward-auth integration with Traefik
* **Storage**: File-based user database
* **Session Management**: Server-side sessions referenced by a secure cookie
* **Multi-Factor**: TOTP, WebAuthn, Push notifications
**User Management:**
```yaml
users:
  admin:
    displayname: Administrator
    password: $argon2id$...
    email: admin@yourdomain.duckdns.org
    groups:
      - admins
      - dev
```
**Access Policies:**
```yaml
access_control:
  default_policy: deny
  rules:
    # Rules are evaluated in order and the first match wins,
    # so specific rules come before the general wildcard.
    # Media services bypass SSO
    - domain: "jellyfin.yourdomain.duckdns.org"
      policy: bypass
    # API access with tokens
    - domain: "*.yourdomain.duckdns.org"
      policy: one_factor
      resources:
        - "^/api/.*"
    # Admin services require 2FA
    - domain: "*.yourdomain.duckdns.org"
      policy: two_factor
      subject:
        - "group:admins"
```
**Session Security:**
* **Expiration**: 8 hour sessions
* **Inactivity Timeout**: 10 minute timeout
* **Secure Cookies**: HttpOnly, Secure, SameSite
* **CSRF Protection**: Token-based validation
===== SSL/TLS Encryption =====
**Certificate Management:**
* **Authority**: Let's Encrypt (trusted CA)
* **Type**: Wildcard ECDSA certificate
* **Domains**: *.yourdomain.duckdns.org
* **Renewal**: Automatic (30 days before expiry)
**SSL Configuration:**
```yaml
tls:
  certificates:
    - certFile: /ssl/cert.pem
      keyFile: /ssl/private.key
  options:
    default:
      minVersion: VersionTLS12
      cipherSuites:
        - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
        - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
      sniStrict: true
```
**Security Headers:**
```yaml
headers:
  # Prevent clickjacking
  customResponseHeaders:
    X-Frame-Options: "SAMEORIGIN"
    X-Content-Type-Options: "nosniff"
    Referrer-Policy: "strict-origin-when-cross-origin"
    Permissions-Policy: "geolocation=(), microphone=(), camera=()"
  # HSTS (HTTP Strict Transport Security)
  stsSeconds: 31536000
  stsIncludeSubdomains: true
  stsPreload: true
```
===== Network Security =====
**Firewall Configuration:**
* **UFW**: Uncomplicated Firewall
* **Default Policy**: Deny all incoming
* **Allowed Ports**: 22 (SSH), 80 (HTTP), 443 (HTTPS)
* **Docker Isolation**: Container network segmentation
**Network Segmentation:**
* **traefik-network**: Web-facing services
* **homelab-network**: Internal services
* **media-network**: Media services
* **isolated-networks**: High-security services
**VPN Protection:**
* **Gluetun**: VPN client container
* **Provider**: Surfshark (configurable)
* **Protocol**: WireGuard (preferred)
* **Kill Switch**: Prevents IP leaks
===== Container Security =====
**Docker Security Best Practices:**
* **Non-root Users**: PUID/PGID environment variables
* **No Privileged Containers**: Minimal capabilities
* **Read-only Filesystems**: Where possible
* **Resource Limits**: CPU and memory constraints
**Security Scanning:**
```bash
# Trivy vulnerability scanning
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
aquasec/trivy image your-image:latest
# Container security audit
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
docker/docker-bench-security
```
**Image Security:**
* **Trusted Images**: Official images or LinuxServer.io builds preferred
* **Version Pinning**: Specific version tags
* **SBOM**: Software Bill of Materials
* **Signature Verification**: Image signing
===== Data Protection =====
**Encryption at Rest:**
* **SSL Certificates**: Encrypted storage
* **User Data**: Service-specific encryption
* **Backups**: AES-256 encryption
* **Secrets**: Environment variable protection
**Encryption in Transit:**
* **HTTPS**: TLS between clients and Traefik
* **API Communication**: TLS 1.2+
* **Database Connections**: SSL/TLS
* **VPN Tunneling**: WireGuard/OpenVPN
**Data Classification:**
* **Public**: No encryption required
* **Internal**: TLS encryption
* **Sensitive**: Additional encryption layers
* **Critical**: Multi-layer encryption
===== Access Control =====
**Role-Based Access Control (RBAC):**
```yaml
# Authelia groups
groups:
  admins:
    - admin
  users:
    - user1
    - user2
  media:
    - family
```
**Service-Level Permissions:**
* **Nextcloud**: User and group permissions
* **Gitea**: Repository access control
* **Grafana**: Dashboard permissions
* **API Keys**: Scoped access tokens
**Network Access Control:**
* **IP Whitelisting**: Restrict by IP address
* **Geo-blocking**: Country-based restrictions
* **Rate Limiting**: Prevent brute force attacks
* **Fail2Ban**: SSH protection
===== Monitoring & Auditing =====
**Security Monitoring:**
* **Authentication Logs**: Authelia events
* **Access Logs**: Traefik requests
* **System Logs**: Docker and system events
* **Intrusion Detection**: Pattern matching
**Audit Logging:**
```yaml
# Loki log aggregation: Promtail scrape config that ships Authelia logs
# (the log path is an assumption; point it at wherever Authelia writes its log file)
scrape_configs:
  - job_name: 'authelia'
    static_configs:
      - targets: ['localhost']
        labels:
          job: authelia
          __path__: /var/log/authelia/*.log
```
**Alerting:**
* **Failed Logins**: Brute force detection
* **Certificate Expiry**: SSL renewal warnings
* **Service Downtime**: Availability monitoring
* **Security Events**: Suspicious activity
===== Threat Mitigation =====
**Common Threats:**
* **Brute Force**: Rate limiting, 2FA
* **SQL Injection**: Parameterized queries
* **XSS**: Content Security Policy
* **CSRF**: Token validation
**Incident Response:**
1. **Detection**: Monitoring alerts
2. **Assessment**: Determine impact
3. **Containment**: Isolate affected systems
4. **Recovery**: Restore from backups
5. **Lessons Learned**: Update policies
**Backup Security:**
* **Encryption**: AES-256-GCM
* **Integrity**: SHA-256 checksums
* **Retention**: Configurable policies
* **Testing**: Regular restoration tests
===== Compliance & Governance =====
**Security Standards:**
* **OWASP**: Web application security
* **NIST**: Cybersecurity framework
* **ISO 27001**: Information security
* **GDPR**: Data protection
**Regular Assessments:**
* **Vulnerability Scanning**: Weekly
* **Penetration Testing**: Monthly
* **Security Audits**: Quarterly
* **Compliance Reviews**: Annual
**Documentation:**
* **Security Policies**: Access and usage rules
* **Incident Response**: Procedures and contacts
* **Change Management**: Update procedures
* **Training**: Security awareness
===== Advanced Security =====
**Zero Trust Network Access (ZTNA):**
* **Identity-Based**: User and device verification
* **Context-Aware**: Risk-based access
* **Micro-Segmentation**: Service isolation
* **Continuous Monitoring**: Real-time assessment
**Secrets Management:**
* **Environment Variables**: Runtime secrets
* **Docker Secrets**: Swarm mode secrets
* **External Vaults**: HashiCorp Vault integration
* **Key Rotation**: Automatic secret renewal
**Intrusion Detection:**
* **Network IDS**: Traffic analysis
* **Host IDS**: System monitoring
* **Log Analysis**: Pattern detection
* **SIEM Integration**: Centralized logging
This security architecture provides comprehensive protection for your homelab while maintaining usability and performance.
**Next:** Learn about [[architecture:storage|Storage Strategy]] or [[architecture:backup|Backup Strategy]].

@@ -0,0 +1,291 @@
====== Storage Architecture ======
The AI-Homelab implements a comprehensive storage strategy designed for performance, reliability, and scalability.
===== Storage Principles =====
**Data Classification:**
* **Configuration**: Application settings and metadata
* **User Data**: Files, documents, media
* **System Data**: Logs, caches, temporary files
* **Backup Data**: Archived copies and snapshots
**Storage Tiers:**
* **Hot**: Frequently accessed data (SSD)
* **Warm**: Regularly accessed data (HDD)
* **Cold**: Archive data (external storage)
* **Offline**: Long-term retention (tape/offsite)
**Performance Optimization:**
* **Caching**: In-memory data acceleration
* **Compression**: Storage space optimization
* **Deduplication**: Eliminate redundant data
* **Tiering**: Automatic data placement
===== Directory Structure =====
**System Storage (/opt/stacks/):**
```
/opt/stacks/
├── core/ # Core infrastructure
│ ├── traefik/ # Reverse proxy config
│ ├── authelia/ # SSO configuration
│ ├── duckdns/ # DNS updater
│ └── gluetun/ # VPN client
├── infrastructure/ # Management tools
├── media/ # Media services
├── productivity/ # Office applications
├── monitoring/ # Observability stack
└── utilities/ # Helper services
```
**Data Storage (/mnt/):**
```
/mnt/
├── media/ # Movies, TV, music
│ ├── movies/
│ ├── tv/
│ └── music/
├── downloads/ # Torrent downloads
│ ├── complete/
│ └── incomplete/
├── backups/ # Backup archives
├── nextcloud/ # Cloud storage
├── git/ # Git repositories
└── surveillance/ # Camera footage
```
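**Creating the Layout (sketch):**
The data tree above can be created in one pass on a fresh disk. This is a minimal sketch, assuming the caller passes the mount root (normally `/mnt`); `make_data_layout` is a hypothetical helper name, and ownership and permissions are left to the operator.

```bash
#!/usr/bin/env bash
# Sketch only: make_data_layout is an illustrative helper.

# make_data_layout ROOT
# Creates the data directory tree beneath ROOT; existing directories are kept.
make_data_layout() {
  local root=$1 dir
  for dir in \
    media/movies media/tv media/music \
    downloads/complete downloads/incomplete \
    backups nextcloud git surveillance
  do
    mkdir -p "$root/$dir" || return 1
  done
}
```

Running it against a scratch directory first, such as `make_data_layout /tmp/layout-test`, is a cheap way to confirm the structure before touching `/mnt`.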
===== Docker Storage =====
**Volume Types:**
**Named Volumes (Managed):**
```yaml
volumes:
  database-data:
    driver: local
```
* **Pros**: Docker managed, portable, backup-friendly
* **Cons**: Less direct access, filesystem overhead
* **Use**: Databases, application data
**Bind Mounts (Direct):**
```yaml
volumes:
  - ./config:/config
  - /mnt/media:/media
```
* **Pros**: Direct filesystem access, performance
* **Cons**: Host-dependent, permission management
* **Use**: Configuration, large media files
**tmpfs (Memory):**
```yaml
tmpfs:
  - /tmp/cache
```
* **Pros**: High performance, automatic cleanup
* **Cons**: Volatile, memory usage
* **Use**: Caches, temporary files
**Storage Drivers:**
* **overlay2**: Modern union filesystem
* **btrfs**: Advanced features (snapshots, compression)
* **zfs**: Enterprise-grade (snapshots, deduplication)
===== Service Storage Patterns =====
**Configuration Storage:**
```yaml
services:
  service-name:
    volumes:
      - ./config/service-name:/config
      - service-data:/data
```
* **Config**: Bind mount in stack directory
* **Data**: Named volume for persistence
* **Permissions**: PUID/PGID environment variables (on images that support them) set the user and group that own the files
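A bind-mounted config directory has to be writable by the user the container runs as. The sketch below shows one way to prepare it. The function name `prepare_config_dir` and the `750` mode are assumptions for illustration; the UID and GID default to the current user and should match the PUID/PGID values set in the compose file.

```bash
# Create a service config directory and hand it to the UID/GID the
# container runs as. Defaults to the current user when none is given.
prepare_config_dir() {
    local dir="$1"
    local puid="${2:-$(id -u)}"
    local pgid="${3:-$(id -g)}"
    mkdir -p "$dir" || return 1
    chown -R "$puid:$pgid" "$dir" || return 1
    # Owner: full access, group: read and traverse, others: none.
    chmod 750 "$dir"
}

# Example:
#   prepare_config_dir ./config/service-name 1000 1000
```

Changing ownership to a different user normally requires root, so the example call would be run with `sudo` when the IDs differ from your own.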
**Media Storage:**
```yaml
services:
  jellyfin:
    volumes:
      - ./config/jellyfin:/config
      - jellyfin-cache:/cache
      - /mnt/media:/media:ro
      - /mnt/transcode:/transcode
```
* **Media**: Read-only external mount
* **Transcode**: Temporary processing space
* **Cache**: Named volume for performance
**Database Storage:**
```yaml
services:
  postgres:
    volumes:
      - postgres-data:/var/lib/postgresql/data
    environment:
      - POSTGRES_DB=homelab
      - POSTGRES_USER=homelab
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
```
* **Data**: Named volume for persistence
* **Backups**: Logical dumps (pg_dump), or volume snapshots taken while the database is stopped
* **Performance**: Proper indexing
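Copying the data directory of a running database can capture an inconsistent state, so a logical dump is a common complement to volume-level backups. The following is a rough sketch only: the container name `postgres`, the target directory `/mnt/backups/postgres`, and both function names are assumptions, while the database and user names come from the compose example above.

```bash
# Build the dump path: <dir>/<db>-<timestamp>.dump
dump_path() {
    printf '%s/%s-%s.dump\n' "$1" "$2" "$3"
}

# Dump a database from a running container in pg_dump's custom format.
# Container name and target directory defaults are assumptions.
backup_postgres() {
    local container="${1:-postgres}"
    local db="${2:-homelab}"
    local user="${3:-homelab}"
    local dir="${4:-/mnt/backups/postgres}"
    local out
    out="$(dump_path "$dir" "$db" "$(date +%Y%m%dT%H%M%S)")"
    mkdir -p "$dir" || return 1
    # No -t flag: a TTY would corrupt the binary dump stream.
    if docker exec "$container" pg_dump -U "$user" -d "$db" -Fc > "$out.partial"; then
        mv "$out.partial" "$out"
    else
        # Never leave a truncated dump behind.
        rm -f "$out.partial"
        return 1
    fi
}
```

Writing to a `.partial` file first means a failed dump never looks like a finished one. Custom-format dumps are restored with `pg_restore`.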
===== Backup Storage =====
**Backrest (Primary):**
```yaml
services:
  backrest:
    volumes:
      - ./config/backrest:/config
      - /mnt/backups:/backups
      - /opt/stacks:/opt/stacks:ro
      - /mnt:/mnt:ro
```
* **Repository**: Local and remote storage
* **Encryption**: AES-256 with Poly1305 authentication (provided by restic, which Backrest drives)
* **Deduplication**: Space-efficient
* **Snapshots**: Point-in-time recovery
**Duplicati (Alternative):**
```yaml
services:
  duplicati:
    volumes:
      - duplicati-config:/config
      - duplicati-source:/source:ro
      - duplicati-backup:/backup
```
* **Frontend**: Web-based interface
* **Destinations**: Multiple cloud providers
* **Encryption**: Built-in encryption
* **Scheduling**: Automated backups
===== Performance Optimization =====
**Filesystem Choice:**
* **ext4**: General purpose, reliable
* **btrfs**: Snapshots, compression, RAID
* **ZFS**: Advanced features, data integrity
* **XFS**: High performance, large files
**RAID Configuration:**
* **RAID 1**: Mirroring (2+ drives)
* **RAID 5**: Striping with single parity (3+ drives)
* **RAID 10**: Striped mirrors (4+ drives)
* **RAID-Z**: ZFS software RAID with single, double, or triple parity (RAID-Z1/Z2/Z3)
**Caching Strategies:**
* **Page Cache**: OS-level caching
* **Application Cache**: Service-specific caching
* **CDN**: Content delivery networks
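* **Reverse Proxy**: Response caching at the proxy layer (Traefik does not cache out of the box; this requires a plugin or a separate cache)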
===== Monitoring & Maintenance =====
**Storage Monitoring:**
```bash
# Disk usage
df -h
# Docker storage
docker system df
# List volumes and show details (driver, mount point) for one of them
docker volume ls
docker volume inspect volume-name
# Per-volume disk usage
docker system df -v
```
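The commands above are interactive. For unattended monitoring, a small threshold check can be run from cron or a systemd timer. This is a minimal sketch: the function names and the default limit of 85% are assumptions, not values defined by the AI-Homelab.

```bash
# Print the used-space percentage (integer) of the filesystem holding a path.
usage_percent() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Return 0 if usage is below the limit, 1 if it is at or above the limit,
# and 2 if the usage could not be determined.
check_usage() {
    local path="$1"
    local limit="${2:-85}"
    local used
    used="$(usage_percent "$path")"
    [ -n "$used" ] || return 2
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $path is at ${used}% (limit ${limit}%)" >&2
        return 1
    fi
}

# Example:
#   check_usage /mnt/media 90 || echo "time to add storage"
```

`df -P` forces the POSIX single-line output format, which keeps the column positions stable even for long device names.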
**Maintenance Tasks:**
* **Cleanup**: Remove unused volumes and images
* **Defragmentation**: Filesystem optimization (mainly relevant to HDDs; SSDs rely on TRIM instead)
* **SMART Monitoring**: Drive health checks
* **Backup Verification**: Integrity testing
**Health Checks:**
* **Filesystem**: fsck, scrub operations
* **RAID**: Array status monitoring
* **SMART**: Drive error monitoring
* **Backup**: Restoration testing
===== Capacity Planning =====
**Storage Requirements:**
| Service | Typical Size | Growth Rate |
|---------|-------------|-------------|
| Nextcloud | 100GB+ | High (user files) |
| Jellyfin | 500GB+ | High (media library) |
| Gitea | 10GB+ | Medium (repositories) |
| Grafana | 5GB+ | Low (metrics) |
| Backups | 2x data size | Variable |
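As a worked example of the table, the minimum sizes can be summed and the backup allowance added on top. The function name `estimate_capacity` is hypothetical, and the figures are only the lower bounds listed above, so real requirements will be higher.

```bash
# Sum per-service data sizes (GB) and add backup space as a multiple
# of the data. Prints "<data> <backup> <total>" in GB.
estimate_capacity() {
    local factor="$1"; shift
    local data=0 size
    for size in "$@"; do
        data=$((data + size))
    done
    local backup=$((data * factor))
    echo "$data $backup $((data + backup))"
}

# Minimum figures from the table: Nextcloud 100, Jellyfin 500, Gitea 10, Grafana 5
estimate_capacity 2 100 500 10 5
# prints: 615 1230 1845
```

With these lower bounds, 615 GB of data plus 1230 GB of backup space gives a starting point of roughly 1.8 TB before any growth.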
**Scaling Strategies:**
* **Vertical**: Larger drives, more RAM
* **Horizontal**: Multiple storage servers
* **Cloud**: Hybrid cloud storage
* **Archival**: Long-term retention solutions
===== Security Considerations =====
**Encryption:**
* **At Rest**: Filesystem encryption (LUKS)
* **In Transit**: TLS encryption
* **Backups**: Encrypted archives
* **Keys**: Secure key management
**Access Control:**
* **Permissions**: Proper file permissions
* **SELinux/AppArmor**: Mandatory access control
* **Network**: Isolated storage networks
* **Auditing**: Access logging
**Data Protection:**
* **RAID**: Redundancy against drive failure (not a substitute for backups)
* **Snapshots**: Point-in-time copies
* **Backups**: Offsite copies
* **Testing**: Regular recovery tests
===== Disaster Recovery =====
**Recovery Strategies:**
* **File-level**: Individual file restoration
* **Volume-level**: Docker volume recovery
* **System-level**: Complete system restore
* **Bare-metal**: Full server recovery
**Business Continuity:**
* **RTO**: Recovery Time Objective
* **RPO**: Recovery Point Objective
* **Testing**: Regular DR exercises
* **Documentation**: Recovery procedures
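An RPO is only meaningful if something checks that backups are actually recent enough. The sketch below is a rough proxy under stated assumptions: it treats the newest file modification under a backup directory as the time of the last backup, and the function name `backup_within_rpo` is hypothetical.

```bash
# Return 0 if at least one file under <dir> was modified within the
# last <minutes>, 1 if none was, and 2 if <dir> does not exist.
backup_within_rpo() {
    local dir="$1"
    local minutes="$2"
    [ -d "$dir" ] || return 2
    # -quit stops at the first match, so large repositories stay cheap to check.
    [ -n "$(find "$dir" -type f -mmin "-$minutes" -print -quit)" ]
}

# Example:
#   backup_within_rpo /mnt/backups 60 || echo "RPO exceeded"
```

A fresh file only shows that something was written recently. It does not prove the backup is restorable, which is what the regular DR exercises are for.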
**High Availability:**
* **Replication**: Data mirroring
* **Clustering**: Distributed storage
* **Load Balancing**: Access distribution
* **Failover**: Automatic switching
===== Migration Strategies =====
**Storage Migration:**
* **Live Migration**: Zero-downtime moves
* **Offline Migration**: Scheduled maintenance
* **Incremental**: Phased data movement
* **Verification**: Data integrity checks
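The verification step can be sketched as a checksum comparison between the old and new locations. This is one possible approach, not a prescribed procedure: the function names are hypothetical, and the copy itself (for example with `rsync` or `cp`) happens before this check.

```bash
# Print a sorted checksum manifest of every file under a directory,
# with paths relative to that directory.
manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 )
}

# Return 0 if source and destination contain identical files,
# 1 if they differ, and 2 if either directory cannot be read.
verify_copy() {
    local src="$1"
    local dst="$2"
    local a b
    a="$(manifest "$src")" || return 2
    b="$(manifest "$dst")" || return 2
    [ "$a" = "$b" ]
}

# Example:
#   verify_copy /mnt/media /mnt/new-disk/media && echo "migration verified"
```

Relative paths in the manifest let the two trees live under different roots. Hashing every file reads the full data set, so on large media libraries the check can take as long as the copy did.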
**Technology Upgrades:**
* **Filesystem**: ext4 to btrfs/ZFS
* **RAID**: Hardware to software RAID
* **Storage**: Local to network storage
* **Cloud**: Hybrid cloud solutions
This storage architecture provides reliable, performant, and scalable data management for your homelab.
**Next:** Learn about [[architecture:backup|Backup Strategy]] or explore [[services:start|Service Management]].