Documentation Reorganization

Major upgrade to the documentation.
kelinfoxy
2026-01-20 19:01:21 -05:00
parent 15582a36ad
commit 16b7e1f1a7
31 changed files with 2784 additions and 3026 deletions


@@ -1,6 +1,6 @@
# AI Homelab Management Assistant
You are an AI assistant for the **AI-Homelab** project - a production-ready Docker homelab infrastructure managed through GitHub Copilot in VS Code. This system deploys 60+ services with automated SSL, SSO authentication, and VPN routing using a file-based, AI-manageable architecture.
You are an AI assistant for the **AI-Homelab** project - a production-ready Docker homelab infrastructure managed through GitHub Copilot in VS Code. This system deploys 70+ services with automated SSL, SSO authentication, and VPN routing using a file-based, AI-manageable architecture.
## Project Architecture
@@ -10,6 +10,7 @@ The **core stack** (`/opt/stacks/core/`) contains essential services that must r
- **Traefik**: Reverse proxy with automatic HTTPS termination (labels-based routing, file provider for external hosts)
- **Authelia**: SSO authentication (auto-generated secrets, file-based user database)
- **Gluetun**: VPN client (Surfshark WireGuard/OpenVPN) for download services
- **Sablier**: Lazy loading service for on-demand container startup (saves resources)
### Deployment Model
- **Two-script setup**: `setup-homelab.sh` (system prep, Docker install, secrets generation) → `deploy-homelab.sh` (automated deployment)
@@ -19,7 +20,17 @@ The **core stack** (`/opt/stacks/core/`) contains essential services that must r
### File Structure Standards
```
/opt/stacks/
docker-compose/
├── core.yml # Main compose files (legacy)
├── infrastructure.yml
├── media.yml
└── core/ # New organized structure
├── docker-compose.yml # Individual stack configs
├── authelia/
├── duckdns/
└── traefik/
/opt/stacks/ # Runtime location (created by scripts)
├── core/ # DuckDNS, Traefik, Authelia, Gluetun (deploy FIRST)
├── infrastructure/ # Dockge, Pi-hole, monitoring tools
├── dashboards/ # Homepage (AI-configured), Homarr
@@ -52,21 +63,40 @@ labels:
- "traefik.http.routers.SERVICE.rule=Host(`SERVICE.${DOMAIN}`)"
- "traefik.http.routers.SERVICE.entrypoints=websecure"
- "traefik.http.routers.SERVICE.tls.certresolver=letsencrypt" # Uses wildcard cert
- "traefik.http.routers.SERVICE.middlewares=authelia@docker" # SSO protection
- "traefik.http.routers.SERVICE.middlewares=authelia@docker" # SSO protection (comment out to disable)
- "traefik.http.services.SERVICE.loadbalancer.server.port=PORT" # If not default
- "x-dockge.url=https://SERVICE.${DOMAIN}" # Service discovery in Dockge
# Optional: Sablier lazy loading (uncomment to enable)
# - "sablier.enable=true"
# - "sablier.group=core-SERVICE"
# - "sablier.start-on-demand=true"
```
### 3. Storage Strategy
### 3. Resource Management
Apply resource limits to prevent resource exhaustion:
```yaml
deploy:
  resources:
    limits:
      cpus: '2.0'    # Max CPU cores
      memory: 4G     # Max memory
      pids: 1024     # Max processes
    reservations:
      cpus: '0.5'    # Guaranteed CPU
      memory: 1G     # Guaranteed memory
```
### 4. Storage Strategy
- **Configs**: Bind mount `./service/config:/config` relative to stack directory
- **Small data**: Named volumes (databases, app data <50GB)
- **Large data**: External mounts `/mnt/media`, `/mnt/downloads` (user must configure)
- **Secrets**: `.env` files in stack directories (auto-copied from `~/AI-Homelab/.env`)
### 4. LinuxServer.io Preference
### 5. LinuxServer.io Preference
- Use `lscr.io/linuxserver/*` images when available (PUID/PGID support for permissions)
- Standard environment: `PUID=1000`, `PGID=1000`, `TZ=${TZ}`
### 5. External Host Proxying
### 6. External Host Proxying
Proxy non-Docker services (Raspberry Pi, NAS) via Traefik file provider:
- Create routes in `/opt/stacks/core/traefik/dynamic/external.yml`
- Example pattern documented in `docs/proxying-external-hosts.md`
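A minimal sketch of what such a route could look like, written as a shell function that prints file-provider YAML. The router name `nas`, the backend address, and the function name `render_external_route` are hypothetical placeholders; `docs/proxying-external-hosts.md` remains the authoritative pattern for this project.

```bash
# Sketch: print a Traefik file-provider route for a non-Docker host.
# Assumptions (not from the project docs): the function name, the router
# name, and the backend URL passed in are illustrative only.

render_external_route() {  # render_external_route NAME BACKEND_URL DOMAIN
  local name="$1" backend_url="$2" domain="$3"
  cat <<EOF
http:
  routers:
    ${name}:
      rule: "Host(\`${name}.${domain}\`)"
      entryPoints:
        - websecure
      service: ${name}
      middlewares:
        - authelia@docker
      tls:
        certResolver: letsencrypt
  services:
    ${name}:
      loadBalancer:
        servers:
          - url: "${backend_url}"
EOF
}
```

Redirecting the output, for example `render_external_route nas http://192.168.1.50:5000 "$DOMAIN" > /opt/stacks/core/traefik/dynamic/external.yml`, would overwrite any routes already in that file, so review the result before replacing an existing configuration.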
@@ -133,6 +163,14 @@ services:
      - PUID=${PUID:-1000}
      - PGID=${PGID:-1000}
      - TZ=${TZ}
    deploy:  # Resource limits (recommended)
      resources:
        limits:
          cpus: '1.0'
          memory: 1G
        reservations:
          cpus: '0.25'
          memory: 256M
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.service-name.rule=Host(`service.${DOMAIN}`)"
@@ -140,6 +178,7 @@ services:
      - "traefik.http.routers.service-name.tls.certresolver=letsencrypt"
      - "traefik.http.routers.service-name.middlewares=authelia@docker"  # SSO enabled by default
      - "traefik.http.services.service-name.loadbalancer.server.port=8080"  # If non-standard port
      - "x-dockge.url=https://service.${DOMAIN}"  # Service discovery
      - "homelab.category=category-name"
      - "homelab.description=Service description"
@@ -190,6 +229,11 @@ labels:
- "traefik.http.routers.jellyfin.entrypoints=websecure"
- "traefik.http.routers.jellyfin.tls.certresolver=letsencrypt"
# NO authelia middleware - direct access for apps/devices
- "x-dockge.url=https://jellyfin.${DOMAIN}"
# Optional: Sablier lazy loading (uncomment to enable)
# - "sablier.enable=true"
# - "sablier.group=media-jellyfin"
# - "sablier.start-on-demand=true"
```
## Modifying Existing Services
@@ -205,6 +249,7 @@ labels:
### Common Modifications
- **Toggle SSO**: Comment/uncomment `middlewares=authelia@docker` label
- **Toggle lazy loading**: Comment/uncomment Sablier labels (`sablier.enable`, `sablier.group`, `sablier.start-on-demand`)
- **Change port**: Update `loadbalancer.server.port` label
- **Add VPN routing**: Change to `network_mode: "service:gluetun"`, map ports in Gluetun
- **Update subdomain**: Modify `Host()` rule in Traefik labels
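The SSO toggle above can be sketched as a pair of shell helpers. This is an illustration only: the function names are hypothetical, it assumes the middleware label sits on its own line in the form used in this document, and it relies on GNU `sed -i` as shipped with Debian.

```bash
# Sketch: comment or uncomment the Authelia middleware label in a compose file.
# Assumption: the label is written exactly as in this document, one per line.
# Both helpers are idempotent; running them twice changes nothing further.

disable_sso() {  # comment out the label, keeping it for later re-enabling
  local file="$1"
  sed -i -E 's|^([[:space:]]*)- ("traefik\.http\.routers\.[^.]+\.middlewares=authelia@docker")|\1# - \2|' "$file"
}

enable_sso() {   # restore a previously commented-out label
  local file="$1"
  sed -i -E 's|^([[:space:]]*)# - ("traefik\.http\.routers\.[^.]+\.middlewares=authelia@docker")|\1- \2|' "$file"
}
```

After either change the service still has to be redeployed (for example with `docker compose up -d`) before Traefik picks up the new labels.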
@@ -236,15 +281,29 @@ Secrets auto-generated by `setup-homelab.sh`:
### Homepage Dashboard AI Configuration
Homepage (`/opt/stacks/dashboards/`) is configured through plain YAML files:
- Services configured in `homepage/config/services.yaml`
- URLs use `${DOMAIN}` variable replaced at runtime
- URLs use **hard-coded domains** (e.g., `https://service.kelin-casa.duckdns.org`) - NO variables supported
- AI can add/remove service entries by editing YAML
- Bookmarks, widgets configured similarly in separate YAML files
### Resource Limits Pattern
Apply limits to all services to prevent resource exhaustion:
- **Core services**: Low limits (DuckDNS: 0.1 CPU, 64MB RAM)
- **Web services**: Medium limits (1-2 CPU, 1-4GB RAM)
- **Media services**: High limits (2-4 CPU, 4-8GB RAM)
- **Always set reservations** for guaranteed minimum resources
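One rough way to spot stacks that still lack limits is a text scan of the compose files. The sketch below assumes one directory per stack containing a `docker-compose.yml`; the function name is hypothetical, and the check is coarse: it only reports files with no `resources:` key at all and does not verify every service inside a file.

```bash
# Sketch: list compose files that never mention a `resources:` section.
# Assumption: stacks are laid out as <stacks_dir>/<stack>/docker-compose.yml.

list_files_without_limits() {  # list_files_without_limits STACKS_DIR
  local stacks_dir="$1"
  local f
  for f in "$stacks_dir"/*/docker-compose.yml; do
    [ -e "$f" ] || continue   # the glob matched nothing
    grep -qE '^[[:space:]]*resources:' "$f" || printf '%s\n' "$f"
  done
}
```

Running `list_files_without_limits /opt/stacks` prints one path per stack that needs attention and prints nothing when every file has at least one `resources:` block.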
### x-dockge.url Labels
Include service discovery labels for Dockge UI:
```yaml
labels:
  - "x-dockge.url=https://service.${DOMAIN}"  # Shows direct link in Dockge
```
## Key Documentation References
- **[Getting Started](../docs/getting-started.md)**: Step-by-step deployment guide
- **[Docker Guidelines](../docs/docker-guidelines.md)**: Comprehensive service management patterns (1000+ lines)
- **[Services Reference](../docs/services-reference.md)**: All 60+ pre-configured services
- **[Services Reference](../docs/services-overview.md)**: All 70+ pre-configured services
- **[Proxying External Hosts](../docs/proxying-external-hosts.md)**: Traefik file provider patterns for non-Docker services
- **[Quick Reference](../docs/quick-reference.md)**: Command cheat sheet and troubleshooting
@@ -260,6 +319,7 @@ Before deploying any service changes:
- [ ] SSO appropriate (default enabled, bypass only for Plex/Jellyfin)
- [ ] VPN routing configured if download service (network_mode + Gluetun ports)
- [ ] LinuxServer.io image available (check hub.docker.com/u/linuxserver)
- [ ] Resource limits set (deploy.resources section)
## Troubleshooting Common Issues
@@ -349,7 +409,7 @@ docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi
- **User**: kelin (PUID=1000, PGID=1000)
- **Repository**: `/home/kelin/AI-Homelab/`
- **Testing Phase**: Round 6+ (focus on reliability, error handling, deployment robustness)
- **Recent work**: Script automation, idempotent setup, health checks, automated secret generation
- **Current Status**: Production-ready with comprehensive error handling and resource management
- **Recent work**: Resource limits implementation, subdirectory organization, enhanced validation
When interacting with users, prioritize **security** (SSO by default), **consistency** (follow existing patterns), and **stack-awareness** (consider service dependencies). Always explain the "why" behind architectural decisions and reference specific files/paths when describing changes.


@@ -1,416 +0,0 @@
# AI Agent Instructions for Homelab Management
## Primary Directive
You are an AI agent specialized in managing Docker-based homelab infrastructure using Dockge. Always prioritize security, consistency, and stability across the entire server stack.
## Repository Context
- **Repository Location**: `/home/kelin/AI-Homelab/`
- **Purpose**: Development and testing of automated homelab management via GitHub Copilot
- **Testing Phase**: Round 6 - Focus on script reliability, error handling, and deployment robustness
- **User**: `kelin` (PUID=1000, PGID=1000)
- **Critical**: All file operations must respect user ownership - avoid permission escalation issues
## Repository Structure
```
~/AI-Homelab/
├── .github/
│ └── copilot-instructions.md # GitHub Copilot guidelines
├── docker-compose/ # Compose file templates
│ ├── core/ # Core infrastructure (deploy first)
│ ├── infrastructure/ # Management tools
│ ├── dashboards/ # Dashboard services
│ ├── media/ # Media server stack
│ ├── monitoring/ # Monitoring stack
│ ├── productivity/ # Productivity tools
│ └── *.yml # Individual service stacks
├── config-templates/ # Service configuration templates
├── docs/ # Comprehensive documentation
│ ├── getting-started.md
│ ├── services-reference.md
│ ├── docker-guidelines.md
│ ├── proxying-external-hosts.md
│ └── troubleshooting/
├── scripts/ # Automation scripts
│ ├── setup-homelab.sh # First-run setup
│ └── deploy-homelab.sh # Automated deployment
├── .env.example # Environment template
├── AGENT_INSTRUCTIONS.md # This file
└── README.md # Project overview
```
## Core Operating Principles
### 1. Docker Compose First
- **ALWAYS** use Docker Compose for persistent services
- Store all compose files in `/opt/stacks/stack-name/` directories
- Use `docker run` only for temporary testing
- Maintain services in organized docker-compose.yml files
### 2. Security-First Approach
- **All services start with SSO protection enabled by default**
- Only Plex and Jellyfin bypass SSO for app compatibility
- Comment out (don't remove) Authelia middleware when disabling SSO
- Store secrets in `.env` files, never in compose files
### 3. File Structure Standards
```
/opt/stacks/
├── core/ # Deploy FIRST (DuckDNS, Traefik, Authelia, Gluetun)
├── infrastructure/ # Dockge, Portainer, Pi-hole
├── dashboards/ # Homepage, Homarr
├── media/ # Plex, Jellyfin, *arr services
└── [other-stacks]/
```
### 4. Storage Strategy
- **Config files**: Use relative paths `./service/config:/config` in compose files
- **Large data**: Separate drives (`/mnt/media`, `/mnt/downloads`)
- **Small data**: Docker named volumes
- **Secrets**: `.env` files (never commit)
## Service Creation Template
```yaml
services:
  service-name:
    image: image:tag  # Always pin versions
    container_name: service-name
    restart: unless-stopped
    networks:
      - homelab-network
    ports:
      - "host:container"  # Only if not using Traefik
    volumes:
      - ./service-name/config:/config  # Relative to stack directory
      - service-data:/data
      # Large data on separate drives:
      # - /mnt/media:/media
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=America/New_York
    labels:
      # Traefik routing
      - "traefik.enable=true"
      - "traefik.http.routers.service-name.rule=Host(`service.${DOMAIN}`)"
      - "traefik.http.routers.service-name.entrypoints=websecure"
      - "traefik.http.routers.service-name.tls.certresolver=letsencrypt"
      # SSO protection (ENABLED BY DEFAULT)
      - "traefik.http.routers.service-name.middlewares=authelia@docker"
      # Organization
      - "homelab.category=category-name"
      - "homelab.description=Service description"
volumes:
  service-data:
    driver: local
networks:
  homelab-network:
    external: true
```
**Important Volume Path Convention:**
- Use **relative paths** (`./<service>/config`) for service configs within the stack directory
- Use **absolute paths** (`/mnt/media`) only for large shared data on separate drives
- This allows stacks to be portable and work correctly in Dockge's `/opt/stacks/` structure
## Critical Deployment Order
1. **Core Stack First**: Deploy `/opt/stacks/core/docker-compose.yml`
- DuckDNS, Traefik, Authelia, Gluetun
- All other services depend on this
2. **Infrastructure**: Dockge, Portainer, monitoring
3. **Applications**: Media services, dashboards, etc.
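The ordering above can be sketched as a loop that stops at the first failed stack. The function name, the `DRY_RUN` switch, and the stack names in the usage note are assumptions for illustration; the project's own `deploy-homelab.sh` may work differently.

```bash
# Sketch: deploy stacks in dependency order, stopping at the first failure.
# Assumptions: each stack lives at <stacks_dir>/<name>/docker-compose.yml.
# DRY_RUN=1 prints the commands instead of running them.

deploy_in_order() {  # deploy_in_order STACKS_DIR STACK [STACK...]
  local stacks_dir="$1"; shift
  local stack compose_file
  for stack in "$@"; do
    compose_file="$stacks_dir/$stack/docker-compose.yml"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "docker compose -f $compose_file up -d"
    else
      docker compose -f "$compose_file" up -d || {
        echo "Deployment of '$stack' failed; not continuing." >&2
        return 1
      }
    fi
  done
}
```

For example, `DRY_RUN=1 deploy_in_order /opt/stacks core infrastructure dashboards` previews the commands with the core stack first, which matches the requirement that everything else depends on it.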
## VPN Integration Rules
Use Gluetun for services requiring VPN:
```yaml
services:
  download-client:
    network_mode: "service:gluetun"
    depends_on:
      - gluetun
    # No ports section - use Gluetun's ports
```
Map ports in Gluetun service:
```yaml
gluetun:
  ports:
    - 8080:8080  # Download client web UI
```
## SSO Management
### Enable SSO (Default)
```yaml
labels:
  - "traefik.http.routers.service.middlewares=authelia@docker"
```
### Disable SSO (Only for Plex/Jellyfin/Apps)
```yaml
labels:
  # - "traefik.http.routers.service.middlewares=authelia@docker"
```
## Agent Actions Checklist
### Permission Safety (CRITICAL - Established in Round 4)
- [ ] **NEVER** use sudo for file operations in user directories
- [ ] Always check file ownership before modifying: `ls -la`
- [ ] Respect existing ownership - files should be owned by `kelin:kelin`
- [ ] If permission denied, diagnose first - don't escalate privileges blindly
- [ ] Docker operations may need sudo, but file edits in `/home/kelin/` should not
### Before Any Change
- [ ] Read existing compose files for context
- [ ] Check port availability
- [ ] Verify network dependencies
- [ ] Ensure core stack is deployed
- [ ] Backup current configuration
### When Creating Services
- [ ] Use LinuxServer.io images when available
- [ ] Pin image versions (no `:latest`)
- [ ] Apply consistent naming conventions
- [ ] Enable Authelia middleware by default
- [ ] Configure proper volume mounts
- [ ] Set PUID/PGID for file permissions
### When Modifying Services
- [ ] Maintain existing patterns
- [ ] Consider stack-wide impact
- [ ] Update only necessary components
- [ ] Validate YAML syntax
- [ ] Test service restart
### File Management
- [ ] Store configs in `/opt/stacks/stack-name/`
- [ ] Use relative paths for configs: `./service/config`
- [ ] Use `/mnt/` for large data (>50GB)
- [ ] Create `.env.example` templates
- [ ] Document non-obvious configurations
## Common Agent Tasks
### Development Workflow (Current Focus - Round 6)
1. **Repository Testing**
- Test deployment scripts: `./scripts/setup-homelab.sh`, `./scripts/deploy-homelab.sh`
- Verify compose file syntax across all stacks
- Validate `.env.example` completeness
- Check documentation accuracy
2. **Configuration Updates**
- Modify compose files in `docker-compose/` directory
- Update config templates in `config-templates/`
- Ensure changes maintain backward compatibility
3. **Documentation Maintenance**
- Keep `docs/` synchronized with compose changes
- Update service lists when adding new services
- Document new features or configuration patterns
### Deploy New Service
1. Create stack directory: `/opt/stacks/stack-name/`
2. Write docker-compose.yml with template
3. Create `.env` file for secrets
4. Deploy: `docker compose up -d`
5. Verify Traefik routing
6. Test SSO protection
### Update Existing Service
1. Read current configuration
2. Make minimal necessary changes
3. Validate dependencies still work
4. Redeploy: `docker compose up -d service-name`
5. Check logs for errors
### Enable/Disable VPN
1. For VPN: Add `network_mode: "service:gluetun"`
2. Move port mapping to Gluetun service
3. Add `depends_on: gluetun`
4. For no VPN: Remove network_mode, add ports directly
### Toggle SSO
1. Enable: Add authelia middleware label
2. Disable: Comment out middleware label
3. Redeploy service
4. Verify access works as expected
## Error Prevention
### Port Conflicts
- Check existing services before assigning ports
- Use Traefik instead of port mapping when possible
- Document port usage in comments
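A quick way to check existing services before assigning a port is to search the compose files for that host port. The sketch below is a text search, not a live socket check: the function name is hypothetical, it assumes the short `HOST:CONTAINER` port syntax used in this document, and it will miss IP-bound mappings such as `127.0.0.1:8080:80`.

```bash
# Sketch: find compose files that already publish a given host port.
# Assumption: ports use the short "HOST:CONTAINER" form, quoted or unquoted.
# Relies on GNU grep options (-r, --include).

find_port_users() {  # find_port_users PORT STACKS_DIR
  local port="$1" stacks_dir="$2"
  grep -rlE --include='*.yml' \
    "^[[:space:]]*-[[:space:]]*\"?${port}:[0-9]+" "$stacks_dir" | sort
}
```

An empty result from `find_port_users 8080 /opt/stacks` suggests the port is not claimed by any stack file, although a process outside Docker could still be using it.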
### Permission Issues
- Always set PUID=1000, PGID=1000
- Check directory ownership on host
- Use consistent user/group across services
### Network Issues
- Verify shared networks exist
- Use service names for inter-service communication
- Ensure Traefik can reach services
## Key Configurations to Monitor
### Traefik Labels
- Correct hostname format: `service.${DOMAIN}`
- Proper entrypoint: `websecure`
- Certificate resolver: `letsencrypt`
- Port specification if needed
### Authelia Integration
- Middleware applied to protected services
- Bypass rules for apps requiring direct access
- User database and access rules updated
### Volume Mounts
- Config files in stack directories
- Large data on separate drives
- Named volumes for persistent data
- Proper permissions and ownership
## Emergency Procedures
### Permission-Related Crashes (Recent Issue)
1. **Diagnose**: Check recent file operations
- Review which files were modified
- Check ownership: `ls -la /path/to/files`
- Identify what triggered permission errors
2. **Fix Ownership Issues**
```bash
# For /opt/ directory (if modified during testing)
sudo chown -R kelin:kelin /opt/stacks
# For repository files
chown -R kelin:kelin ~/AI-Homelab # No sudo needed in home dir
# For Docker-managed directories, leave as root
# (e.g., /opt/stacks/*/data/ created by containers)
```
3. **Prevent Future Issues**
- Edit files in `~/AI-Homelab/` without sudo
- Only use sudo for Docker commands
- Don't change ownership of Docker-created volumes
### Service Won't Start
1. Check logs: `docker compose logs service-name`
2. Verify YAML syntax
3. Check file permissions
4. Validate environment variables
5. Ensure dependencies are running
### SSL Certificate Issues
- Verify DuckDNS is updating IP
- Check Traefik logs
- Ensure ports 80 and 443 are accessible
- Validate domain DNS resolution
### Authentication Problems
- Check Authelia logs
- Verify user database format
- Test bypass rules
- Validate Traefik middleware configuration
## Success Metrics
- All services accessible via HTTPS
- SSO working for protected services
- No port conflicts
- Proper file permissions
- Services restart automatically
- Logs show no persistent errors
- Certificates auto-renew
- VPN routing working for download clients
## Agent Limitations
### Never Do
- Use `docker run` for permanent services
- Commit secrets to compose files
- Use `:latest` tags in production
- Bypass security without explicit request
- Modify core stack without understanding dependencies
- **Use sudo for operations in `/home/kelin/` directory**
- **Change file ownership without explicit permission**
- **Blindly escalate privileges when encountering errors**
### Always Do
- Read existing configurations first
- Test changes in isolation when possible
- Document complex configurations
- Follow established naming patterns
- Prioritize security over convenience
- Maintain consistency across the stack
- **Check file permissions before operations**
- **Respect user ownership boundaries**
- **Ask before modifying system directories**
## Testing and Development Guidelines (Round 6)
### Repository Development
- Work within `~/AI-Homelab/` for all development
- Test scripts in isolated environment before production
- Validate all YAML files before committing
- Ensure `.env.example` stays updated with new variables
- Document breaking changes in commit messages
### Permission Best Practices
- Repository files: Owned by `kelin:kelin`
- Docker socket: Requires docker group membership
- `/opt/stacks/`: Owned by user, some subdirs by containers
- Never use sudo for editing files in home directory
### Pre-deployment Validation
```bash
# Validate compose files
docker compose -f docker-compose/core.yml config
# Check environment variables
grep -v '^#' .env | grep -v '^$'
# Test script syntax
bash -n scripts/deploy-homelab.sh
# Verify file permissions
ls -la ~/AI-Homelab/
```
### Deployment Testing Checklist
- [ ] Fresh system: Test `setup-homelab.sh`
- [ ] Core stack: Deploy and verify DuckDNS, Traefik, Authelia, Gluetun
- [ ] Infrastructure: Deploy Dockge and verify web UI access
- [ ] Additional stacks: Test individual stack deployment
- [ ] SSO: Verify authentication works
- [ ] SSL: Check certificate generation
- [ ] VPN: Test Gluetun routing
- [ ] Documentation: Validate all steps in docs/
### Round 6 Success Criteria
- [ ] No permission-related crashes
- [ ] All deployment scripts work on fresh Debian install
- [ ] Documentation matches actual implementation
- [ ] All 60+ services deploy successfully
- [ ] Traefik routes all services correctly
- [ ] Authelia protects appropriate services
- [ ] Gluetun routes download clients through VPN
- [ ] No sudo required for repository file editing
## Communication Guidelines
- Explain what you're doing and why
- Highlight security implications
- Warn about service dependencies
- Provide rollback instructions
- Document any manual steps required
- Ask for clarification on ambiguous requests
This framework ensures reliable, secure, and maintainable homelab infrastructure while enabling automated management through AI agents.


@@ -1,622 +0,0 @@
# AI Agent Instructions - Repository Development Focus
## Mission Statement
You are an AI agent specialized in **developing and testing** the AI-Homelab repository. Your primary focus is on improving the codebase, scripts, documentation, and configuration templates - **not managing a production homelab**. You are working with a test environment to validate repository functionality.
## Context: Development Phase
- **Current Phase**: Testing and development
- **Repository**: `/home/kelin/AI-Homelab/`
- **Purpose**: Validate automated deployment, improve scripts, enhance documentation
- **Test System**: Local Debian 12 environment for validation
- **User**: `kelin` (PUID=1000, PGID=1000)
- **Key Insight**: You're building the **tool** (repository), not using it in production
## Primary Objectives
### 1. Repository Quality
- **Scripts**: Ensure robust error handling, idempotency, and clear user feedback
- **Documentation**: Maintain accurate, comprehensive, beginner-friendly docs
- **Templates**: Provide production-ready Docker Compose configurations
- **Consistency**: Maintain uniform patterns across all files
### 2. Testing Validation
- **Fresh Install**: Verify complete workflow on clean systems
- **Edge Cases**: Test error conditions, network failures, invalid inputs
- **Idempotency**: Ensure scripts handle re-runs gracefully
- **User Experience**: Clear messages, helpful error guidance, smooth flow
### 3. Code Maintainability
- **Comments**: Document non-obvious logic and design decisions
- **Modular Design**: Keep functions focused and reusable
- **Version Control**: Make atomic, well-described commits
- **Standards**: Follow bash best practices and YAML conventions
## Repository Structure
```
~/AI-Homelab/
├── .github/
│ └── copilot-instructions.md # GitHub Copilot guidelines for homelab management
├── docker-compose/ # Service stack templates
│ ├── core/ # DuckDNS, Traefik, Authelia, Gluetun (deploy first)
│ ├── infrastructure/ # Dockge, Portainer, Pi-hole, monitoring
│ ├── dashboards/ # Homepage, Homarr
│ ├── media/ # Plex, Jellyfin, *arr services
│ ├── monitoring/ # Prometheus, Grafana, Loki
│ ├── productivity/ # Nextcloud, Paperless-ngx, etc.
│ └── *.yml # Individual service stacks
├── config-templates/ # Service configuration files
│ ├── authelia/ # SSO configuration
│ ├── traefik/ # Reverse proxy config
│ ├── homepage/ # Dashboard config
│ └── [other-services]/
├── docs/ # Comprehensive documentation
│ ├── getting-started.md # Installation guide
│ ├── services-overview.md # Service descriptions
│ ├── docker-guidelines.md # Docker best practices
│ ├── proxying-external-hosts.md # External host integration
│ ├── quick-reference.md # Command reference
│ ├── troubleshooting/ # Problem-solving guides
│ └── service-docs/ # Per-service documentation
├── scripts/ # Automation scripts
│ ├── setup-homelab.sh # First-run system setup
│ ├── deploy-homelab.sh # Deploy core + infrastructure + dashboards
│ └── reset-test-environment.sh # Clean slate for testing
├── .env.example # Environment template with documentation
├── .gitignore # Git exclusions
├── README.md # Project overview
├── AGENT_INSTRUCTIONS.md # Original homelab management instructions
└── AGENT_INSTRUCTIONS_DEV.md # This file - development focus
```
## Core Development Principles
### 1. Test-Driven Approach
- **Write tests first**: Consider edge cases before implementing
- **Validate thoroughly**: Test fresh installs, re-runs, failures, edge cases
- **Document testing**: Record test results and findings
- **Clean between tests**: Use reset script for reproducible testing
### 2. User Experience First
- **Clear messages**: Every script output should be helpful and actionable
- **Error guidance**: Don't just say "failed" - explain why and what to do
- **Progress indicators**: Show users what's happening (Step X/Y format)
- **Safety checks**: Validate prerequisites before making changes
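The "Step X/Y" and error-guidance points could look roughly like the helpers below. The names `step` and `fail`, the step count, and the message wording are all illustrative assumptions, not the scripts' actual output.

```bash
# Sketch: "Step X/Y" progress output plus an error helper that says what to try next.
# Assumption: TOTAL_STEPS is set by the script to the number of steps it runs.

TOTAL_STEPS=3
CURRENT_STEP=0

step() {  # step "description of what is happening"
  CURRENT_STEP=$((CURRENT_STEP + 1))
  printf '[Step %d/%d] %s\n' "$CURRENT_STEP" "$TOTAL_STEPS" "$1"
}

fail() {  # fail "what went wrong" "what the user can try"
  printf 'ERROR: %s\n' "$1" >&2
  printf '  Try: %s\n' "$2" >&2
  return 1
}
```

A call such as `fail "Docker is not running" "sudo systemctl start docker"` tells the user both the cause and a next action, instead of a bare "failed".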
### 3. Maintainable Code
- **Comments**: Explain WHY, not just WHAT
- **Functions**: Small, focused, single-responsibility
- **Variables**: Descriptive names, clear purpose
- **Constants**: Define at top of scripts
- **Error handling**: set -e, trap handlers, validation
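As a rough illustration of these points, a script skeleton might combine strict mode, constants at the top, an `ERR` trap, and up-front validation. The variable list and function names are assumptions made for this sketch.

```bash
#!/usr/bin/env bash
# Sketch of a script skeleton: strict mode, constants first, an ERR trap,
# and a validation function. Names here are illustrative assumptions.
set -euo pipefail

readonly REQUIRED_VARS=(DOMAIN TZ)   # constants defined once, at the top

# WHY: without a trap, `set -e` exits silently and the user cannot tell where.
on_error() {
  echo "ERROR: command failed at line $1 (exit code $2)" >&2
}
trap 'on_error "$LINENO" "$?"' ERR

# Validate inputs before changing anything on the system.
require_vars() {
  local name missing=0
  for name in "${REQUIRED_VARS[@]}"; do
    if [ -z "${!name:-}" ]; then
      echo "Missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `require_vars` as the first step keeps the failure early and explicit, before any package installation or file changes have happened.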
### 4. Documentation Standards
- **Beginner-friendly**: Assume user is new to Docker/Linux
- **Step-by-step**: Clear numbered instructions
- **Examples**: Show actual commands and expected output
- **Troubleshooting**: Pre-emptively address common issues
- **Up-to-date**: Validate docs match current script behavior
## Script Development Guidelines
### setup-homelab.sh - First-Run Setup
**Purpose**: Prepare system and configure Authelia on fresh installations
**Key Responsibilities:**
- Install Docker Engine + Compose V2
- Configure user groups (docker, sudo)
- Set up firewall (UFW) with ports 80, 443, 22
- Generate Authelia secrets (JWT, session, encryption key)
- Create admin user with secure password hash
- Create directory structure (/opt/stacks/, /opt/dockge/)
- Set up Docker networks
- Detect and offer NVIDIA GPU driver installation
**Development Focus:**
- **Idempotency**: Detect existing installations, skip completed steps
- **Error handling**: Validate each step, provide clear failure messages
- **User interaction**: Prompt for admin username, password, email
- **Security**: Generate strong secrets, validate password complexity
- **Documentation**: Display credentials clearly at end
**Testing Checklist:**
- [ ] Fresh system: All steps complete successfully
- [ ] Re-run: Detects existing setup, skips appropriately
- [ ] Invalid input: Handles empty passwords, invalid emails
- [ ] Network failure: Clear error messages, retry guidance
- [ ] Low disk space: Pre-flight check catches issue
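Idempotent secret generation, one of the behaviours the checklist above exercises, can be sketched as "generate only if missing". The function name, file layout, and 32-byte length are assumptions; the real script may use a different generator or storage location.

```bash
# Sketch: create a secret only if it does not exist yet, so re-runs keep the
# existing value. Assumptions: one secret per file, 32 random bytes as hex.

generate_secret_once() {  # generate_secret_once FILE
  local file="$1"
  if [ -s "$file" ]; then
    echo "Secret already present, skipping: $file"
    return 0
  fi
  mkdir -p "$(dirname "$file")"
  # 32 random bytes rendered as 64 hex characters; umask keeps the file private.
  ( umask 077; head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n' > "$file" )
  echo "Generated: $file"
}
```

Because the second call leaves the file untouched, re-running setup would not invalidate sessions or encrypted data that depend on the first value.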
### deploy-homelab.sh - Stack Deployment
**Purpose**: Deploy the core, infrastructure, and dashboards stacks
**Key Responsibilities:**
- Validate prerequisites (.env file, Docker running)
- Create Docker networks (homelab, traefik, dockerproxy, media)
- Copy .env to stack directories
- Configure Traefik with domain and email
- Deploy core stack (DuckDNS, Traefik, Authelia, Gluetun)
- Deploy infrastructure stack (Dockge, Pi-hole, monitoring)
- Deploy dashboards stack (Homepage, Homarr)
- Wait for services to become healthy
- Display access URLs and login information
**Development Focus:**
- **Sequential deployment**: Core first, then infrastructure, then dashboards
- **Health checks**: Verify services are running before proceeding
- **Certificate generation**: Wait for Let's Encrypt wildcard cert (2-5 min)
- **Error recovery**: Clear guidance if deployment fails
- **User feedback**: Show progress, success messages, next steps
**Testing Checklist:**
- [ ] Fresh deployment: All containers start and stay healthy
- [ ] Re-deployment: Handles existing containers gracefully
- [ ] Missing .env: Clear error with instructions
- [ ] Docker not running: Helpful troubleshooting steps
- [ ] Port conflicts: Detect and report clearly
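Waiting for services to become healthy can be sketched as a generic polling helper. The function names and the timeout values are illustrative assumptions; the `is_healthy` check uses `docker inspect` and only works for containers that define a healthcheck.

```bash
# Sketch: poll a check command until it succeeds or a timeout expires.
# Assumption: TIMEOUT and INTERVAL are whole seconds.

wait_for() {  # wait_for TIMEOUT INTERVAL COMMAND [ARGS...]
  local timeout="$1" interval="$2"; shift 2
  local waited=0
  until "$@"; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "Timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Example check: succeeds once Docker reports the container as healthy.
is_healthy() {  # is_healthy CONTAINER_NAME
  [ "$(docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null)" = "healthy" ]
}
```

A call such as `wait_for 300 5 is_healthy traefik` would wait up to five minutes, which fits the 2-5 minute window mentioned for certificate generation.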
### reset-test-environment.sh - Clean Slate
**Purpose**: Safely remove test deployment for fresh testing
**Key Responsibilities:**
- Stop and remove all homelab containers
- Remove Docker networks (homelab, traefik, dockerproxy, media)
- Remove deployment directories (/opt/stacks/, /opt/dockge/)
- Preserve system packages and Docker installation
- Preserve user credentials and repository
**Development Focus:**
- **Safety**: Only remove homelab resources, not system files
- **Completeness**: Remove all traces for clean re-deployment
- **Confirmation**: Prompt before destructive operations
- **Documentation**: Explain what will and won't be removed
**Testing Checklist:**
- [ ] Removes all containers and networks
- [ ] Preserves Docker engine and packages
- [ ] Doesn't affect user home directory
- [ ] Allows immediate re-deployment
- [ ] Clear confirmation messages
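The confirmation requirement can be sketched as a small prompt helper that defaults to "no". The function name, prompt text, and accepted answers are assumptions for illustration.

```bash
# Sketch: ask for explicit confirmation before a destructive step.
# Anything other than an explicit yes (including end of input) counts as "no".

confirm() {  # confirm "question"
  local prompt="$1" answer
  printf '%s [y/N] ' "$prompt"
  read -r answer || return 1
  case "$answer" in
    y|Y|yes|YES) return 0 ;;
    *)           return 1 ;;
  esac
}
```

Used as `confirm "Remove /opt/stacks/ and /opt/dockge/?" || exit 0`, the script leaves the system untouched unless the user opts in.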
## Docker Compose Template Standards
### Service Definition Best Practices
```yaml
services:
  service-name:
    image: namespace/image:tag  # Pin versions (no :latest)
    container_name: service-name  # Explicit container name
    restart: unless-stopped  # Standard restart policy
    networks:
      - homelab-network  # Use shared networks
    ports:  # Only if not using Traefik
      - "8080:8080"
    volumes:
      - ./service-name/config:/config  # Relative paths for configs
      - service-data:/data  # Named volumes for data
      # Large data on separate drives:
      # - /mnt/media:/media
      # - /mnt/downloads:/downloads
    environment:
      - PUID=1000  # User ID for file permissions
      - PGID=1000  # Group ID for file permissions
      - TZ=America/New_York  # Consistent timezone
      - UMASK=022  # File creation mask
    labels:
      # Traefik routing
      - "traefik.enable=true"
      - "traefik.http.routers.service-name.rule=Host(`service.${DOMAIN}`)"
      - "traefik.http.routers.service-name.entrypoints=websecure"
      - "traefik.http.routers.service-name.tls.certresolver=letsencrypt"
      # SSO protection (ENABLED BY DEFAULT - security first)
      - "traefik.http.routers.service-name.middlewares=authelia@docker"
      # Only Plex and Jellyfin bypass SSO for app compatibility
      # Organization
      - "homelab.category=category-name"
      - "homelab.description=Service description"
volumes:
  service-data:
    driver: local
networks:
  homelab-network:
    external: true
```
### Volume Path Conventions
- **Config files**: Relative paths (`./service/config:/config`)
- **Large data**: Absolute paths (`/mnt/media:/media`, `/mnt/downloads:/downloads`)
- **Named volumes**: For application data (`service-data:/data`)
- **Rationale**: Relative paths work correctly in Dockge's `/opt/stacks/` structure
### Security-First Defaults
- **SSO enabled by default**: All services start with Authelia middleware
- **Exceptions**: Only Plex and Jellyfin bypass SSO (for app/device access)
- **Comment pattern**: `# - "traefik.http.routers.service.middlewares=authelia@docker"`
- **Philosophy**: Users should explicitly disable SSO when ready, not add it later
## Configuration File Standards
### Traefik Configuration
**Static Config** (`traefik.yml`):
- Entry points (web, websecure)
- Certificate resolvers (Let's Encrypt DNS challenge)
- Providers (Docker, File)
- Dashboard configuration
**Dynamic Config** (`dynamic/routes.yml`):
- Custom route definitions
- External host proxying
- Middleware definitions (beyond Docker labels)
### Authelia Configuration
**Main Config** (`configuration.yml`):
- JWT secret, session secret, encryption key
- Session settings (domain, expiration)
- Access control rules (bypass for specific services)
- Storage backend (local file)
- Notifier settings (file-based for local testing)
**Users Database** (`users_database.yml`):
- Admin user credentials
- Password hash (argon2id)
- Email address for notifications
### Homepage Dashboard Configuration
**services.yaml**:
- Service listings organized by category
- Use `${DOMAIN}` variable for domain replacement
- Icons and descriptions for each service
- Links to service web UIs
**Template Pattern**:
```yaml
- Infrastructure:
    - Dockge:
        icon: docker.svg
        href: https://dockge.${DOMAIN}
        description: Docker Compose stack manager
```
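How the `${DOMAIN}` placeholder gets replaced is not spelled out here. One plausible approach, assuming a plain text substitution at deploy time, is sketched below; `render_domain` is a hypothetical helper and may differ from what the deploy script actually does.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): replace the literal
# ${DOMAIN} placeholder in a template read from stdin.
render_domain() {
    local domain="$1"
    local escaped
    # Escape characters that are special in a sed replacement: \, & and
    # the | delimiter used below.
    escaped=$(printf '%s' "$domain" | sed -e 's/[\\&|]/\\&/g')
    # [$] matches a literal dollar sign; the braces are literal in basic regex.
    sed -e 's|[$]{DOMAIN}|'"${escaped}"'|g'
}
```

Usage would look like `render_domain "yourdomain.duckdns.org" < services.yaml > services.rendered.yaml`, where the output file name is only an example.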
## Documentation Standards
### Getting Started Guide
**Target Audience**: Complete beginners to Docker and homelabs
**Structure**:
1. Prerequisites (system requirements, accounts needed)
2. Quick setup (simple step-by-step)
3. Detailed explanation (what each step does)
4. Troubleshooting (common issues and solutions)
5. Next steps (using the homelab)
**Writing Style**:
- Clear, simple language
- Numbered steps
- Code blocks with syntax highlighting
- Expected output examples
- Warning/info callouts for important notes
### Service Documentation
**Per-Service Pattern**:
1. **Overview**: What the service does
2. **Access**: URL pattern (`https://service.${DOMAIN}`)
3. **Default Credentials**: Username/password if applicable
4. **Configuration**: Key settings to configure
5. **Integration**: How it connects with other services
6. **Troubleshooting**: Common issues
### Quick Reference
**Content**:
- Common commands (Docker, docker-compose)
- File locations (configs, logs, data)
- Port mappings (service to host)
- Network architecture diagram
- Troubleshooting quick checks
## Testing Methodology
### Test Rounds
Follow the structured testing approach documented in `ROUND_*_PREP.md` files:
1. **Fresh Installation**: Clean Debian 12 system
2. **Re-run Detection**: Idempotency validation
3. **Edge Cases**: Invalid inputs, network failures, resource constraints
4. **Service Validation**: All services accessible and functional
5. **SSL Validation**: Certificate generation and renewal
6. **SSO Validation**: Authentication working correctly
7. **Documentation Validation**: Instructions match reality
### Test Environment Management
```bash
# Reset to clean slate
sudo ./scripts/reset-test-environment.sh
# Fresh deployment
sudo ./scripts/setup-homelab.sh
sudo ./scripts/deploy-homelab.sh
# Validate deployment
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
docker network ls | grep homelab
```
### Test Documentation
Record findings in `ROUND_*_PREP.md` files:
- **Objectives**: What you're testing
- **Procedure**: Exact commands and steps
- **Results**: Success/failure, unexpected behavior
- **Fixes**: Changes made to resolve issues
- **Validation**: How you confirmed the fix
## Common Development Tasks
### Adding a New Service Stack
1. **Create compose file**: `docker-compose/service-name.yml`
2. **Define service**: Follow template standards
3. **Add configuration**: `config-templates/service-name/`
4. **Document service**: `docs/service-docs/service-name.md`
5. **Update overview**: Add to `docs/services-overview.md`
6. **Test deployment**: Validate on test system
7. **Update README**: If adding major category
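The file-creation steps above can be scaffolded in one go. The sketch below assumes the paths named in steps 1, 3, and 4 and a simple naming rule (lowercase letters, digits, hyphens); `scaffold_service` is a hypothetical helper, not an existing repository script.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): create the empty files
# and directories a new service stack needs, without overwriting anything.
scaffold_service() {
    local name="$1"
    local root="${2:-.}"
    # Naming rule assumed for this sketch: lowercase letters, digits, hyphens.
    if [[ ! "$name" =~ ^[a-z0-9][a-z0-9-]*$ ]]; then
        echo "Invalid service name: $name" >&2
        return 1
    fi
    mkdir -p "$root/docker-compose" "$root/config-templates/$name" "$root/docs/service-docs"
    # Create placeholders only if they are missing, so re-runs keep existing work.
    [ -e "$root/docker-compose/$name.yml" ] || : > "$root/docker-compose/$name.yml"
    [ -e "$root/docs/service-docs/$name.md" ] || printf '# %s\n' "$name" > "$root/docs/service-docs/$name.md"
}
```

Running `scaffold_service my-service` from the repository root as the regular user (no sudo) keeps the new files owned by that user.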
### Improving Script Reliability
1. **Identify issue**: Document current failure mode
2. **Add validation**: Pre-flight checks for prerequisites
3. **Improve errors**: Clear messages with actionable guidance
4. **Add recovery**: Handle partial failures gracefully
5. **Test edge cases**: Invalid inputs, network issues, conflicts
6. **Document behavior**: Update comments and docs
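Step 2 (pre-flight checks) often starts with confirming that required commands exist before anything else runs. A minimal sketch follows; `require_commands` is a hypothetical name, and the command list passed to it is up to the calling script.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): verify that every
# listed command is available, reporting all missing ones at once.
require_commands() {
    local missing=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" > /dev/null 2>&1; then
            echo "Missing required command: $cmd" >&2
            missing=1
        fi
    done
    # Non-zero if at least one command was not found.
    return "$missing"
}
```

A script could call `require_commands curl openssl docker || exit 1` near the top, so the user sees every missing dependency in a single run rather than one per attempt.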
### Updating Documentation
1. **Identify drift**: Find docs that don't match reality
2. **Test procedure**: Follow docs exactly, note discrepancies
3. **Update content**: Fix inaccuracies, add missing steps
4. **Validate changes**: Have someone else follow new docs
5. **Cross-reference**: Update related docs for consistency
### Refactoring Code
1. **Identify smell**: Duplicated code, complex functions, unclear logic
2. **Plan refactor**: Design cleaner structure
3. **Extract functions**: Create small, focused functions
4. **Improve names**: Use descriptive variable/function names
5. **Add comments**: Document design decisions
6. **Test thoroughly**: Ensure behavior unchanged
7. **Update docs**: Reflect any user-facing changes
## File Permission Safety (CRITICAL)
### The Permission Problem
Round 4 testing revealed that careless sudo usage causes permission issues:
- Scripts create files as root
- User can't edit files in their own home directory
- Requires manual chown to fix
### Safe Practices
**DO:**
- Check ownership before editing: `ls -la /home/kelin/AI-Homelab/`
- Keep files owned by `kelin:kelin` in user directories
- Use sudo only for Docker operations and system directories (/opt/)
- Let scripts handle file creation without sudo when possible
**DON'T:**
- Use sudo for file operations in `/home/kelin/`
- Blindly escalate privileges on "permission denied"
- Assume root ownership is needed
- Ignore ownership in `ls -la` output
### Diagnosis Before Escalation
```bash
# Check file ownership
ls -la /home/kelin/AI-Homelab/
# Expected: kelin:kelin ownership
# If root:root, something went wrong
# Fix if needed (user runs this, not scripts)
sudo chown -R kelin:kelin /home/kelin/AI-Homelab/
```
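On a larger tree, scanning `ls -la` output by eye gets tedious. The sketch below lists only the entries with unexpected ownership; `list_foreign_owned` is a hypothetical helper, and it defaults to the current user's numeric ID when no owner is given.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): list files and
# directories under a path that are NOT owned by the expected user.
list_foreign_owned() {
    local dir="$1"
    # Default to the invoking user's numeric ID; find accepts names or IDs.
    local owner="${2:-$(id -u)}"
    find "$dir" ! -user "$owner" -print
}
```

For example, `list_foreign_owned /home/kelin/AI-Homelab kelin` prints nothing when ownership is as expected, and prints the offending paths otherwise.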
## AI Agent Workflow
### When Asked to Add a Service
1. **Research service**: Purpose, requirements, dependencies
2. **Check existing patterns**: Review similar services in repo
3. **Create compose file**: Follow template standards
4. **Add configuration**: Create config templates if needed
5. **Write documentation**: Service-specific guide
6. **Update references**: Add to services overview
7. **Test deployment**: Validate on test system
### When Asked to Improve Scripts
1. **Understand current behavior**: Read script, test execution
2. **Identify issues**: Document problems and edge cases
3. **Design solution**: Plan improvements
4. **Implement changes**: Follow bash best practices
5. **Add error handling**: Validate inputs, check prerequisites
6. **Improve messages**: Clear, actionable feedback
7. **Test thoroughly**: Fresh install, re-run, edge cases
8. **Document changes**: Update comments and docs
### When Asked to Update Documentation
1. **Locate affected docs**: Find all related files
2. **Test current instructions**: Follow docs exactly
3. **Note discrepancies**: Where docs don't match reality
4. **Update content**: Fix errors, add missing info
5. **Validate changes**: Test updated instructions
6. **Check cross-references**: Update related docs
7. **Review consistency**: Ensure uniform terminology
### When Asked to Debug an Issue
1. **Reproduce problem**: Follow exact steps to trigger issue
2. **Gather context**: Logs, file contents, system state
3. **Identify root cause**: Trace back to source of failure
4. **Design fix**: Consider edge cases and side effects
5. **Implement solution**: Make minimal, targeted changes
6. **Test fix**: Validate issue is resolved
7. **Prevent recurrence**: Add checks or documentation
8. **Document finding**: Update troubleshooting docs
## Quality Checklist
### Before Committing Changes
- [ ] Code follows repository conventions
- [ ] Scripts have error handling and validation
- [ ] New files have appropriate permissions
- [ ] Documentation is updated
- [ ] Changes are tested on clean system
- [ ] Comments explain non-obvious decisions
- [ ] Commit message describes why, not just what
### Before Marking Task Complete
- [ ] Primary objective achieved
- [ ] Edge cases handled
- [ ] Documentation updated
- [ ] Tests pass on fresh system
- [ ] No regressions in existing functionality
- [ ] Code reviewed for quality
- [ ] User experience improved
## Key Repository Files
### .env.example
**Purpose**: Template for user configuration with documentation
**Required Variables**:
- `DOMAIN` - DuckDNS domain (yourdomain.duckdns.org)
- `DUCKDNS_TOKEN` - Token from duckdns.org
- `ACME_EMAIL` - Email for Let's Encrypt
- `PUID=1000` - User ID for file permissions
- `PGID=1000` - Group ID for file permissions
- `TZ=America/New_York` - Timezone
**Auto-Generated** (by setup script):
- `AUTHELIA_JWT_SECRET`
- `AUTHELIA_SESSION_SECRET`
- `AUTHELIA_STORAGE_ENCRYPTION_KEY`
**Optional** (for VPN features):
- `SURFSHARK_USERNAME`
- `SURFSHARK_PASSWORD`
- `WIREGUARD_PRIVATE_KEY`
- `WIREGUARD_ADDRESSES`
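Before deployment it can help to confirm that the required variables actually have values. This is a small sketch, assuming simple `KEY=value` lines without quoting or `export` prefixes; `missing_env_vars` is a hypothetical helper, not part of the existing scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): print the names of
# variables that are absent or empty in an env file.
missing_env_vars() {
    local env_file="$1"
    shift
    local var
    for var in "$@"; do
        # A variable counts as set only if something follows the "=".
        if ! grep -Eq "^${var}=.+" "$env_file"; then
            echo "$var"
        fi
    done
}
```

Calling `missing_env_vars .env DOMAIN DUCKDNS_TOKEN ACME_EMAIL PUID PGID TZ` prints one name per line for each variable that still needs a value, and nothing when all are set.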
### docker-compose/core/docker-compose.yml
**Purpose**: Core infrastructure that must deploy first
**Services**:
1. **DuckDNS**: Dynamic DNS updater for Let's Encrypt
2. **Traefik**: Reverse proxy with automatic SSL
3. **Authelia**: SSO authentication for all services
4. **Gluetun**: VPN client (Surfshark WireGuard)
**Why Combined**:
- These services depend on each other
- Simplifies initial deployment (one command)
- Easier to manage core infrastructure together
- All core services in `/opt/stacks/core/` directory
### config-templates/traefik/traefik.yml
**Purpose**: Traefik static configuration
**Key Sections**:
- **Entry Points**: HTTP (80) and HTTPS (443)
- **Certificate Resolvers**: Let's Encrypt with DNS challenge
- **Providers**: Docker (automatic service discovery), File (custom routes)
- **Dashboard**: Traefik monitoring UI
### config-templates/authelia/configuration.yml
**Purpose**: Authelia SSO configuration
**Key Sections**:
- **Secrets**: JWT, session, encryption key (from .env)
- **Session**: Domain, expiration, inactivity timeout
- **Access Control**: Rules for bypass (Plex, Jellyfin) vs protected services
- **Storage**: Local file backend
- **Notifier**: File-based for local testing
## Remember: Development Focus
You are **building the repository**, not managing a production homelab:
1. **Test Thoroughly**: Fresh installs, re-runs, edge cases
2. **Document Everything**: Assume user is a beginner
3. **Handle Errors Gracefully**: Clear messages, actionable guidance
4. **Follow Conventions**: Maintain consistency across all files
5. **Validate Changes**: Test on clean system before committing
6. **Think About Users**: Make their experience smooth and simple
7. **Preserve Context**: Comment WHY, not just WHAT
8. **Stay Focused**: You're improving the tool, not using it
## Quick Reference Commands
### Testing Workflow
```bash
# Reset test environment
sudo ./scripts/reset-test-environment.sh
# Fresh setup
sudo ./scripts/setup-homelab.sh
# Deploy infrastructure
sudo ./scripts/deploy-homelab.sh
# Check deployment
docker ps --format "table {{.Names}}\t{{.Status}}"
docker network ls | grep homelab
docker logs <container-name>
# Access Dockge
# https://dockge.${DOMAIN}
```
### Repository Management
```bash
# Check file ownership
ls -la ~/AI-Homelab/
# Fix permissions if needed
sudo chown -R kelin:kelin ~/AI-Homelab/
# Validate YAML syntax
docker-compose -f docker-compose/core/docker-compose.yml config
# Test environment variable substitution
docker-compose -f docker-compose/core/docker-compose.yml config | grep DOMAIN
```
### Docker Operations
```bash
# View all containers
docker ps -a
# View logs
docker logs <container> --tail 50 -f
# Restart service
docker restart <container>
# Remove container
docker rm -f <container>
# View networks
docker network ls
# Inspect network
docker network inspect <network>
```
## Success Criteria
A successful repository provides:
1. **Reliable Scripts**: Work on fresh systems, handle edge cases
2. **Clear Documentation**: Beginners can follow successfully
3. **Production-Ready Templates**: Services work out of the box
4. **Excellent UX**: Clear messages, helpful errors, smooth flow
5. **Maintainability**: Code is clean, commented, consistent
6. **Testability**: Easy to validate changes on test system
7. **Completeness**: All necessary services and configs included
Your mission: Make AI-Homelab the best automated homelab deployment tool possible.

README.md

@@ -1,151 +1,127 @@
# AI-Homelab
# AI Homelab
[![Docker](https://img.shields.io/badge/docker-%230db7ed.svg?style=flat&logo=docker&logoColor=white)](https://docker.com)
[![Traefik](https://img.shields.io/badge/Traefik-24.0.0-24A1C6)](https://traefik.io)
[![Authelia](https://img.shields.io/badge/Authelia-4.38.0-113155)](https://www.authelia.com)
# AI-Homelab: Modern, AI-Managed Homelab Infrastructure
> **Production-ready homelab infrastructure** with automated SSL, SSO authentication, and VPN routing. Deploy 70+ services through a file-based, AI-manageable architecture using Dockge for visual management.
AI-Homelab is a production-ready, Docker Compose-based homelab stack managed with Dockge and supercharged by GitHub Copilot in VS Code. It features 60+ pre-configured services, automated SSL, SSO, VPN routing, and a file-based, AI-friendly architecture.
## Key Features
- **AI-Driven Management**: Use GitHub Copilot to create, modify, and manage all services and infrastructure. Just describe what you want—the AI handles the rest.
- **Dockge Structure**: All stacks live in `/opt/stacks/` for easy, visual management via Dockge web UI.
- **Automated Setup**: Two-script install (setup + deploy) with robust error handling and auto-generated secrets.
- **Comprehensive Service Catalog**: 60+ production-ready services across infrastructure, media, automation, productivity, and monitoring.
- **Traefik Reverse Proxy**: Automatic HTTPS with Let's Encrypt wildcard certs (DNS challenge via DuckDNS).
- **Authelia SSO**: Single Sign-On for all admin interfaces, with secure defaults and 2FA support.
- **Gluetun VPN**: Secure download stack with Surfshark WireGuard/OpenVPN integration.
- **Homepage Dashboard**: AI-configurable dashboard with dynamic service discovery and widgets.
- **Proxy External Hosts**: Easily route non-Docker services (e.g., Raspberry Pi, NAS) through Traefik.
- **Stack-Aware Changes**: AI considers the entire infrastructure for every change.
- **File-Based Configuration**: Everything managed via YAML—no web UI lock-in.
## Documentation
- [Getting Started](docs/getting-started.md): Step-by-step setup
- [Services Overview](docs/services-overview.md): All included services
- [Docker Guidelines](docs/docker-guidelines.md): Compose, network, and security patterns
- [Quick Reference](docs/quick-reference.md): Commands and troubleshooting
- [Proxying External Hosts](docs/proxying-external-hosts.md): Traefik file provider and SSO
- [Copilot Instructions](.github/copilot-instructions.md): AI agent guidelines
---
## Quick Start
## 🚀 Quick Start
### Prerequisites
- **Fresh Debian/Ubuntu server** (or existing system)
- **Root/sudo access**
- **Internet connection**
- **VS Code with GitHub Copilot** (recommended for AI assistance)
- Docker Engine 24.0+ installed
- Docker Compose V2
- Git
- VS Code with GitHub Copilot extension (for AI assistance)
- A domain from DuckDNS (free)
- Surfshark VPN account (optional, for VPN features)
- Sufficient disk space: 120GB+ system drive (NVMe or SSD highly recommended), 2TB+ for media & additional disks for services like Nextcloud that require lots of space
## Quick Setup (Dockge Structure)
1. **Clone the repository into your home folder:**
### Automated Setup
```bash
cd ~
# Clone repository
git clone https://github.com/kelinfoxy/AI-Homelab.git
cd AI-Homelab
```
2. **Create and configure environment file:**
```bash
# Configure environment
cp .env.example .env
nano .env # Edit with your domain, API keys, and passwords
```
> Alternatively you can ssh in from VS Code using the Remote-SSH plugin and edit in a nice editor
nano .env # Add your domain and tokens
**IMPORTANT:** Keep your `.env` file in the repository folder (`~/AI-Homelab/.env`). The deploy script will automatically copy it where needed.
**Required variables:**
- `DOMAIN` - Your DuckDNS domain (e.g., yourdomain.duckdns.org)
- `DUCKDNS_TOKEN` - Your DuckDNS token from [duckdns.org](https://www.duckdns.org/)
- `ACME_EMAIL` - Your email for Let's Encrypt certificates
- `SURFSHARK_USERNAME` and `SURFSHARK_PASSWORD` - If using VPN features
**Note:** Authelia secrets (`AUTHELIA_JWT_SECRET`, `AUTHELIA_SESSION_SECRET`, `AUTHELIA_STORAGE_ENCRYPTION_KEY`) are automatically generated by the setup script, leave default values for now.
> See [Getting Started](docs/getting-started.md) for detailed instructions
3. **Run first-run setup script:**
This automated script handles system preparation and Authelia configuration. Safe to run on partially configured systems - it skips completed steps.
### Prerequisites
- Docker Engine 24.0+
- Docker Compose V2
- Git
- VS Code with GitHub Copilot extension
- DuckDNS domain (free)
- Surfshark VPN (optional, for VPN features)
- 120GB+ system drive (SSD/NVMe recommended), 2TB+ for media
### 1. Clone the repository
```bash
cd ~
git clone https://github.com/kelinfoxy/AI-Homelab.git
cd AI-Homelab
```
### 2. Configure environment
```bash
cp .env.example .env
nano .env # Set your domain, tokens, and passwords
```
> Keep `.env` in `~/AI-Homelab/`. The deploy script copies it as needed.
### 3. Run setup script
```bash
# Run setup script (installs Docker, generates secrets)
sudo ./scripts/setup-homelab.sh
```
> Handles system prep, Docker install, secrets, and directory structure.
### 4. Run deployment script
```bash
# Deploy all services
sudo ./scripts/deploy-homelab.sh
```
> Deploys core, infrastructure, and dashboard stacks. Prepares all others for Dockge.
### 5. Deploy more stacks via Dockge
Visit `https://dockge.yourdomain.duckdns.org` and start any stack you want.
**Access your homelab:**
- **Dockge**: `https://dockge.yourdomain.duckdns.org` (primary management interface)
- **Homepage**: `https://home.yourdomain.duckdns.org` (service dashboard)
- **Authelia**: `https://auth.yourdomain.duckdns.org` (SSO login)
### 6. Use VS Code + Copilot for management
Install GitHub Copilot in VS Code. Use chat to manage, modify, and extend your homelab—just describe what you want!
## 📚 Documentation
For comprehensive documentation, see:
- **[Getting Started Guide](docs/getting-started.md)** - Step-by-step deployment and configuration
- **[Docker Guidelines](docs/docker-guidelines.md)** - Service management patterns and best practices
- **[Quick Reference](docs/quick-reference.md)** - Command cheat sheet and troubleshooting
- **[Services Reference](docs/services-overview.md)** - All 70+ available services
- **[Proxying External Hosts](docs/proxying-external-hosts.md)** - Connect non-Docker services (Raspberry Pi, NAS, etc.)
## 🚀 Quick Navigation
**New to AI-Homelab?** → [Getting Started Guide](docs/getting-started.md)
**Need Help Deploying?** → [Automated Setup](docs/getting-started.md#simple-setup)
**Want to Add Services?** → [Service Creation Guide](docs/docker-guidelines.md#service-creation-guidelines)
**Having Issues?** → [Troubleshooting](docs/quick-reference.md#troubleshooting)
**Managing Services?** → [Dockge Dashboard](https://dockge.yourdomain.duckdns.org)
### Service Documentation
Individual service documentation is available in [`docs/service-docs/`](docs/service-docs/):
- [Authelia](docs/service-docs/authelia.md) - SSO authentication
- [Traefik](docs/service-docs/traefik.md) - Reverse proxy and SSL
- [Dockge](docs/service-docs/dockge.md) - Stack management
- [Homepage](docs/service-docs/homepage.md) - Service dashboard
- And 50+ more services...
## 🏗️ Architecture
### Core Infrastructure
- **Traefik** - Reverse proxy with automatic HTTPS termination
- **Authelia** - Single sign-on (SSO) authentication
- **DuckDNS** - Dynamic DNS with wildcard SSL certificates
- **Gluetun** - VPN client for secure downloads
- **Sablier** - Lazy loading service for on-demand containers
### Service Categories
- **Media** - Plex, Jellyfin, Sonarr, Radarr, qBittorrent
- **Productivity** - Nextcloud, Gitea, BookStack, OnlyOffice
- **Monitoring** - Grafana, Prometheus, Uptime Kuma
- **Home Automation** - Home Assistant, Node-RED, Zigbee2MQTT
- **Utilities** - Backrest (backups), FreshRSS, Code Server
### Key Features
- **File-based configuration** - AI-manageable YAML files
- **Automated SSL** - Wildcard certificates via Let's Encrypt
- **VPN routing** - Secure download clients through Gluetun
- **Resource limits** - Prevent resource exhaustion
- **SSO protection** - Authelia integration with bypass options
- **Lazy loading** - Sablier enables on-demand container startup
- **Automated backups** - Restic + Backrest for comprehensive data protection
## 🤖 AI Management
This homelab is designed to be managed by AI agents through VS Code with GitHub Copilot. The system uses:
- **Declarative configuration** - Services defined in Docker Compose files
- **Label-based routing** - Traefik discovers services automatically
- **Standardized patterns** - Consistent environment variables and volumes
- **Comprehensive documentation** - AI instructions in `.github/copilot-instructions.md`
## 📋 Requirements
- **OS**: Debian 11+, Ubuntu 20.04+
- **RAM**: 4GB minimum, 8GB+ recommended
- **Storage**: 50GB+ available space
- **Network**: Stable internet connection
- **Hardware**: x86_64 architecture (ARM support limited)
## 🔧 Manual Setup
If automated scripts fail, see:
- **[Manual Setup Guide](docs/manual-setup.md)** - Step-by-step manual installation
- **[Troubleshooting](docs/troubleshooting/)** - Common issues and solutions
## 🤝 Contributing
This project welcomes contributions! See individual service docs for configuration examples and deployment patterns.
## 📄 License
This project is open source. See individual service licenses for details.
---
## AI Capabilities & Examples
You can ask the AI to:
- Add, remove, or modify any service or stack
- Refactor the stack structure (e.g., switch to Portainer)
- Configure Traefik routing and Authelia SSO
- Proxy external hosts (e.g., Raspberry Pi)
- Create Homepage widgets and dashboards
- Route services through VPN
- Reorganize or customize anything
> Just describe your goal. The AI will handle the details, following `.github/copilot-instructions.md` for best practices.
---
## License & Acknowledgments
This project is provided as-is for personal homelab use.
**Thanks to:**
- Docker & Compose communities
- LinuxServer.io for container images
- GitHub Copilot for AI capabilities
- All open-source projects referenced
---
## Support
- Use GitHub Copilot in VS Code for real-time help
- Consult the comprehensive documentation in `/docs`
**Built with ❤️ for the homelab community**


@@ -1,367 +0,0 @@
# Round 6 Testing - Agent Instructions for Script Optimization
## Mission Context
Test and optimize the AI-Homelab deployment scripts ([setup-homelab.sh](scripts/setup-homelab.sh) and [deploy-homelab.sh](scripts/deploy-homelab.sh)) on a **test server** environment. Focus on improving repository code quality and reliability, not test server infrastructure.
## Current Status - Round 6
- **Test Environment**: Fresh Debian installation
- **Testing Complete**: ✅ Both scripts now working successfully
- **Issues Fixed**:
- Password hash extraction from Authelia output (added proper parsing)
- Credential flow between setup and deploy scripts (saved plain password)
- Directory vs file conflicts on re-runs (added cleanup logic)
- Authelia database encryption key mismatches (added auto-cleanup)
- Password display for user reference (now shows and saves password)
- **Previous Rounds**: 1-5 focused on permission handling, deployment order, and stability
## Primary Objectives
### 1. Script Reliability Analysis
- [ ] Identify why setup-homelab.sh failed during Authelia password hash generation
- [ ] Review error handling and fallback mechanisms
- [ ] Check Docker availability before container-based operations
- [ ] Validate all external dependencies (curl, openssl, docker)
### 2. Repository Improvements (Priority Focus)
- [ ] Enhance error messages with actionable debugging steps
- [ ] Add pre-flight validation checks before critical operations
- [ ] Improve script idempotency (safe to run multiple times)
- [ ] Add progress indicators for long-running operations
- [ ] Document assumptions and prerequisites more clearly
### 3. Testing Methodology
- [ ] Create test checklist for each script function
- [ ] Document expected vs. actual behavior at failure point
- [ ] Test edge cases (partial runs, re-runs, missing dependencies)
- [ ] Verify scripts work on truly fresh Debian installation
## Critical Operating Principles
### DO Focus On (Repository Optimization)
- **Script robustness**: Better error handling, validation, recovery
- **User experience**: Clear messages, progress indicators, helpful errors
- **Documentation**: Update docs to match actual script behavior
- **Idempotency**: Scripts can be safely re-run without breaking state
- **Dependency checks**: Validate requirements before attempting operations
- **Rollback capability**: Provide recovery steps when operations fail
### DON'T Focus On (Test Server Infrastructure)
- **Test server setup**: Don't modify the test server's base configuration
- **System packages**: Test scripts should work with standard Debian
- **Hardware setup**: Focus on software reliability, not hardware optimization
- **Performance tuning**: Prioritize reliability over speed
## Specific Issues to Address (Round 6)
### Issue 1: Authelia Password Hash Generation Failure
**Problem**: Script failed while generating password hash using Docker container
**Investigation Steps**:
1. Check if Docker daemon is running before attempting container operations
2. Verify Docker image pull capability (network connectivity)
3. Add timeout handling for Docker run operations
4. Provide fallback mechanism if Docker-based hash generation fails
**Proposed Solutions**:
```bash
# Option 1: Pre-validate Docker availability
if ! docker info &> /dev/null; then
    log_error "Docker is not available for password hash generation"
    log_info "This requires Docker to be running. Please ensure:"
    log_info " 1. Docker daemon is started: sudo systemctl start docker"
    log_info " 2. User can access Docker: docker ps"
    exit 1
fi
# Option 2: Add timeout and error handling
log_info "Generating password hash (this may take a moment)..."
# Capture the raw output first so it can still be shown if parsing fails
HASH_OUTPUT=$(timeout 30 docker run --rm authelia/authelia:4.37 authelia crypto hash generate argon2 --password "$ADMIN_PASSWORD" 2>&1) || true
# Keep only the hash itself, whether or not the CLI prints a label before it
PASSWORD_HASH=$(printf '%s\n' "$HASH_OUTPUT" | grep -o '\$argon2[^[:space:]]*' | head -n 1) || true
if [ -z "$PASSWORD_HASH" ]; then
    log_error "Failed to generate password hash"
    log_info "Error output:"
    echo "$HASH_OUTPUT"
    log_info ""
    log_info "Troubleshooting steps:"
    log_info " 1. Check Docker is running: docker ps"
    log_info " 2. Test image pull: docker pull authelia/authelia:4.37"
    log_info " 3. Check network connectivity"
    exit 1
fi
# Option 3: Provide manual hash generation instructions
if [ -z "$PASSWORD_HASH" ]; then
    log_error "Automated hash generation failed"
    log_info ""
    log_info "You can generate the hash manually after setup:"
    log_info " docker run --rm authelia/authelia:4.37 authelia crypto hash generate argon2"
    log_info " Then edit: /opt/stacks/core/authelia/users_database.yml"
    log_info ""
    read -p "Continue without setting admin password now? (y/N): " -n 1 -r
    echo ""
    if [[ ! $REPLY =~ ^[Yy]$ ]]; then
        exit 1
    fi
    # Set placeholder that requires manual configuration
    PASSWORD_HASH="\$argon2id\$v=19\$m=65536,t=3,p=4\$CHANGEME"
fi
```
### Issue 2: Script Exit Codes and Error Recovery
**Problem**: Script exits on first error without cleanup or recovery guidance
**Improvements**:
```bash
# Add at script beginning
set -e          # Exit on error
set -E          # Let functions and subshells inherit the ERR trap
set -u          # Exit on undefined variable
set -o pipefail # Exit on pipe failures
# Add trap for cleanup on error
cleanup_on_error() {
    local exit_code=$?
    log_error "Script failed with exit code: $exit_code"
    log_info ""
    log_info "Partial setup may have occurred. To resume:"
    log_info " 1. Review error messages above"
    log_info " 2. Fix the issue if possible"
    log_info " 3. Re-run: sudo ./setup-homelab.sh"
    log_info ""
    log_info "The script is designed to be idempotent (safe to re-run)"
}
trap cleanup_on_error ERR
```
### Issue 3: Pre-flight Validation
**Add comprehensive checks before starting**:
```bash
# New Step 0: Pre-flight validation
log_info "Step 0/9: Running pre-flight checks..."
# Check internet connectivity
if ! ping -c 1 8.8.8.8 &> /dev/null && ! ping -c 1 1.1.1.1 &> /dev/null; then
    log_error "No internet connectivity detected"
    log_info "Internet access is required for:"
    log_info " - Installing packages"
    log_info " - Downloading Docker images"
    log_info " - Accessing Docker Hub"
    exit 1
fi
# Check DNS resolution
if ! nslookup google.com &> /dev/null; then
    log_warning "DNS resolution may be impaired"
fi
# Check disk space
AVAILABLE_SPACE=$(df / | tail -1 | awk '{print $4}')
REQUIRED_SPACE=5000000 # 5GB in KB
if [ "$AVAILABLE_SPACE" -lt "$REQUIRED_SPACE" ]; then
    log_error "Insufficient disk space"
    log_info "Available: $(($AVAILABLE_SPACE / 1024))MB"
    log_info "Required: $(($REQUIRED_SPACE / 1024))MB"
    exit 1
fi
log_success "Pre-flight checks passed"
echo ""
```
## Testing Checklist for Round 6
### setup-homelab.sh Testing
- [ ] **Clean run**: Fresh Debian 12 system with no prior setup
- [ ] **Re-run safety**: Run script twice, verify idempotency
- [ ] **Interrupted run**: Stop script mid-execution, verify recovery
- [ ] **Network failure**: Simulate network issues during package installation
- [ ] **Docker unavailable**: Test behavior when Docker daemon isn't running
- [ ] **User input validation**: Test invalid inputs (weak passwords, invalid emails)
- [ ] **NVIDIA skip**: Test skipping GPU driver installation
- [ ] **NVIDIA install**: Test GPU driver installation path (if hardware available)
### deploy-homelab.sh Testing
- [ ] **Missing .env**: Test behavior when .env file doesn't exist
- [ ] **Invalid .env**: Test with incomplete or malformed .env values
- [ ] **Port conflicts**: Test when ports 80/443 are already in use
- [ ] **Network issues**: Simulate Docker network creation failures
- [ ] **Core stack failure**: Test recovery if core services fail to start
- [ ] **Certificate generation**: Monitor Let's Encrypt certificate acquisition
- [ ] **Re-deployment**: Test redeploying over existing infrastructure
- [ ] **Partial deployment**: Stop mid-deployment, verify recovery
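For the port-conflict case in the checklist above, one way to detect an occupied port before deploying is to scan the listening sockets reported by `ss -ltn`. The sketch below reads that listing from stdin so it can be exercised with canned input; `port_in_listing` is a hypothetical helper, and it assumes the usual column layout where the fourth field is the local address and port.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): succeed if the given
# TCP port appears as a local listening port in `ss -ltn` output on stdin.
port_in_listing() {
    local port="$1"
    # Skip the header row; field 4 is the local address, for example
    # 0.0.0.0:443 or [::]:443, so the port is the part after the last colon.
    awk -v port="$port" '
        NR > 1 { n = split($4, parts, ":"); if (parts[n] == port) found = 1 }
        END    { exit found ? 0 : 1 }
    '
}
```

A deploy script could then run `ss -ltn | port_in_listing 443` and print a targeted message about the conflicting service instead of letting Traefik fail to bind.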
## Documentation Updates Required
### 1. Update [docs/getting-started.md](docs/getting-started.md)
- Document actual script behavior observed in testing
- Add troubleshooting section for common failure scenarios
- Include recovery procedures for partial deployments
- Update Round 4 → Round 6 references
### 2. Update [AGENT_INSTRUCTIONS.md](AGENT_INSTRUCTIONS.md)
- Change "Round 4" references to current round
- Add learnings from Round 5 and 6 testing
- Document new error handling patterns
- Update success criteria
### 3. Create Testing Guide
- Document test environment setup
- Provide test case templates
- Include expected vs actual result tracking
- Create test automation scripts if beneficial
## Agent Action Workflow
### Phase 1: Analysis (30 minutes)
1. Review last terminal output to identify exact failure point
2. Read relevant script sections around the failure
3. Check for similar issues in previous rounds (git history)
4. Document root cause and contributing factors
### Phase 2: Solution Design (20 minutes)
1. Propose specific code changes to address issues
2. Design error handling and recovery mechanisms
3. Create validation checks to prevent recurrence
4. Draft user-facing error messages
### Phase 3: Implementation (40 minutes)
1. Apply changes to scripts using multi_replace_string_in_file
2. Update related documentation
3. Add comments explaining complex error handling
4. Update .env.example if new variables needed
### Phase 4: Validation (30 minutes)
1. Review script syntax: `bash -n scripts/*.sh`
2. Check for common shell script issues (shellcheck if available)
3. Verify all error paths have recovery guidance
4. Test modified sections in isolation if possible
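The syntax review in step 1 can be wrapped so that every script is checked and all failures are reported together. This is a minimal sketch; `syntax_check_scripts` is a hypothetical helper built around `bash -n`, which parses a script without executing it.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): run `bash -n` on every
# .sh file in a directory and report which ones fail to parse.
syntax_check_scripts() {
    local dir="$1" failed=0 script
    for script in "$dir"/*.sh; do
        [ -e "$script" ] || continue   # the glob matched nothing
        if ! bash -n "$script"; then
            echo "Syntax error in: $script" >&2
            failed=1
        fi
    done
    # Non-zero if any script failed the syntax check.
    return "$failed"
}
```

Running `syntax_check_scripts scripts` from the repository root covers both deployment scripts; note that `bash -n` only catches parse errors, not logic problems.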
### Phase 5: Documentation (20 minutes)
1. Update CHANGELOG or version notes
2. Document breaking changes if any
3. Update README if testing procedure changed
4. Create/update this round's testing summary
## Success Criteria for Round 6
### Must Have
- ✅ setup-homelab.sh completes successfully on fresh Debian 12
- ✅ Script provides clear error messages when failures occur
- ✅ All error conditions have recovery/troubleshooting guidance
- ✅ Scripts can be safely re-run without breaking existing setup
- ✅ Docker availability validated before Docker operations
### Should Have
- ✅ Pre-flight validation catches issues before they cause failures
- ✅ Progress indicators show user what's happening
- ✅ Timeout handling prevents infinite hangs
- ✅ Network failures provide helpful next steps
- ✅ Documentation updated to match current behavior
### Nice to Have
- ⭐ Automated testing framework for scripts
- ⭐ Rollback capability for partial deployments
- ⭐ Log collection for debugging
- ⭐ Summary report at end of execution
- ⭐ Colored output degrades gracefully on terminals without color support
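One possible approach to the color item above is to emit ANSI codes only when stdout is a terminal and the `NO_COLOR` convention is not in effect. This is a minimal sketch; `setup_colors` and the variable names are hypothetical and not taken from the existing scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: enable colors only on an interactive, color-capable terminal.
setup_colors() {
    if [ -t 1 ] && [ -z "${NO_COLOR:-}" ] && [ "${TERM:-dumb}" != "dumb" ]; then
        RED=$'\033[0;31m'; GREEN=$'\033[0;32m'; RESET=$'\033[0m'
    else
        # Piped output, NO_COLOR set, or a dumb terminal: emit plain text.
        RED=''; GREEN=''; RESET=''
    fi
}

setup_colors
printf '%s[INFO]%s colors configured\n' "$GREEN" "$RESET"
```

Existing logging helpers could then interpolate these variables without further checks, since they expand to empty strings when color is disabled.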
## Key Files to Modify
### Primary Targets
1. [scripts/setup-homelab.sh](scripts/setup-homelab.sh) - Lines 1-491
- Focus areas: Step 7 (Authelia secrets), error handling
2. [scripts/deploy-homelab.sh](scripts/deploy-homelab.sh) - Lines 1-340
- Focus areas: Pre-flight checks, error recovery
3. [AGENT_INSTRUCTIONS.md](AGENT_INSTRUCTIONS.md) - Round references
4. [docs/getting-started.md](docs/getting-started.md) - Testing documentation
### Supporting Files
- [.env.example](.env.example) - Validate all variables documented
- [README.md](README.md) - Update if quick start changed
- [docs/quick-reference.md](docs/quick-reference.md) - Add troubleshooting entries
## Constraints and Guidelines
### Follow Repository Standards
- Maintain existing code style and conventions
- Use established logging functions (log_info, log_error, etc.)
- Keep user-facing messages clear and actionable
- Preserve idempotency where it exists
### Avoid Scope Creep
- Don't rewrite entire scripts - make targeted improvements
- Don't add features - focus on fixing reliability issues
- Don't optimize performance - focus on correctness
- Don't modify test server - fix repository code
### Safety First
- All changes must maintain backward compatibility
- Don't remove existing functionality
- Add validation, don't skip steps
- Provide rollback/recovery for all operations
## Communication with User
### Regular Updates
- Report progress through testing phases
- Highlight blockers or unexpected findings
- Ask for clarification when needed
- Summarize changes made
### Final Report Format
```markdown
# Round 6 Testing Summary
## Issues Identified
1. [Issue name] - [Brief description]
- Root cause: ...
- Impact: ...
## Changes Implemented
1. [File]: [Change description]
- Why: ...
- Testing: ...
## Testing Results
- ✅ [What works]
- ⚠️ [What needs attention]
- ❌ [What still fails]
## Next Steps for Round 7
1. [Remaining issue to address]
2. [Enhancement to consider]
```
---
## Immediate Actions for Agent
1. **Investigate current failure**:
- Review terminal output showing exit code 1
- Identify line in setup-homelab.sh where failure occurred
- Determine if Docker was available for password hash generation
2. **Propose specific fixes**:
- Add Docker availability check before Authelia password hash step
- Add timeout handling for docker run command
- Provide fallback if automated hash generation fails
3. **Implement improvements**:
- Update setup-homelab.sh with error handling
- Add pre-flight validation checks
- Enhance error messages with recovery steps
4. **Update documentation**:
- Document Round 6 findings
- Update troubleshooting guide
- Revise Round 4 references to current round
5. **Prepare for validation**:
- Create test checklist
- Document expected behavior
- Note any manual testing steps needed
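The Docker availability check and timeout handling proposed in step 2 could look roughly like the sketch below. The helper names are hypothetical and do not come from `setup-homelab.sh`; `timeout` is the coreutils command, which exits with status 124 when the time limit is hit.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; the names are assumptions for illustration.

# Succeeds only if the docker CLI exists and the daemon answers.
docker_available() {
    command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1
}

# Runs a command under coreutils `timeout` and reports why it failed.
run_with_timeout() {
    local seconds="$1"; shift
    local rc=0
    timeout "$seconds" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "ERROR: '$*' timed out after ${seconds}s" >&2
    elif [ "$rc" -ne 0 ]; then
        echo "ERROR: '$*' failed with exit code $rc" >&2
    fi
    return "$rc"
}
```

A caller might gate the password hash step with `docker_available || exit 1` and wrap the `docker run` call in `run_with_timeout 60 ...`; the 60-second limit is only an example value.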
---
*Generated for Round 6 testing of AI-Homelab deployment scripts*
*Focus: Repository optimization, not test server modification*


@@ -1,273 +0,0 @@
# Round 7 Testing - Preparation and Safety Guidelines
## Mission Context
Test AI-Homelab deployment scripts with focus on **safe cleanup and recovery** procedures.
**Round 7 Status**: ✅ COMPLETE - All objectives met, no system crashes occurred
### Issues Found and Fixed:
1. **Password hash corruption** - Heredoc variable expansion was mangling `$` characters in argon2 hashes
- **Fixed**: Use quoted heredoc ('EOF') with placeholders, then sed replacement
- **Committed**: ee8a359
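The fix described above can be illustrated with a short sketch. The hash value, placeholder name, and file layout here are made up for the example and are not copied from commit ee8a359.

```bash
#!/usr/bin/env bash
# Sample hash for illustration only; not a real credential.
PASSWORD_HASH='$argon2id$v=19$m=65536,t=3,p=4$c2FsdHNhbHQ$aGFzaGhhc2g'
out_file="$(mktemp)"

# A quoted delimiter ('EOF') disables expansion, so the body is written verbatim.
cat > "$out_file" <<'EOF'
users:
  admin:
    password: "__PASSWORD_HASH__"
EOF

# Escape characters that are special in a sed replacement (\, &, and the | delimiter).
escaped_hash="$(printf '%s' "$PASSWORD_HASH" | sed -e 's/[\\&|]/\\&/g')"
sed -i "s|__PASSWORD_HASH__|${escaped_hash}|" "$out_file"

cat "$out_file"
```

With an unquoted `EOF`, the shell would try to expand `$argon2id`, `$v`, and the other `$`-prefixed fragments inside the heredoc body, which is the corruption described in the issue.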
## Critical Safety Requirements - NEW for Round 7
### ⚠️ SYSTEM CRASH PREVENTION
**Issue from Round 6**: Aggressive cleanup operations caused system crashes requiring power cycles and BIOS recovery attempts.
**Root Causes Identified**:
1. Removing directories while Docker containers were actively using them
2. Aggressive `rm -rf` operations on Docker volumes while containers running
3. No graceful shutdown sequence before cleanup
4. Docker volume operations causing filesystem corruption
### Safe Testing Procedure
#### Before Each Test Run
1. **Always use the new reset script** instead of manual cleanup
2. **Never** run cleanup commands while containers are running
3. **Always** stop containers gracefully before removing files
4. **Monitor** system resources during operations
#### Using the Safe Reset Script
```bash
cd ~/AI-Homelab/scripts
sudo ./reset-test-environment.sh
```
This script:
- ✅ Stops all containers gracefully (proper shutdown)
- ✅ Waits for containers to fully stop
- ✅ Removes Docker volumes safely
- ✅ Cleans directories only after containers stopped
- ✅ Preserves system packages and settings
- ✅ Does NOT touch Docker installation
- ✅ Does NOT modify system files
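The contents of `reset-test-environment.sh` are not reproduced here. As a rough illustration of the "wait for containers to fully stop" step only, a polling helper might look like the following; the function name and the default timeout are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical helper: poll until `docker ps -q` lists no running containers.
# Note: an unreachable daemon also produces empty output and is treated as "stopped".
wait_for_containers_stopped() {
    local max_wait="${1:-60}" waited=0
    while [ -n "$(docker ps -q)" ]; do
        if [ "$waited" -ge "$max_wait" ]; then
            echo "ERROR: containers still running after ${max_wait}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```

Polling like this avoids relying on a fixed `sleep`, and file removal can be skipped entirely when the function returns non-zero.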
#### What NOT to Do (Dangerous Operations)
```bash
# ❌ NEVER do these while containers are running:
rm -rf /opt/stacks/core/traefik # Can corrupt active containers
rm -rf /var/lib/docker/volumes/* # Filesystem corruption risk
docker volume rm $(docker volume ls -q) # Removes volumes containers need
find /var/lib/docker -exec rm -rf {} + # EXTREMELY DANGEROUS
# ❌ NEVER force remove running containers:
docker rm -f $(docker ps -aq) # Can cause state corruption
# ❌ NEVER use pkill on Docker processes:
pkill -9 docker # Can corrupt Docker daemon state
```
#### Safe Cleanup Sequence (Manual)
If you need to clean up manually:
```bash
# 1. Stop services gracefully
cd /opt/stacks/core
docker compose down # Waits for clean shutdown
# 2. Wait for full stop
sleep 5
# 3. Then and only then remove files
rm -rf /opt/stacks/core/traefik
rm -rf /opt/stacks/core/authelia
# 4. Remove volumes after containers stopped
docker volume rm core_authelia-data
```
## Round 7 Objectives
### Primary Goals
1. ✅ Verify safe reset script works without system crashes
2. ✅ Test full deployment after reset (round-trip testing)
3. ✅ Validate no file system corruption occurs
4. ✅ Ensure containers start cleanly after reset
5. ✅ Document any remaining edge cases
### Testing Checklist
#### Pre-Testing Setup
- [ ] System is stable and responsive
- [ ] All previous containers stopped cleanly
- [ ] Disk space sufficient (5GB+ free)
- [ ] No filesystem errors: `sudo dmesg | grep -i error`
- [ ] Docker daemon healthy: `systemctl status docker`
#### Round 7 Test Sequence
1. **Clean slate** using reset script
```bash
sudo ./scripts/reset-test-environment.sh
```
- [ ] Script completes without errors
- [ ] System remains responsive
- [ ] No kernel panics or crashes
- [ ] All volumes removed cleanly
2. **Fresh deployment** with improved scripts
```bash
sudo ./scripts/setup-homelab.sh
```
- [ ] Completes successfully
- [ ] No permission errors
- [ ] Password hash generated correctly
- [ ] Credentials saved properly
3. **Deploy infrastructure**
```bash
sudo ./scripts/deploy-homelab.sh
```
- [ ] Containers start cleanly
- [ ] No file conflicts
- [ ] Authelia initializes properly
- [ ] Credentials work immediately
4. **Verify services**
- [ ] Traefik accessible and routing
- [ ] Authelia login works
- [ ] Dockge UI accessible
- [ ] SSL certificates generating
5. **Test reset again** (idempotency)
```bash
sudo ./scripts/reset-test-environment.sh
```
- [ ] Stops everything gracefully
- [ ] No orphaned containers
- [ ] No volume leaks
- [ ] System stable after reset
## Changes Made for Round 7
### New Files
- **`scripts/reset-test-environment.sh`** - Safe cleanup script with proper shutdown sequence
### Modified Files
- **`scripts/deploy-homelab.sh`**:
- Added graceful container stop before config file operations
- Removed automatic database cleanup (now use reset script instead)
- Added safety checks before rm operations
- Better warnings about existing databases
### Removed Dangerous Operations
```bash
# REMOVED from deploy-homelab.sh:
docker compose up -d authelia 2>&1 | grep -q "encryption key" && {
docker compose down authelia
sudo rm -rf /var/lib/docker/volumes/core_authelia-data/_data/*
}
# This was causing crashes - containers couldn't handle abrupt volume removal
# REMOVED blind directory removal:
rm -rf /opt/stacks/core/traefik /opt/stacks/core/authelia
# Now checks if containers are running first
```
## System Health Monitoring
### Before Each Test Run
```bash
# Check system health
free -h # Memory available
df -h # Disk space
sudo dmesg | tail -20 # Recent kernel messages
systemctl status docker # Docker daemon health
docker ps # Running containers
```
### During Testing
- Monitor system logs: `journalctl -f`
- Watch Docker logs: `docker compose logs -f`
- Check resource usage: `htop` or `top`
### After Issues
```bash
# If system becomes unresponsive:
# 1. DO NOT hard power off immediately
# 2. Try to SSH in from another machine
# 3. Gracefully stop Docker: systemctl stop docker
# 4. Wait 30 seconds for disk writes to complete
# 5. Then reboot: systemctl reboot
# Check for filesystem corruption after boot:
sudo dmesg | grep -i error
sudo journalctl -xb | grep -i error
```
## Recovery Procedures
### If Deploy Script Hangs
```bash
# In another terminal:
cd /opt/stacks/core
docker compose ps # See what's running
docker compose logs authelia # Check for errors
docker compose down # Stop gracefully
# Then re-run deploy
```
### If Authelia Won't Start (Encryption Key Error)
```bash
# Use the reset script:
sudo ./scripts/reset-test-environment.sh
# Then start fresh deployment
```
### If System Crashed During Testing
```bash
# After reboot:
# 1. Check Docker state
systemctl status docker
docker ps -a # Look for crashed containers
# 2. Clean up properly
cd /opt/stacks/core
docker compose down --remove-orphans
# 3. Remove unused volumes (on Docker 23+, add --all to include named volumes)
docker volume prune -f
# 4. Start fresh with reset script
sudo ./scripts/reset-test-environment.sh
```
## Success Criteria for Round 7
### Must Have
- ✅ Reset script completes without system crashes
- ✅ Can deploy and reset multiple times safely
- ✅ No filesystem corruption after any operation
- ✅ System remains responsive throughout testing
- ✅ All containers stop gracefully
### Should Have
- ✅ Clear warnings before destructive operations
- ✅ Confirmation prompts for cleanup
- ✅ Progress indicators during long operations
- ✅ Health checks before and after operations
### Nice to Have
- ⭐ Automatic backup before reset
- ⭐ Rollback capability
- ⭐ System health validation
- ⭐ Detailed logging of all operations
## Emergency Contacts / References
- Docker best practices: https://docs.docker.com/config/daemon/
- Linux filesystem safety: `man sync`, `man fsync`
- systemd service management: `man systemctl`
## Post-Round 7 Review
### Document These
- [ ] Any new crash scenarios discovered
- [ ] System resource usage patterns
- [ ] Time required for clean operations
- [ ] Any remaining unsafe operations
- [ ] User experience improvements needed
---
**Remember**: System stability is more important than testing speed. Always wait for clean shutdowns and never force operations.


@@ -1,360 +0,0 @@
# Round 8 Testing - Full Integration and Edge Cases
## Mission Context
Comprehensive end-to-end testing of the complete AI-Homelab deployment workflow. Round 7 validated safe cleanup procedures and fixed credential handling. Round 8 focuses on **production readiness** and **edge case handling**.
## Status
- **Round 7 Results**: ✅ All objectives met
- Safe reset script prevents system crashes
- Password hash corruption fixed
- Full deployment successful without issues
- **Round 8 Focus**: Production readiness, edge cases, user experience polish
## Round 8 Objectives
### Primary Goals
1. ✅ Test complete workflow on fresh system (clean slate validation)
2. ✅ Verify all services are accessible and functional
3. ✅ Test SSL certificate generation (Let's Encrypt)
4. ✅ Validate SSO protection on all services
5. ✅ Test edge cases and error conditions
6. ✅ Verify documentation accuracy
### Testing Scenarios
#### Scenario 1: Fresh Installation (Happy Path)
```bash
# On fresh Debian 12 system
cd ~/AI-Homelab
cp .env.example .env
# Edit .env with actual values
sudo ./scripts/setup-homelab.sh
sudo ./scripts/deploy-homelab.sh
```
**Validation Checklist:**
- [ ] All 10 setup steps complete successfully
- [ ] Credentials created and displayed correctly
- [ ] All containers start and remain healthy
- [ ] Authelia login works immediately
- [ ] Traefik routes all services correctly
- [ ] SSL certificates generate within 5 minutes
- [ ] DuckDNS updates IP correctly
#### Scenario 2: Re-run Detection (Idempotency)
```bash
# Run setup again without changes
sudo ./scripts/setup-homelab.sh # Should detect existing setup
sudo ./scripts/deploy-homelab.sh # Should handle existing deployment
```
**Validation Checklist:**
- [ ] Setup script detects existing packages
- [ ] No duplicate networks or volumes created
- [ ] Secrets not regenerated unless requested
- [ ] Containers restart gracefully
- [ ] No data loss on re-deployment
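The "no duplicate networks" check depends on create-if-missing logic. A minimal sketch of that pattern follows; `ensure_network` and the network name used in the usage note are hypothetical, while `docker network inspect` and `docker network create` are standard Docker CLI subcommands.

```bash
#!/usr/bin/env bash
# Hypothetical helper: create a Docker network only if it does not exist yet.
ensure_network() {
    local name="$1"
    if docker network inspect "$name" >/dev/null 2>&1; then
        echo "Network '$name' already exists, skipping"
    else
        docker network create "$name" >/dev/null || return 1
        echo "Network '$name' created"
    fi
}
```

Calling `ensure_network example-net` twice should report creation once and a skip on the second run, which is the behavior the idempotency scenario expects.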
#### Scenario 3: Password Reset Flow
```bash
# User forgets password and needs to reset
# Manual process through Authelia tools
docker run --rm -it authelia/authelia:4.37 authelia crypto hash generate argon2 # -it allows the interactive password prompt
# Update users_database.yml
# Restart Authelia
```
**Validation Checklist:**
- [ ] Hash generation command documented
- [ ] File location clearly documented
- [ ] Restart procedure works
- [ ] New password takes effect immediately
#### Scenario 4: Clean Slate Between Tests
```bash
sudo ./scripts/reset-test-environment.sh
# Verify system is clean
# Re-deploy
sudo ./scripts/setup-homelab.sh
sudo ./scripts/deploy-homelab.sh
```
**Validation Checklist:**
- [ ] Reset completes without errors
- [ ] No orphaned containers or volumes
- [ ] Fresh deployment works identically
- [ ] No configuration drift
#### Scenario 5: Service Access Validation
After deployment, test each service:
**Core Services:**
- [ ] DuckDNS: Check IP updates at duckdns.org
- [ ] Traefik: Access dashboard at `https://traefik.DOMAIN`
- [ ] Authelia: Login at `https://auth.DOMAIN`
- [ ] Gluetun: Verify VPN connection
**Infrastructure Services:**
- [ ] Dockge: Access at `https://dockge.DOMAIN`
- [ ] Pi-hole: Access at `https://pihole.DOMAIN`
- [ ] Dozzle: View logs at `https://dozzle.DOMAIN`
- [ ] Glances: System monitor at `https://glances.DOMAIN`
**Dashboards:**
- [ ] Homepage: Access at `https://home.DOMAIN`
- [ ] Homarr: Access at `https://homarr.DOMAIN`
#### Scenario 6: SSL Certificate Validation
```bash
# Wait 5 minutes after deployment
curl -I https://dockge.DOMAIN
# Should show valid Let's Encrypt certificate
# Check certificate details
openssl s_client -connect dockge.DOMAIN:443 -servername dockge.DOMAIN < /dev/null 2>/dev/null | openssl x509 -noout -issuer -dates
```
**Validation Checklist:**
- [ ] Certificates issued by Let's Encrypt
- [ ] No browser security warnings
- [ ] Valid for 90 days
- [ ] Auto-renewal configured
### Edge Cases to Test
#### Edge Case 1: Invalid .env Configuration
- Missing required variables
- Invalid domain format
- Incorrect email format
- Empty password fields
**Expected**: Clear error messages, script exits gracefully
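A pre-flight check for this edge case could be sketched as below. `DOMAIN` appears in the compose files; `ACME_EMAIL` and both function names are assumptions for illustration, so substitute the variables that `.env.example` actually defines.

```bash
#!/usr/bin/env bash
# Print the value of the last "NAME=value" line in a file, without surrounding quotes.
env_value() {
    sed -n "s/^$2=//p" "$1" | tail -n 1 | tr -d "\"'"
}

# Hypothetical validator: reports every problem found, then fails if there were any.
validate_env_file() {
    local file="$1" errors=0 var value
    [ -f "$file" ] || { echo "ERROR: $file not found (copy .env.example first)" >&2; return 1; }

    for var in DOMAIN ACME_EMAIL; do   # ACME_EMAIL is an assumed variable name
        value="$(env_value "$file" "$var")"
        if [ -z "$value" ]; then
            echo "ERROR: $var is missing or empty in $file" >&2
            errors=$((errors + 1))
        fi
    done

    value="$(env_value "$file" DOMAIN)"
    if [ -n "$value" ] && ! printf '%s' "$value" | grep -Eq '^([A-Za-z0-9-]+\.)+[A-Za-z]{2,}$'; then
        echo "ERROR: DOMAIN '$value' does not look like a hostname" >&2
        errors=$((errors + 1))
    fi

    [ "$errors" -eq 0 ]
}
```

Collecting all errors before exiting lets the user fix the whole file in one pass instead of re-running the script once per mistake.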
#### Edge Case 2: Network Issues During Setup
- Slow internet connection
- Docker Hub rate limiting
- DNS resolution failures
**Expected**: Timeout handling, retry logic, clear error messages
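Retry logic for transient network failures can be kept to a small wrapper. The sketch below assumes a fixed delay between attempts; the function name and the attempt counts are illustrative only.

```bash
#!/usr/bin/env bash
# Hypothetical helper: retry <attempts> <delay_seconds> <command...>
retry() {
    local attempts="$1" delay="$2"; shift 2
    local n=1
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "ERROR: '$*' failed after $attempts attempts" >&2
            return 1
        fi
        echo "Attempt $n failed; retrying in ${delay}s..." >&2
        sleep "$delay"
        n=$((n + 1))
    done
}
```

For example, `retry 3 10 docker pull <image>` would try a pull up to three times with a 10-second pause, which may help with brief DNS or rate-limit hiccups.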
#### Edge Case 3: Insufficient Resources
- Low disk space (< 5GB)
- Limited memory
- CPU constraints
**Expected**: Pre-flight checks catch issues, clear warnings
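The disk space part of such a pre-flight check might be sketched as follows. The function name is hypothetical; the 5 GB threshold in the usage note comes from the edge case above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if <path> has at least <min_gb> GB available.
has_free_space() {
    local path="$1" min_gb="$2" avail_kb
    # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
    avail_kb="$(df -Pk "$path" | awk 'NR==2 {print $4}')"
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}
```

A script could call `has_free_space /opt 5 || echo "WARNING: less than 5GB free"` before pulling any images.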
#### Edge Case 4: Port Conflicts
- Ports 80/443 already in use
- Other services conflicting
**Expected**: Clear error about port conflicts, suggestion to stop conflicting services
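One way to detect this condition is to scan the listening sockets reported by `ss -ltn`. The parser below is a sketch that reads that output on stdin, assuming the usual column layout where the fourth column is the local address and port; the function name is hypothetical.

```bash
#!/usr/bin/env bash
# Reads `ss -ltn` output on stdin; succeeds if any socket listens on the given port.
port_in_listing() {
    local port="$1"
    # Column 4 is the local address, e.g. 0.0.0.0:80 or [::]:443; the port follows the last colon.
    awk -v p="$port" 'NR > 1 { n = split($4, a, ":"); if (a[n] == p) found = 1 } END { exit !found }'
}
```

Usage could be as simple as `ss -ltn | port_in_listing 80 && echo "Port 80 is already in use"`, repeated for port 443.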
#### Edge Case 5: Docker Daemon Issues
- Docker not running
- User not in docker group
- Docker version too old
**Expected**: Clear instructions to fix, graceful failure
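For the "Docker version too old" case, a version comparison can lean on GNU `sort -V`. This is a sketch only: the helper name is hypothetical and the minimum version shown in the usage note is an assumed value, not a documented project requirement.

```bash
#!/usr/bin/env bash
# True if version $1 is greater than or equal to version $2 (GNU sort version ordering).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

A check might then read `version_ge "$(docker version --format '{{.Server.Version}}')" 20.10.0 || echo "Docker is too old"`, with `20.10.0` replaced by whatever minimum the scripts actually need.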
### Documentation Validation
Review and test all documentation:
**README.md:**
- [ ] Quick start instructions accurate
- [ ] Prerequisites clearly listed
- [ ] Feature list up to date
**docs/getting-started.md:**
- [ ] Step-by-step guide works
- [ ] Screenshots/examples current
- [ ] Troubleshooting section helpful
**AGENT_INSTRUCTIONS.md:**
- [ ] Reflects current patterns
- [ ] Round 8 updates included
- [ ] Testing guidelines current
**Scripts README:**
- [ ] Each script documented
- [ ] Parameters explained
- [ ] Examples provided
### Performance Testing
#### Deployment Speed
- [ ] Setup script: < 10 minutes (excluding NVIDIA)
- [ ] Deploy script: < 5 minutes
- [ ] Reset script: < 2 minutes
#### Resource Usage
- [ ] Memory usage reasonable (< 4GB for core services)
- [ ] CPU usage acceptable
- [ ] Disk space usage documented
#### Container Health
```bash
# Health state is shown inside the Status column, e.g. "Up 5 minutes (healthy)"
docker ps --format "table {{.Names}}\t{{.Status}}"
```
- [ ] All containers show "Up"
- [ ] Health checks passing
- [ ] No restart loops
### Security Validation
#### Authentication
- [ ] SSO required for admin interfaces
- [ ] Plex/Jellyfin bypass works
- [ ] Password complexity enforced
- [ ] Session management working
#### Network Security
- [ ] Firewall configured correctly
- [ ] Only required ports open
- [ ] Docker networks isolated
- [ ] Traefik not exposing unnecessary endpoints
#### Secrets Management
- [ ] No secrets in git
- [ ] .env file permissions correct (600)
- [ ] Credentials files protected
- [ ] Password files owned by user
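The permission items in this checklist can be verified mechanically. A minimal sketch, assuming GNU `stat` (as shipped with Debian); the function name is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: warn and fail if a file's octal mode differs from the expected one.
check_mode() {
    local file="$1" expected="$2" actual
    actual="$(stat -c '%a' "$file")" || return 1
    if [ "$actual" != "$expected" ]; then
        echo "WARNING: $file has mode $actual, expected $expected (fix: chmod $expected $file)" >&2
        return 1
    fi
}
```

For example, `check_mode .env 600` would flag a world-readable `.env` and print the command needed to correct it.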
### User Experience Improvements
#### Messages and Prompts
- [ ] All log messages clear and actionable
- [ ] Error messages include solutions
- [ ] Success messages confirm what happened
- [ ] Progress indicators for long operations
#### Help and Guidance
- [ ] --help flag on all scripts
- [ ] Examples in comments
- [ ] Links to documentation
- [ ] Troubleshooting hints
#### Visual Presentation
- [ ] Colors used effectively
- [ ] Important info highlighted
- [ ] Credentials displayed prominently
- [ ] Boxes/separators for clarity
### Known Issues to Verify Fixed
1. **Round 6**: Password hash extraction format
2. **Round 6**: Credential flow between scripts
3. **Round 6**: Directory vs file conflicts
4. **Round 7**: System crashes from cleanup
5. **Round 7**: Password hash corruption from heredoc
### Potential Improvements for Future Rounds
#### Script Enhancements
- Add `--dry-run` mode to preview changes
- Add `--verbose` flag for detailed logging
- Add `--skip-prompts` for automation
- Add backup/restore functionality
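A `--dry-run` mode is often implemented by routing every state-changing command through one wrapper. The sketch below assumes a `DRY_RUN` variable set by option parsing that is not shown; both names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical flag; a real script would set this while parsing --dry-run.
DRY_RUN="${DRY_RUN:-0}"

# Print the command instead of executing it when dry-run mode is active.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "[dry-run] $*"
    else
        "$@"
    fi
}
```

Commands such as `run rm -rf ./some-dir` or `run docker compose down` would then be previewed rather than executed whenever `DRY_RUN=1`.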
#### Monitoring
- Health check dashboard
- Automated testing suite
- Continuous integration
- Deployment validation
#### Documentation
- Video walkthrough
- Troubleshooting database
- Common patterns guide
- Architecture diagrams
## Success Criteria for Round 8
### Must Have ✅
- [ ] Fresh installation works flawlessly
- [ ] All services accessible via HTTPS
- [ ] SSO working on all protected services
- [ ] SSL certificates generate automatically
- [ ] Documentation matches actual behavior
- [ ] No critical bugs or crashes
- [ ] Reset/redeploy works reliably
### Should Have ⭐
- [ ] All edge cases handled gracefully
- [ ] Performance meets targets
- [ ] Security best practices followed
- [ ] User experience polished
- [ ] Help/documentation easily found
### Nice to Have 🎯
- [ ] Automated testing framework
- [ ] CI/CD pipeline
- [ ] Health monitoring
- [ ] Backup/restore tools
- [ ] Migration guides
## Testing Procedure
### Day 1: Fresh Install Testing
1. Provision fresh Debian 12 VM/server
2. Clone repository
3. Configure .env
4. Run setup script
5. Run deploy script
6. Validate all services
7. Document any issues
### Day 2: Edge Case Testing
1. Test invalid configurations
2. Test network failure scenarios
3. Test resource constraints
4. Test port conflicts
5. Test Docker issues
6. Document findings
### Day 3: Integration Testing
1. Test SSL certificate generation
2. Test SSO on all services
3. Test service interactions
4. Test external access
5. Test mobile access
6. Document results
### Day 4: Reset and Re-deploy
1. Reset environment
2. Re-deploy from scratch
3. Verify identical results
4. Test multiple reset cycles
5. Validate no degradation
### Day 5: Documentation and Polish
1. Update all documentation
2. Fix any UX issues found
3. Add missing error handling
4. Improve help messages
5. Final validation pass
## Post-Round 8 Activities
### If Successful
- Tag release as v1.0.0-rc1
- Create release notes
- Prepare for beta testing
- Plan Round 9 (production testing with real users)
### If Issues Found
- Document all issues
- Prioritize fixes
- Implement solutions
- Plan Round 8.1 re-test
## Notes for Round 8
- Focus on **production readiness** - scripts should be reliable enough for public release
- Document **everything** - assume users have no prior Docker knowledge
- Test **realistically** - use actual domain names, real SSL certificates
- Think **long-term** - consider maintenance, updates, migrations
---
**Remember**: Round 8 is about validating that the entire system is ready for real-world use by actual users, not just test environments.


@@ -1,361 +0,0 @@
# Round 9 Testing - Bug Fixes and Improvements
## Mission Context
Building on the successful Round 8 deployment, this round focused on fixing issues discovered during testing and on improving repository quality.
## Status
- **Testing Date**: January 14, 2026
- **Test System**: Debian 12 local environment
- **Deployment Status**: Core, infrastructure, dashboards, and media deployed successfully
- **Issues Found**: 11 actionable bugs/improvements identified
## Issues Identified and Fixed
### 1. ✅ Authelia Session Timeout Too Short
**Problem**: Session timeouts set to 1h expiration and 5m inactivity were too aggressive
**Impact**: Users had to re-login frequently, poor UX
**Fix**: Updated [config-templates/authelia/configuration.yml](config-templates/authelia/configuration.yml#L60-L65)
- Changed `expiration: 1h` → `24h`
- Changed `inactivity: 5m` → `24h`
- Added helpful comments explaining values
### 2. ✅ Homepage Dashboard References Old Stack Name
**Problem**: Homepage still referred to `media-extended` stack (renamed to `media-management`)
**Impact**: Confusing documentation, inconsistent naming
**Fix**: Updated [config-templates/homepage/services.yaml](config-templates/homepage/services.yaml#L91)
- Changed "Media Extended Stack (media-extended.yml)" → "Media Management Stack (media-management.yml)"
### 3. ✅ Old Media-Extended Directory
**Problem**: Developer notes mentioned obsolete `media-extended` folder
**Status**: Verified folder doesn't exist - already cleaned up in previous round
**Action**: Marked as complete (no action needed)
### 4. ✅ Media-Management Stack - Invalid Image Tags
**Problem**: Multiple services using `:latest` tags (anti-pattern) and invalid volume paths with bash expressions `$(basename $file .yml)`
**Impact**: Unpredictable deployments, broken volume mounts
**Fix**: Updated [docker-compose/media-management.yml](docker-compose/media-management.yml)
**Image Tag Fixes**:
- `lidarr:latest` → `lidarr:2.0.7`
- `lazylibrarian:latest` → `lazylibrarian:1.10.0`
- `mylar3:latest` → `mylar3:0.7.0`
- `jellyseerr:latest` → `jellyseerr:1.7.0`
- `flaresolverr:latest` → `flaresolverr:v3.3.16`
- `tdarr:latest` → `tdarr:2.17.01`
- `tdarr_node:latest` → `tdarr_node:2.17.01`
- `unmanic:latest` → `unmanic:0.2.5`
- Kept `readarr:develop` (still in active development)
**Volume Path Fixes**:
- Fixed all instances of `./$(basename $file .yml)/config``./service-name/config`
- Fixed inconsistent absolute paths → relative paths (`./<service>/config`)
- Added service access URLs section at top of file
### 5. ✅ Utilities Stack - Invalid Image Tags
**Problem**: Similar issues with `:latest` tags and bash volume expressions
**Fix**: Updated [docker-compose/utilities.yml](docker-compose/utilities.yml)
**Image Tag Fixes**:
- `backrest:latest` → `backrest:v1.1.0`
- `duplicati:latest` → `duplicati:2.0.7`
- `formio:latest` → `formio:2.4.1`
- `mongo:6` → `mongo:6.0` (more specific)
- `vaultwarden:latest` → `vaultwarden:1.30.1`
- `redis:alpine` → `redis:7-alpine` (more specific)
**Volume Path Fixes**:
- Fixed bash expressions → proper relative paths
- Standardized to `./service/config` pattern
- Added service access URLs section
### 6. ✅ Monitoring Stack Errors
**Problem**: Prometheus, Loki, and Promtail reported errors during deployment
**Investigation**: Config templates exist in `config-templates/` but may not be copied during deployment
**Fix**: Added service access URLs section to [docker-compose/monitoring.yml](docker-compose/monitoring.yml)
**Note**: Config file copying should be verified in deployment script
### 7. ✅ Nextcloud Untrusted Domain Error
**Problem**: Nextcloud showed "untrusted domain" error in browser
**Root Cause**:
- `NEXTCLOUD_TRUSTED_DOMAINS` set to `${DOMAIN}` instead of `nextcloud.${DOMAIN}`
- Missing `OVERWRITEHOST` environment variable
**Fix**: Updated [docker-compose/productivity.yml](docker-compose/productivity.yml) Nextcloud service:
```yaml
environment:
  - NEXTCLOUD_TRUSTED_DOMAINS=nextcloud.${DOMAIN} # Full subdomain
  - OVERWRITEHOST=nextcloud.${DOMAIN} # Added for proper URL handling
```
### 8. ✅ Productivity Stack - 404 Errors on Services
**Problem**: Services other than Mealie gave 404 errors in browser
**Root Cause**: Multiple issues:
- Invalid volume paths with `$(basename $file .yml)` expressions
- `:latest` image tags causing version mismatches
- Absolute paths instead of relative paths
**Fix**: Updated [docker-compose/productivity.yml](docker-compose/productivity.yml)
**Image Tag Fixes**:
- `nextcloud:latest` → `nextcloud:28`
- `mealie:latest` → `mealie:v1.0.0`
- `wordpress:latest` → `wordpress:6.4`
- `gitea:latest` → `gitea:1.21`
- `dokuwiki:latest` → `dokuwiki:20231007`
- `bookstack:latest` → `bookstack:23.12`
- `mediawiki:latest` → `mediawiki:1.41`
**Volume Path Fixes**:
- All services now use relative paths: `./service-name/config`
- Removed bash expressions
- Standardized structure across all services
### 9. ✅ Missing Service Access URLs in Compose Files
**Problem**: No easy reference for service URLs in Dockge UI
**Impact**: Users had to guess URLs or search documentation
**Fix**: Added commented "Service Access URLs" sections to ALL compose files:
- ✅ [docker-compose/core.yml](docker-compose/core.yml)
- ✅ [docker-compose/infrastructure.yml](docker-compose/infrastructure.yml)
- ✅ [docker-compose/dashboards.yml](docker-compose/dashboards.yml)
- ✅ [docker-compose/media.yml](docker-compose/media.yml)
- ✅ [docker-compose/media-management.yml](docker-compose/media-management.yml)
- ✅ [docker-compose/monitoring.yml](docker-compose/monitoring.yml)
- ✅ [docker-compose/productivity.yml](docker-compose/productivity.yml)
- ✅ [docker-compose/utilities.yml](docker-compose/utilities.yml)
- ✅ [docker-compose/homeassistant.yml](docker-compose/homeassistant.yml)
**Example Format**:
```yaml
# Service Access URLs:
# - Service1: https://service1.${DOMAIN}
# - Service2: https://service2.${DOMAIN}
# - Service3: No web UI (backend service)
```
### 10. ✅ Zigbee2MQTT Device Path Error
**Problem**: zigbee2mqtt container failed because `/dev/ttyACM0` USB device doesn't exist on test system
**Impact**: Stack deployment fails if user doesn't have Zigbee USB adapter
**Fix**: Updated [docker-compose/homeassistant.yml](docker-compose/homeassistant.yml)
**Changes**:
- Commented out `devices:` section with instructions
- Added notes about USB adapter requirement
- Provided common device paths: `/dev/ttyACM0`, `/dev/ttyUSB0`, `/dev/serial/by-id/...`
- Added command to find adapter: `ls -l /dev/serial/by-id/`
- Pinned image: `koenkk/zigbee2mqtt:latest` → `koenkk/zigbee2mqtt:1.35.1`
- Fixed volume path: `/opt/stacks/zigbee2mqtt/data``./zigbee2mqtt/data`
### 11. ⏳ Resource Limits Not Implemented (Deferred)
**Problem**: No CPU/memory limits on containers
**Impact**: Services can consume all system resources
**Status**: NOT FIXED - Deferred to future round
**Reason**: Need to test resource requirements per service first
**Plan**: Add deploy.resources section to compose files in future round
**Example for future implementation**:
```yaml
deploy:
  resources:
    limits:
      cpus: '2.0'
      memory: 2G
    reservations:
      cpus: '0.5'
      memory: 512M
```
## Summary of Changes
### Files Modified
1. `config-templates/authelia/configuration.yml` - Session timeouts
2. `config-templates/homepage/services.yaml` - Stack name reference
3. `docker-compose/core.yml` - Service URLs
4. `docker-compose/infrastructure.yml` - Service URLs
5. `docker-compose/dashboards.yml` - Service URLs
6. `docker-compose/media.yml` - Service URLs
7. `docker-compose/media-management.yml` - Image tags, volume paths, URLs
8. `docker-compose/monitoring.yml` - Service URLs
9. `docker-compose/productivity.yml` - Image tags, volume paths, URLs, Nextcloud fix
10. `docker-compose/utilities.yml` - Image tags, volume paths, URLs
11. `docker-compose/homeassistant.yml` - Zigbee2MQTT fix, image tags, volume paths, URLs
### New File Created
- `AGENT_INSTRUCTIONS_DEV.md` - Development-focused agent instructions
## Testing Validation
### Pre-Fix Status
- ✅ Core stack: Deployed successfully
- ✅ Infrastructure stack: Deployed successfully
- ✅ Dashboards stack: Deployed successfully
- ✅ Media stack: Deployed successfully
- ⚠️ Media-management stack: Invalid image tags
- ⚠️ Utilities stack: Invalid image tags
- ⚠️ Monitoring stack: Prometheus/Loki/Promtail errors
- ⚠️ Productivity stack: Nextcloud untrusted domain, other services 404
- ⚠️ Home Assistant stack: Zigbee2MQTT device error
### Post-Fix Expected Results
- ✅ All image tags pinned to specific versions
- ✅ All volume paths use relative `./<service>/config` pattern
- ✅ All compose files have service access URLs section
- ✅ Nextcloud will accept connections without "untrusted domain" error
- ✅ Zigbee2MQTT won't prevent stack deployment (devices commented out)
- ✅ Authelia session lasts 24 hours (better UX)
- ✅ Homepage references correct stack names
### Remaining Tasks
- [ ] Test re-deployment with fixes
- [ ] Verify Nextcloud trusted domains working
- [ ] Verify all services accessible via URLs
- [ ] Test Prometheus/Loki/Promtail with proper configs
- [ ] Implement resource limits (future round)
- [ ] Verify monitoring stack config file deployment
## Deployment Script Improvements Needed
### Config File Deployment
The deploy script should copy config templates for monitoring stack:
- `config-templates/prometheus/prometheus.yml` → `/opt/stacks/monitoring/config/prometheus/prometheus.yml`
- `config-templates/loki/loki-config.yml` → `/opt/stacks/monitoring/config/loki/loki-config.yml`
- `config-templates/promtail/promtail-config.yml` → `/opt/stacks/monitoring/config/promtail/promtail-config.yml`
**Action Item**: Update `scripts/deploy-homelab.sh` to handle monitoring configs
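How `deploy-homelab.sh` should implement this is still open. One possible shape, sketched under the assumption that existing configs must never be overwritten, is a copy-if-missing helper; the function name is hypothetical and `install -D` is the GNU coreutils option that creates missing parent directories.

```bash
#!/usr/bin/env bash
# Hypothetical helper: install a config template unless the destination already exists.
copy_config_if_missing() {
    local src="$1" dest="$2"
    [ -f "$src" ] || { echo "ERROR: template $src not found" >&2; return 1; }
    if [ -e "$dest" ]; then
        echo "Keeping existing $dest"
    else
        # -D creates leading directories; -m sets the file mode.
        install -D -m 644 "$src" "$dest"
        echo "Installed $dest"
    fi
}
```

It could be called once per template, for example `copy_config_if_missing config-templates/prometheus/prometheus.yml /opt/stacks/monitoring/config/prometheus/prometheus.yml`, which keeps re-runs from clobbering user edits.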
## Best Practices Established
### 1. Image Tag Standards
- ✅ Always pin specific versions (e.g., `service:1.2.3`)
- ❌ Never use `:latest` in production compose files
- ⚠️ Exception: Services in active development may use `:develop` or `:nightly` with clear comments
### 2. Volume Path Standards
- ✅ Use relative paths for configs: `./service-name/config:/config`
- ✅ Use absolute paths for large data: `/mnt/media:/media`
- ❌ Never use bash expressions in compose files: `$(basename $file .yml)`
- ✅ Keep data in stack directory when < 10GB
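The image tag and volume path standards above lend themselves to a quick lint pass. The following is a rough sketch, not a full Compose parser: it flags explicit `:latest` tags and `$(` command substitutions but will not catch images that omit a tag altogether. The function name is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical lint: print offending lines and fail if a compose file breaks the standards.
lint_compose_file() {
    local file="$1" problems=0
    # Flag images explicitly tagged :latest.
    if grep -nE '^[[:space:]]*image:.*:latest([[:space:]]|$)' "$file"; then
        problems=1
    fi
    # Flag shell command substitutions, which Compose does not evaluate.
    if grep -nF '$(' "$file"; then
        problems=1
    fi
    [ "$problems" -eq 0 ]
}
```

Running it over every file, for example `for f in docker-compose/*.yml; do lint_compose_file "$f" || echo "Fix $f"; done`, would surface regressions before deployment.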
### 3. Service Documentation Standards
- ✅ Every compose file must have "Service Access URLs" section at top
- ✅ Include notes about SSO bypass (Plex, Jellyfin)
- ✅ Document special requirements (USB devices, external drives)
- ✅ Use comments to explain non-obvious configurations
### 4. Optional Hardware Requirements
- ✅ Comment out hardware device sections by default
- ✅ Provide clear instructions for uncommenting
- ✅ List common device paths
- ✅ Provide commands to find device paths
- ✅ Don't prevent deployment for optional features
## Quality Improvements
### Repository Health
- **Before**: 40+ services with `:latest` tags
- **After**: All services pinned to specific versions
- **Impact**: Predictable deployments, easier rollbacks
### User Experience
- **Before**: No URL reference, users had to guess
- **After**: Every compose file lists service URLs
- **Impact**: Faster service access, less documentation lookup
### Deployment Reliability
- **Before**: Volume path bash expressions caused failures
- **After**: All paths use proper compose syntax
- **Impact**: Deployments work in all environments
### Configuration Accuracy
- **Before**: Nextcloud rejected connections (untrusted domain)
- **After**: Proper domain configuration for reverse proxy
- **Impact**: Service works immediately after deployment
## Lessons Learned
### 1. Volume Path Patterns
Bash expressions like `$(basename $file .yml)` are not evaluated by Docker Compose; only `${VAR}`-style variables are interpolated. Use one of these instead:
- Relative paths: `./service-name/config`
- Environment variables: `${STACK_NAME}/config`
- Fixed strings: `/opt/stacks/service-name/config`
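
A small guard can catch this pattern before deployment. This is a sketch under assumptions: `check_no_shell_expressions` is a hypothetical name, and the check is a plain text search for `$(`, so it also flags the sequence inside comments. Backticks are not searched for, because Traefik rules such as ``Host(`...`)`` use them legitimately.

```bash
#!/usr/bin/env bash
# Hypothetical check: fail when a compose file contains "$(", which Docker Compose
# does not evaluate as a shell command substitution.
# Usage: check_no_shell_expressions FILE...
check_no_shell_expressions() {
    # -n adds line numbers, -H adds the file name, -F treats the pattern as a fixed string
    if grep -nHF -e '$(' "$@"; then
        echo "error: shell command substitution found in compose file" >&2
        return 1
    fi
    return 0
}
```

The function prints each offending line and returns a non-zero status, which makes it usable as a pre-deployment gate.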
### 2. Image Tag Strategy
Using `:latest` causes:
- Unpredictable behavior after updates
- Difficult troubleshooting (which version?)
- Breaking changes without warning
Solution: Pin all tags to specific versions
### 3. Optional Hardware Handling
Don't make deployment fail for optional features:
- Comment out device mappings by default
- Provide clear enabling instructions
- Test deployment without optional hardware
- Document required vs. optional components
### 4. Documentation in Code
Service URLs in compose files are incredibly valuable:
- Users find services faster
- Dockge UI shows URLs in file view
- No need to search external documentation
- Self-documenting infrastructure
## Next Steps
### Immediate (Round 9 Continuation)
1. Test re-deployment with all fixes
2. Validate Nextcloud trusted domains
3. Verify all service URLs work
4. Check monitoring stack functionality
### Short-term (Round 10)
1. Implement resource limits per service
2. Test resource limit effectiveness
3. Add healthcheck configurations
4. Improve monitoring stack config deployment
### Long-term
1. Create automated testing framework
2. Add validation script for compose files
3. Implement pre-deployment checks
4. Create rollback procedures
## Success Metrics
### Fixes Completed: 10/11 (91%)
- ✅ Authelia session timeout
- ✅ Homepage stack name
- ✅ Media-extended cleanup (already done)
- ✅ Media-management image tags
- ✅ Utilities image tags
- ✅ Monitoring stack URLs
- ✅ Nextcloud trusted domains
- ✅ Productivity stack fixes
- ✅ Service URL sections
- ✅ Zigbee2MQTT device handling
- ⏳ Resource limits (deferred)
### Code Quality Improvements
- **Image Tags**: 40+ services now properly versioned
- **Volume Paths**: 20+ services fixed to use relative paths
- **Documentation**: 9 compose files now have URL sections
- **Error Handling**: 2 services made deployment-optional
### User Experience Improvements
- **Session Duration**: 24h vs 1h (24x longer)
- **Service Discovery**: URL sections in all files
- **Error Messages**: Clear instructions for optional features
- **Reliability**: No more bash expression volume errors
## Conclusion
Round 9 successfully addressed all critical issues found during Round 8 testing. The repository is now significantly more reliable, maintainable, and user-friendly.
**Key Achievements**:
- Eliminated `:latest` tag anti-pattern across entire codebase
- Standardized volume paths to relative pattern
- Added comprehensive URL documentation to all stacks
- Fixed critical Nextcloud deployment issue
- Made optional hardware features non-blocking
**Repository Status**: Ready for fresh installation testing in Round 10

@@ -0,0 +1,19 @@
http:
  routers:
    # Individual Services
    homeassistant:
      rule: "Host(`hass.${DOMAIN}`)"
      entryPoints:
        - websecure
      service: homeassistant
      tls:
        certResolver: letsencrypt
      middlewares:
        - authelia@docker
  services:
    # Individual Services
    homeassistant:
      loadBalancer:
        servers:
          - url: "http://${HOMEASSISTANT_IP}:8123"
        passHostHeader: true

@@ -0,0 +1,580 @@
http:
routers:
backrest-jarvis:
rule: "Host(`backrest.${DOMAIN}`)"
entryPoints:
- websecure
service: backrest-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-backrest@file
- authelia@docker
bookstack-jarvis:
rule: "Host(`bookstack.${DOMAIN}`)"
entryPoints:
- websecure
service: bookstack-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-bookstack@file
- authelia@docker
bitwarden-jarvis:
rule: "Host(`bitwarden.${DOMAIN}`)"
entryPoints:
- websecure
service: bitwarden-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-bitwarden@file
- authelia@docker
calibre-web-jarvis:
rule: "Host(`calibre.${DOMAIN}`)"
entryPoints:
- websecure
service: calibre-web-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-calibre-web@file
- authelia@docker
code-jarvis:
rule: "Host(`code.${DOMAIN}`)"
entryPoints:
- websecure
service: code-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-code-server@file
- authelia@docker
dockge-jarvis:
rule: "Host(`jarvis.${DOMAIN}`)"
entryPoints:
- websecure
service: dockge-jarvis
tls:
certResolver: letsencrypt
middlewares:
- authelia@docker
dockhand-jarvis:
rule: "Host(`dockhand.${DOMAIN}`)"
entryPoints:
- websecure
service: dockhand-jarvis
tls:
certResolver: letsencrypt
middlewares:
- authelia@docker
dokuwiki-jarvis:
rule: "Host(`dokuwiki.${DOMAIN}`)"
entryPoints:
- websecure
service: dokuwiki-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-dokuwiki@file
- authelia@docker
dozzle-jarvis:
rule: "Host(`dozzle.${DOMAIN}`)"
entryPoints:
- websecure
service: dozzle-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-dozzle@file
- authelia@docker
duplicati-jarvis:
rule: "Host(`duplicati.${DOMAIN}`)"
entryPoints:
- websecure
service: duplicati-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-duplicati@file
- authelia@docker
formio-jarvis:
rule: "Host(`formio.${DOMAIN}`)"
entryPoints:
- websecure
service: formio-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-formio@file
- authelia@docker
gitea-jarvis:
rule: "Host(`gitea.${DOMAIN}`)"
entryPoints:
- websecure
service: gitea-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-gitea@file
- authelia@docker
glances-jarvis:
rule: "Host(`glances.jarvis.${DOMAIN}`)"
entryPoints:
- websecure
service: glances-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-glances@file
- authelia@docker
homepage-jarvis:
rule: "Host(`homepage.jarvis.${DOMAIN}`)"
entryPoints:
- websecure
service: homepage-jarvis
tls:
certResolver: letsencrypt
middlewares:
- authelia@docker
homarr-jarvis:
rule: "Host(`homarr.${DOMAIN}`)"
entryPoints:
- websecure
service: homarr-jarvis
tls:
certResolver: letsencrypt
middlewares:
- authelia@docker
- sablier-jarvis-homarr@file
jellyfin-jarvis:
rule: "Host(`jellyfin.${DOMAIN}`)"
entryPoints:
- websecure
service: jellyfin-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-jellyfin@file
# No authelia middleware for media apps
kopia-jarvis:
rule: "Host(`kopia.${DOMAIN}`)"
entryPoints:
- websecure
service: kopia-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-kopia@file
- authelia@docker
mealie-jarvis:
rule: "Host(`mealie.${DOMAIN}`)"
entryPoints:
- websecure
service: mealie-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-mealie@file
- authelia@docker
motioneye-jarvis:
rule: "Host(`motioneye.${DOMAIN}`)"
entryPoints:
- websecure
service: motioneye-jarvis
tls:
certResolver: letsencrypt
middlewares:
- authelia@docker
mediawiki-jarvis:
rule: "Host(`mediawiki.${DOMAIN}`)"
entryPoints:
- websecure
service: mediawiki-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-mediawiki@file
- authelia@docker
nextcloud-jarvis:
rule: "Host(`nextcloud.${DOMAIN}`)"
entryPoints:
- websecure
service: nextcloud-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-nextcloud@file
- authelia@docker
openkm-jarvis:
rule: "Host(`openkm.${DOMAIN}`)"
entryPoints:
- websecure
service: openkm-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-openkm@file
- authelia@docker
openwebui-jarvis:
rule: "Host(`openwebui.${DOMAIN}`)"
entryPoints:
- websecure
service: openwebui-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-openwebui@file
- authelia@docker
qbittorrent-jarvis:
rule: "Host(`torrents.${DOMAIN}`)"
entryPoints:
- websecure
service: qbittorrent-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-arr@file
- authelia@docker
tdarr-jarvis:
rule: "Host(`tdarr.${DOMAIN}`)"
entryPoints:
- websecure
service: tdarr-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-arr@file
- authelia@docker
unmanic-jarvis:
rule: "Host(`unmanic.${DOMAIN}`)"
entryPoints:
- websecure
service: unmanic-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-unmanic@file
- authelia@docker
wordpress-jarvis:
rule: "Host(`knot-u.${DOMAIN}`)"
entryPoints:
- websecure
service: wordpress-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-wordpress@file
- authelia@file
# Arr Services
jellyseerr-jarvis:
rule: "Host(`jellyseerr.${DOMAIN}`)"
entryPoints:
- websecure
service: jellyseerr-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-arr@file
- authelia@docker
prowlarr-jarvis:
rule: "Host(`prowlarr.${DOMAIN}`)"
entryPoints:
- websecure
service: prowlarr-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-arr@file
- authelia@docker
radarr-jarvis:
rule: "Host(`radarr.${DOMAIN}`)"
entryPoints:
- websecure
service: radarr-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-arr@file
- authelia@docker
sonarr-jarvis:
rule: "Host(`sonarr.${DOMAIN}`)"
entryPoints:
- websecure
service: sonarr-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-arr@file
- authelia@docker
lidarr-jarvis:
rule: "Host(`lidarr.${DOMAIN}`)"
entryPoints:
- websecure
service: lidarr-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-arr@file
- authelia@docker
readarr-jarvis:
rule: "Host(`readarr.${DOMAIN}`)"
entryPoints:
- websecure
service: readarr-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-arr@file
- authelia@docker
mylar3-jarvis:
rule: "Host(`mylar3.${DOMAIN}`)"
entryPoints:
- websecure
service: mylar3-jarvis
tls:
certResolver: letsencrypt
middlewares:
- sablier-jarvis-arr@file
- authelia@docker
services:
backrest-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:9898"
passHostHeader: true
bitwarden-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8000"
passHostHeader: true
bookstack-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:6875"
passHostHeader: true
calibre-web-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8083"
passHostHeader: true
code-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8079"
passHostHeader: true
dockge-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:5001"
passHostHeader: true
dockhand-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:3003"
passHostHeader: true
dokuwiki-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8087"
passHostHeader: true
dozzle-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8085"
passHostHeader: true
duplicati-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8200"
passHostHeader: true
formio-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:3002"
passHostHeader: true
gitea-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:3010"
passHostHeader: true
glances-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:61208"
passHostHeader: true
homarr-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:7575"
passHostHeader: true
homepage-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:3000"
passHostHeader: true
jellyfin-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8096"
passHostHeader: true
kopia-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:51515"
passHostHeader: true
mealie-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:9000"
passHostHeader: true
mediawiki-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8084"
passHostHeader: true
motioneye-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8081"
passHostHeader: true
nextcloud-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8089"
passHostHeader: true
openkm-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:18080"
passHostHeader: true
openwebui-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:3000"
passHostHeader: true
qbittorrent-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8080"
passHostHeader: true
tdarr-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8265"
passHostHeader: true
unmanic-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8888"
passHostHeader: true
wordpress-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8088"
passHostHeader: true
# Arr Services
jellyseerr-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:5055"
passHostHeader: true
prowlarr-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:9696"
passHostHeader: true
radarr-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:7878"
passHostHeader: true
sonarr-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8989"
passHostHeader: true
lidarr-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8686"
passHostHeader: true
readarr-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8787"
passHostHeader: true
mylar3-jarvis:
loadBalancer:
servers:
- url: "http://192.168.4.11:8090"
passHostHeader: true

@@ -0,0 +1,309 @@
http:
middlewares:
sablier-jarvis-arr:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-arr
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: Arr Apps
theme: ghost
show-details-by-default: true
sablier-jarvis-backrest:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-backrest
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: Backrest
theme: ghost
show-details-by-default: true
sablier-jarvis-bookstack:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-bookstack
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: Bookstack
theme: ghost
show-details-by-default: true
sablier-jarvis-jellyfin:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-jellyfin
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: Jellyfin
theme: ghost
show-details-by-default: true
sablier-jarvis-calibre-web:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-calibre-web
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: Calibre Web
theme: ghost
show-details-by-default: true
sablier-jarvis-code-server:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-code-server
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: Code Server
theme: ghost
show-details-by-default: true
sablier-jarvis-bitwarden:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-bitwarden
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: bitwarden
theme: ghost
show-details-by-default: true
sablier-jarvis-wordpress:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-wordpress
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: wordpress
theme: ghost
show-details-by-default: true
sablier-jarvis-nextcloud:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-nextcloud
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: NextCloud
theme: ghost
show-details-by-default: true
sablier-jarvis-mediawiki:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-mediawiki
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: mediawiki
theme: ghost
show-details-by-default: true
sablier-jarvis-mealie:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-mealie
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: Mealie
theme: ghost
show-details-by-default: true
sablier-jarvis-gitea:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-gitea
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: Gitea
theme: ghost
show-details-by-default: true
sablier-jarvis-formio:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-formio
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: FormIO
theme: ghost
show-details-by-default: true
sablier-jarvis-dozzle:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-dozzle
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: dozzle
theme: ghost
show-details-by-default: true
sablier-jarvis-duplicati:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-duplicati
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: Duplicati
theme: ghost
show-details-by-default: true
sablier-jarvis-glances:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-glances
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: Glances
theme: ghost
show-details-by-default: true
sablier-jarvis-homarr:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-homarr
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: Homarr
theme: ghost
show-details-by-default: true
sablier-jarvis-komodo:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-komodo
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: Komodo
theme: ghost
show-details-by-default: true
sablier-jarvis-kopia:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-kopia
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: Kopia
theme: ghost
show-details-by-default: true
sablier-jarvis-openkm:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-openkm
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: OpenKM
theme: ghost
show-details-by-default: true
sablier-jarvis-openwebui:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-openwebui
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: OpenWebUI
theme: ghost
show-details-by-default: true
sablier-jarvis-pulse:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-pulse
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: Pulse
theme: ghost
show-details-by-default: true
sablier-jarvis-tdarr:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-tdarr
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: Tdarr
theme: ghost
show-details-by-default: true
sablier-jarvis-unmanic:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-unmanic
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: Unmanic
theme: ghost
show-details-by-default: true
sablier-jarvis-dokuwiki:
plugin:
sablier:
sablierUrl: http://sablier-service:10000
group: jarvis-dokuwiki
sessionDuration: 30m
ignoreUserAgent: curl
dynamic:
displayName: DokuWiki
theme: ghost
show-details-by-default: true
authelia:
forwardauth:
address: http://authelia:9091/api/verify?rd=https://auth.${DOMAIN}/
authResponseHeaders:
- X-Secret
trustForwardHeader: true

@@ -100,10 +100,40 @@ PGID=1000
TZ=America/New_York
USERDIR=/home/username/homelab
DATADIR=/mnt/data
```
Never commit `.env` files to git! Use `.env.example` as a template instead.
## Labels
### To enable Authelia SSO
```yaml
```
### Traefik routing labels
If Traefik is on the same server, add these labels.
```yaml
```
>If Traefik is on a separate server, don't use Traefik labels in compose files; use an external host YAML file instead.
### Sablier middleware labels
Add these labels to enable on-demand functionality.
```yaml
labels:
  - sablier.enable=true
  - sablier.group=<server>-<service name>
  - sablier.start-on-demand=true
```
## Best Practices
1. **Pin Versions**: Always specify image versions (e.g., `nginx:1.25.3` not `nginx:latest`)

@@ -1,134 +1,115 @@
# Core Infrastructure Stack
# Essential services required for the homelab to function
# Deploy this stack FIRST before any other services
# Place in /opt/stacks/core/docker-compose.yml
# Service Access URLs:
# - DuckDNS: No web UI (updates IP automatically)
# - Traefik: https://traefik.${DOMAIN}
# - Authelia: https://auth.${DOMAIN}
x-dockge:
urls:
- https://auth.${DOMAIN}
services:
# DuckDNS - Dynamic DNS updater
# Updates your public IP automatically for Let's Encrypt SSL
duckdns:
image: lscr.io/linuxserver/duckdns:latest
container_name: duckdns
restart: unless-stopped
deploy:
resources:
limits:
cpus: '0.10' # Minimal CPU for DNS updates
memory: 64M # Very low memory usage
pids: 128 # Minimal processes
reservations:
cpus: '0.05'
memory: 32M
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
- TZ=${TZ}
- SUBDOMAINS=${DUCKDNS_SUBDOMAINS} # Your subdomain(s), comma separated
- TOKEN=${DUCKDNS_TOKEN} # Your DuckDNS token
- UPDATE_IP=ipv4 # or ipv6, or both
- SUBDOMAINS=${DUCKDNS_SUBDOMAINS}
- TOKEN=${DUCKDNS_TOKEN}
volumes:
- ./duckdns:/config
labels:
- "homelab.category=infrastructure"
- "homelab.description=Dynamic DNS updater"
# Traefik - Reverse proxy with automatic SSL
# Routes all traffic and manages Let's Encrypt certificates
traefik:
image: traefik:v2.11
container_name: traefik
restart: unless-stopped
security_opt:
- no-new-privileges:true
deploy:
resources:
limits:
cpus: '0.50' # Limit to 50% of one CPU core
memory: 256M # Limit to 256MB RAM
reservations:
cpus: '0.25' # Reserve 25% of one CPU core
memory: 128M # Reserve 128MB RAM
- ./duckdns/config:/config
networks:
- traefik-network
ports:
- "80:80" # HTTP
- "443:443" # HTTPS
- "8080:8080" # Dashboard (protected with Authelia)
volumes:
- /etc/localtime:/etc/localtime:ro
- /var/run/docker.sock:/var/run/docker.sock:ro
- ./traefik/traefik.yml:/traefik.yml:ro
- ./traefik/dynamic:/dynamic:ro
- ./traefik/acme.json:/acme.json
traefik:
image: traefik:v3
container_name: traefik
restart: unless-stopped
command: ["--configFile=/config/traefik.yml"]
environment:
- CF_DNS_API_TOKEN=${CF_DNS_API_TOKEN} # If using Cloudflare DNS challenge
- DUCKDNS_TOKEN=${DUCKDNS_TOKEN} # If using DuckDNS
- DUCKDNS_TOKEN=${DUCKDNS_TOKEN}
ports:
- 80:80
- 443:443
- 8080:8080
volumes:
- ./traefik/config:/config
- ./traefik/letsencrypt:/letsencrypt
- ./traefik/dynamic:/dynamic
- /var/run/docker.sock:/var/run/docker.sock:ro
networks:
- traefik-network
labels:
- "traefik.enable=true"
# Dashboard
- "traefik.http.routers.traefik.rule=Host(`traefik.${DOMAIN}`)"
- "traefik.http.routers.traefik.entrypoints=websecure"
- "traefik.http.routers.traefik.tls.certresolver=letsencrypt"
- "traefik.http.routers.traefik.tls.domains[0].main=${DOMAIN}"
- "traefik.http.routers.traefik.tls.domains[0].sans=*.${DOMAIN}"
- "traefik.http.routers.traefik.middlewares=authelia@docker"
- "traefik.http.routers.traefik.service=api@internal"
# Global HTTP to HTTPS redirect
- "traefik.http.routers.http-catchall.rule=hostregexp(`{host:.+}`)"
- "traefik.http.routers.http-catchall.entrypoints=web"
- "traefik.http.routers.http-catchall.middlewares=redirect-to-https"
- "traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https"
- "traefik.http.services.traefik.loadbalancer.server.port=8080"
- "homelab.category=dashboards"
- "homelab.description=Personal dashboard and service overview"
- "x-dockge.url=https://traefik.${DOMAIN}"
depends_on:
- duckdns
# Authelia - SSO authentication
# Protects all admin services with single sign-on
authelia:
image: authelia/authelia:4.37
image: authelia/authelia:latest
container_name: authelia
restart: unless-stopped
deploy:
resources:
limits:
cpus: '0.25' # Light CPU usage for auth
memory: 128M # Low memory usage
reservations:
cpus: '0.10'
memory: 64M
networks:
- traefik-network
volumes:
- ./authelia/configuration.yml:/config/configuration.yml:ro
- ./authelia/users_database.yml:/config/users_database.yml
- authelia-data:/data
environment:
- TZ=${TZ}
- AUTHELIA_JWT_SECRET=${AUTHELIA_JWT_SECRET}
- AUTHELIA_SESSION_SECRET=${AUTHELIA_SESSION_SECRET}
- AUTHELIA_STORAGE_ENCRYPTION_KEY=${AUTHELIA_STORAGE_ENCRYPTION_KEY}
labels:
- "traefik.enable=true"
- "traefik.http.routers.authelia.rule=Host(`auth.${DOMAIN}`)"
- "traefik.http.routers.authelia.entrypoints=websecure"
- "traefik.http.routers.authelia.tls=true"
- "traefik.http.services.authelia.loadbalancer.server.port=9091"
# Authelia middleware for other services
- "traefik.http.middlewares.authelia.forwardauth.address=http://authelia:9091/api/verify?rd=https://auth.${DOMAIN}"
- "traefik.http.middlewares.authelia.forwardauth.trustForwardHeader=true"
- "traefik.http.middlewares.authelia.forwardauth.authResponseHeaders=Remote-User,Remote-Groups,Remote-Name,Remote-Email"
- "x-dockge.url=https://auth.${DOMAIN}"
volumes:
- ./authelia/config:/config
- ./authelia/secrets:/secrets
networks:
- traefik-network
depends_on:
- traefik
labels:
- traefik.enable=true
- traefik.http.routers.authelia.rule=Host(`auth.${DOMAIN}`)
- traefik.http.routers.authelia.entrypoints=websecure
- traefik.http.routers.authelia.tls.certresolver=letsencrypt
- traefik.http.routers.authelia.service=authelia
- traefik.http.services.authelia.loadbalancer.server.port=9091
- traefik.http.middlewares.authelia.forwardauth.address=http://authelia:9091/api/verify?rd=https://auth.${DOMAIN}/
- traefik.http.middlewares.authelia.forwardauth.authResponseHeaders=X-Secret
- traefik.http.middlewares.authelia.forwardauth.trustForwardHeader=true
- x-dockge.url=https://auth.${DOMAIN}
dockerproxy:
image: tecnativa/docker-socket-proxy:latest
container_name: dockerproxy
privileged: true
restart: unless-stopped
ports:
- 2375:2375
volumes:
authelia-data:
driver: local
- /var/run/docker.sock:/var/run/docker.sock:ro
environment:
- CONTAINERS=1
- SERVICES=1
- TASKS=1
- NETWORKS=1
- NODES=1
labels:
- homelab.category=infrastructure
- homelab.description=Docker socket proxy for security
# Sablier - Lazy loading service for Docker containers
sablier-service:
image: sablierapp/sablier:latest
container_name: sablier-service
restart: unless-stopped
networks:
- traefik-network
environment:
- SABLIER_PROVIDER=docker
- SABLIER_DOCKER_API_VERSION=1.53
- SABLIER_DOCKER_NETWORK=traefik-network
- SABLIER_LOG_LEVEL=debug
- DOCKER_HOST=tcp://192.168.4.11:2375
ports:
- 10000:10000
labels:
- homelab.category=infrastructure
- homelab.description=Lazy loading service for Docker containers
networks:
traefik-network:

@@ -1,31 +0,0 @@
# Traefik Dynamic Configuration
# Copy to /opt/stacks/traefik/dynamic/routes.yml
# Add custom routes here that aren't defined via Docker labels
http:
routers:
# Example custom route
# custom-service:
# rule: "Host(`custom.example.com`)"
# entryPoints:
# - websecure
# middlewares:
# - authelia@docker
# tls:
# certResolver: letsencrypt
# service: custom-service
services:
# Example custom service
# custom-service:
# loadBalancer:
# servers:
# - url: "http://192.168.1.100:8080"
middlewares:
# Additional middlewares can be defined here
# Example: Rate limiting
# rate-limit:
# rateLimit:
# average: 100
# burst: 50

@@ -80,7 +80,6 @@ services:
- "traefik.http.routers.homarr.middlewares=authelia@docker"
- "traefik.http.services.homarr.loadbalancer.server.port=7575"
- "x-dockge.url=https://homarr.${DOMAIN}"
- "x-dockge.url=https://homarr.${DOMAIN}"
networks:
homelab-network:

@@ -0,0 +1,117 @@
# On Demand Remote Services with Authelia, Sablier & Traefik
## 4 Step Process
1. Add route & service in Traefik external hosts file
2. Add middleware in Sablier config file (sablier.yml)
3. Add labels to compose files on Remote Host
4. Restart services
## Required Information
```bash
<server> - the hostname of the remote server
<service> - the application/container name
<full domain> - the base url for your proxy host (my-subdomain.duckdns.org)
<ip address> - the ip address of the remote server
<port> - the external port exposed by the service
<service display name> - how it will appear on the now loading page
<group name> - use <service name> for a single service, or something descriptive for the group of services that will start together.
```
## Step 1: Add route & service in Traefik external hosts file
### In /opt/stacks/core/traefik/dynamic/external-host-server_name.yml
```yaml
http:
  routers:
    # Add a section under routers for each Route (Proxy Host)
    <service>-<server>:
      rule: "Host(`<service>.<full domain>`)"
      entryPoints:
        - websecure
      service: <service>-<server>
      tls:
        certResolver: letsencrypt
      middlewares:
        - sablier-<server>-<service>@file
        - authelia@docker # comment this line to disable SSO login
    # next route goes here
    # All Routes go above this line
  # Services section defines each service used above
  services:
    <service>-<server>:
      loadBalancer:
        servers:
          - url: "http://<ip address>:<port>"
        passHostHeader: true
    # next service goes here
```
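
To avoid typing the same entry by hand for every service, the placeholders can be filled in by a small script. The two helpers below are a hypothetical sketch (the names `render_router` and `render_service` are not part of the project); they print one router entry and one service entry, indented to sit under `http.routers` and `http.services` respectively.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: print the router and service entries for one remote service.
# render_router  <service> <server> <full domain>
# render_service <service> <server> <ip address> <port>
render_router() {
    local service="$1"
    local server="$2"
    local domain="$3"
    # Backticks are escaped so the here-document does not run them as commands
    cat <<EOF
    ${service}-${server}:
      rule: "Host(\`${service}.${domain}\`)"
      entryPoints:
        - websecure
      service: ${service}-${server}
      tls:
        certResolver: letsencrypt
      middlewares:
        - sablier-${server}-${service}@file
        - authelia@docker
EOF
}

render_service() {
    local service="$1"
    local server="$2"
    local ip="$3"
    local port="$4"
    cat <<EOF
    ${service}-${server}:
      loadBalancer:
        servers:
          - url: "http://${ip}:${port}"
        passHostHeader: true
EOF
}
```

For example, `render_router gitea jarvis my-subdomain.duckdns.org` prints an entry to paste under `routers:`, and `render_service gitea jarvis 192.168.4.11 3010` prints the matching entry for `services:`. Remove the `authelia@docker` line from the output for services that should skip SSO.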
## Step 2: Add middleware to Sablier config file
### In /opt/stacks/core/traefik/dynamic/sablier.yml
```yaml
http:
  middlewares:
    # Add a section under middlewares for each Route (Proxy Host)
    sablier-<server>-<service>:
      plugin:
        sablier:
          sablierUrl: http://sablier-service:10000
          group: <server>-<group name>
          sessionDuration: 2m # Increase this for convenience
          ignoreUserAgent: curl # Don't wake the service for a curl command
          dynamic:
            displayName: <service display name>
            theme: ghost # This can be changed
            show-details-by-default: true # May want to disable for production
    # Next middleware goes here
```
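
Because Steps 1 and 2 live in separate files, a route can end up referencing a middleware that was never defined. The sketch below is one way to cross-check the two files; `check_sablier_middlewares` is a hypothetical name, and the check assumes each middleware definition sits on its own line as `sablier-<server>-<service>:`.

```bash
#!/usr/bin/env bash
# Hypothetical check: every sablier middleware referenced by a route must be defined.
# $1 = external hosts file (Step 1), $2 = sablier middleware file (Step 2)
check_sablier_middlewares() {
    local routes_file="$1"
    local middlewares_file="$2"
    local name
    local missing=0

    # Collect names such as "sablier-jarvis-gitea" from "- sablier-jarvis-gitea@file"
    for name in $(grep -oE 'sablier-[A-Za-z0-9_-]+@file' "$routes_file" | sed 's/@file$//' | sort -u); do
        # A definition is the name followed by a colon, alone on its line
        if ! grep -qE "^[[:space:]]*${name}:[[:space:]]*$" "$middlewares_file"; then
            echo "undefined middleware: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A non-zero return status means at least one referenced middleware is missing from the second file; the missing names are written to standard error.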
## Step 3: Add labels to compose files on Remote Host
### On the Remote Server
Apply labels to the services in the compose files:
```yaml
labels:
  - sablier.enable=true
  - sablier.group=<server>-<group name>
  - sablier.start-on-demand=true
```
>**Note**:
Traefik & Authelia labels are not used in the compose file for Remote Hosts
## Step 4: Restart services
### On host system
```bash
docker restart traefik
docker restart sablier-service
```
### On the Remote Host
```bash
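cd /opt/stacks/<service>
# Recreate the container so the new labels are applied
docker compose down && docker compose up -d
# Stop it again so that Sablier can start it on demand
docker stop <service>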
```
## Setup Complete
Access your service at its proxy URL (`https://<service>.<full domain>`).

docs/README.md Normal file
@@ -0,0 +1,182 @@
# AI-Homelab Documentation
Welcome to the AI-Homelab documentation! This is your comprehensive guide to deploying and managing a production-ready homelab infrastructure with 70+ pre-configured services.
## 📚 Documentation Structure
### 🚀 Getting Started
- **[Quick Start Guide](getting-started.md)** - Step-by-step setup for new users
- **[Prerequisites & Requirements](getting-started.md#prerequisites)** - What you need before starting
- **[First Deployment](getting-started.md#simple-setup)** - Automated setup process
### 🏗️ Architecture & Design
- **[System Architecture](README.md#key-features)** - High-level overview of components
- **[Network Architecture](docker-guidelines.md#network-architecture)** - How services communicate
- **[Security Model](docker-guidelines.md#security-best-practices)** - Authentication, SSL, and access control
- **[Storage Strategy](docker-guidelines.md#volume-management)** - Data persistence and organization
### 💾 Backup & Recovery
- **[Backup Strategy](Restic-BackRest-Backup-Guide.md)** - Comprehensive Restic + Backrest guide (default strategy)
- **[Backrest Service](service-docs/backrest.md)** - Web UI for backup management
### 📦 Services & Stacks
#### Core Infrastructure (Deploy First)
Essential services that everything else depends on:
- **[DuckDNS](service-docs/duckdns.md)** - Dynamic DNS updates
- **[Traefik](service-docs/traefik.md)** - Reverse proxy & SSL termination
- **[Authelia](service-docs/authelia.md)** - Single Sign-On authentication
- **[Gluetun](service-docs/gluetun.md)** - VPN client for secure downloads
- **[Sablier](service-docs/sablier.md)** - Lazy loading service for on-demand containers
#### Management & Monitoring
- **[Dockge](service-docs/dockge.md)** - Primary stack management UI
- **[Homepage](service-docs/homepage.md)** - Service dashboard (AI-configurable)
- **[Homarr](service-docs/homarr.md)** - Alternative modern dashboard
- **[Dozzle](service-docs/dozzle.md)** - Real-time log viewer
- **[Glances](service-docs/glances.md)** - System monitoring
- **[Pi-hole](service-docs/pihole.md)** - DNS & ad blocking
#### Media Services
- **[Jellyfin](service-docs/jellyfin.md)** - Open-source media streaming
- **[Plex](service-docs/plex.md)** - Popular media server (alternative)
- **[qBittorrent](service-docs/qbittorrent.md)** - Torrent client (VPN-routed)
- **[Calibre-Web](service-docs/calibre-web.md)** - Ebook reader & server
#### Media Management (Arr Stack)
- **[Sonarr](service-docs/sonarr.md)** - TV show automation
- **[Radarr](service-docs/radarr.md)** - Movie automation
- **[Prowlarr](service-docs/prowlarr.md)** - Indexer management
- **[Readarr](service-docs/readarr.md)** - Ebook/audiobook automation
- **[Lidarr](service-docs/lidarr.md)** - Music library management
- **[Bazarr](service-docs/bazarr.md)** - Subtitle automation
- **[Jellyseerr](service-docs/jellyseerr.md)** - Media request interface
#### Home Automation
- **[Home Assistant](service-docs/home-assistant.md)** - Smart home platform
- **[Node-RED](service-docs/node-red.md)** - Flow-based programming
- **[Zigbee2MQTT](service-docs/zigbee2mqtt.md)** - Zigbee device integration
- **[ESPHome](service-docs/esphome.md)** - ESP device firmware
- **[TasmoAdmin](service-docs/tasmoadmin.md)** - Tasmota device management
- **[MotionEye](service-docs/motioneye.md)** - Video surveillance
#### Productivity & Collaboration
- **[Nextcloud](service-docs/nextcloud.md)** - Self-hosted cloud storage
- **[Gitea](service-docs/gitea.md)** - Git service (GitHub alternative)
- **[BookStack](service-docs/bookstack.md)** - Documentation/wiki platform
- **[WordPress](service-docs/wordpress.md)** - Blog/CMS platform
- **[MediaWiki](service-docs/mediawiki.md)** - Wiki platform
- **[DokuWiki](service-docs/dokuwiki.md)** - Simple wiki
- **[Excalidraw](service-docs/excalidraw.md)** - Collaborative drawing
#### Development Tools
- **[Code Server](service-docs/code-server.md)** - VS Code in the browser
- **[GitLab](service-docs/gitlab.md)** - Complete DevOps platform
- **[Jupyter](service-docs/jupyter.md)** - Interactive computing
- **[pgAdmin](service-docs/pgadmin.md)** - PostgreSQL administration
#### Monitoring & Observability
- **[Grafana](service-docs/grafana.md)** - Metrics visualization
- **[Prometheus](service-docs/prometheus.md)** - Metrics collection
- **[Uptime Kuma](service-docs/uptime-kuma.md)** - Uptime monitoring
- **[Loki](service-docs/loki.md)** - Log aggregation
- **[Promtail](service-docs/promtail.md)** - Log shipping
- **[Node Exporter](service-docs/node-exporter.md)** - System metrics
- **[cAdvisor](service-docs/cadvisor.md)** - Container metrics
#### Utilities & Tools
- **[Backrest](service-docs/backrest.md)** - Backup management (Restic-based, default)
- **[Duplicati](service-docs/duplicati.md)** - Alternative backup solution
- **[FreshRSS](service-docs/freshrss.md)** - RSS feed reader
- **[Wallabag](service-docs/wallabag.md)** - Read-it-later service
- **[Watchtower](service-docs/watchtower.md)** - Automatic updates
- **[Vaultwarden](service-docs/vaultwarden.md)** - Password manager
#### Alternative Services
Services that provide alternatives to the defaults:
- **[Portainer](service-docs/portainer.md)** - Alternative container management
- **[Authentik](service-docs/authentik.md)** - Alternative SSO with web UI
### 🛠️ Development & Operations
#### Docker & Container Management
- **[Docker Guidelines](docker-guidelines.md)** - Complete service management guide
- **[Service Creation](docker-guidelines.md#service-creation-guidelines)** - How to add new services
- **[Service Modification](docker-guidelines.md#service-modification-guidelines)** - Updating existing services
- **[Resource Limits](resource-limits-template.md)** - CPU/memory management
- **[Troubleshooting](docker-guidelines.md#troubleshooting)** - Common issues & fixes
#### External Service Integration
- **[Proxying External Hosts](proxying-external-hosts.md)** - Route non-Docker services through Traefik
- **[External Host Examples](proxying-external-hosts.md#common-external-services-to-proxy)** - Raspberry Pi, NAS, etc.
#### AI & Automation
- **[Copilot Instructions](.github/copilot-instructions.md)** - AI agent guidelines for this codebase
- **[AI Management Capabilities](.github/copilot-instructions.md#ai-management-capabilities)** - What the AI can help with
### 📋 Quick References
#### Commands & Operations
- **[Quick Reference](quick-reference.md)** - Essential commands and workflows
- **[Stack Management](quick-reference.md#service-management)** - Start/stop/restart services
- **[Deployment Scripts](quick-reference.md#deployment-scripts)** - Setup and deployment automation
#### Troubleshooting
- **[Common Issues](quick-reference.md#troubleshooting)** - SSL, networking, permissions
- **[Service Won't Start](quick-reference.md#service-wont-start)** - Debugging steps
- **[Traefik Routing](quick-reference.md#traefik-not-routing)** - Route configuration issues
- **[VPN Problems](quick-reference.md#vpn-not-working-gluetun)** - Gluetun troubleshooting
### 📖 Advanced Topics
#### SSL & Certificates
- **[Wildcard SSL Setup](getting-started.md#notes-about-ssl-certificates-from-letsencrypt-with-duckdns)** - How SSL certificates work
- **[Certificate Troubleshooting](getting-started.md#certificate-troubleshooting)** - SSL issues and fixes
- **[DNS Challenge Process](getting-started.md#dns-challenge-process)** - How domain validation works
#### Security & Access Control
- **[Authelia Configuration](service-docs/authelia.md)** - SSO setup and customization
- **[Bypass Rules](docker-guidelines.md#when-to-use-authelia-sso)** - When to skip authentication
- **[2FA Setup](getting-started.md#set-up-2fa-with-authelia)** - Two-factor authentication
#### Backup & Recovery
- **[Backup Strategies](service-docs/duplicati.md)** - Data protection approaches
- **[Service Backups](service-docs/backrest.md)** - Database backup solutions
- **[Configuration Backup](quick-reference.md#backup-commands)** - Config file preservation
### 🔧 Development & Contributing
#### Repository Structure
- **[File Organization](.github/copilot-instructions.md#file-structure-standards)** - How files are organized
- **[Service Documentation](service-docs/)** - Individual service guides
- **[Configuration Templates](config-templates/)** - Reusable configurations
- **[Scripts](scripts/)** - Automation and deployment tools
#### Development Workflow
- **[Adding Services](docker-guidelines.md#service-creation-guidelines)** - New service integration
- **[Testing Changes](.github/copilot-instructions.md#testing-changes)** - Validation procedures
- **[Resource Limits](resource-limits-template.md)** - Performance management
### 📚 Additional Resources
- **[GitHub Repository](https://github.com/kelinfoxy/AI-Homelab)** - Source code and issues
- **[Docker Hub](https://hub.docker.com)** - Container images
- **[Traefik Documentation](https://doc.traefik.io/traefik/)** - Official reverse proxy docs
- **[Authelia Documentation](https://www.authelia.com/)** - SSO documentation
- **[DuckDNS](https://www.duckdns.org/)** - Dynamic DNS service
---
## 🎯 Quick Navigation
**New to AI-Homelab?** → [Getting Started](getting-started.md)
**Need to add a service?** → [Service Creation Guide](docker-guidelines.md#service-creation-guidelines)
**Having issues?** → [Troubleshooting](quick-reference.md#troubleshooting)
**Want to contribute?** → [Development Workflow](docker-guidelines.md#service-creation-guidelines)
---
*This documentation is maintained by AI and community contributors. Last updated: January 20, 2026*


@@ -0,0 +1,196 @@
# Comprehensive Backup Guide: Restic + Backrest
This guide covers your configured setup with **Restic** (the backup engine) and **Backrest** (the web UI for managing Restic). It includes current info (as of January 2026), configurations, examples, and best practices. Your setup backs up Docker volumes, service configs, and databases with automated stop/start hooks.
## Overview
### What is Restic?
Restic (latest: v0.18.1) is a modern, open-source backup program written in Go. It provides:
- **Deduplication**: Efficiently stores only changed data.
- **Encryption**: All data is encrypted with AES-256.
- **Snapshots**: Point-in-time backups with metadata.
- **Cross-Platform**: Works on Linux, macOS, Windows.
- **Backends**: Supports local, SFTP, S3, etc.
- **Features**: Compression, locking, pruning, mounting snapshots.
Restic is command-line based; Backrest provides a web UI.
### What is Backrest?
Backrest (latest: v1.10.1) is a web-based UI for Restic, built with a Go backend and a React web frontend. It simplifies Restic management:
- **Web Interface**: Create repos, plans, and monitor backups.
- **Automation**: Scheduled backups, hooks (pre/post commands).
- **Integration**: Runs Restic under the hood.
- **Features**: Multi-repo support, retention policies, notifications.
Your Backrest instance is containerized, with access to Docker volumes and host paths.
## Your Current Setup
### Backrest Configuration
- **Container**: `garethgeorge/backrest:latest` on port 9898.
- **Mounts**:
  - `/var/lib/docker/volumes:/docker_volumes` (all Docker volumes).
  - `/opt/stacks:/opt/stacks` (service configs and data).
  - `/var/run/docker.sock` (for Docker commands in hooks).
- **Environment**:
  - `BACKREST_DATA=/data` (internal data).
  - `BACKREST_CONFIG=/config/config.json` (plans/repos).
  - `BACKREST_UI_CONFIG={"baseURL": "https://backrest.kelin-casa.duckdns.org"}` (UI base URL).
- **Repos**: `jarvis-restic-repo` (your main repo).
- **Plans**:
  - **jarvis-database-backup**: Backs up `_mysql` volumes with DB stop/start hooks.
  - **Other Plans**: For volumes/configs (e.g., individual paths like `/docker_volumes/gitea_data/_data`).
### Key Features in Use
- **Hooks**: Pre-backup stops DBs, post-backup starts them.
- **Retention**: Keep last 30 snapshots.
- **Schedule**: Daily backups.
- **Paths**: Selective (e.g., DB volumes, service data).
## Step-by-Step Guide
### 1. Accessing Backrest
- URL: `https://backrest.kelin-casa.duckdns.org`
- Auth: Via Authelia (as configured in your reverse proxy).
- UI Sections: Repos, Plans, Logs.
### 2. Managing Repos
Repos store backups. Yours is `jarvis-restic-repo`.
#### Create a New Repo
1. Go to **Repos** > **Add Repo**.
2. **Name**: `my-new-repo`.
3. **Storage**: Choose backend (e.g., Local: `/data/repos/my-new-repo`).
4. **Password**: Set a strong password (Restic encrypts with this).
5. **Initialize**: Backrest runs `restic init`.
#### Example Config (JSON)
```json
{
  "id": "jarvis-restic-repo",
  "uri": "/repos/jarvis-restic-repo",
  "password": "your-secure-password",
  "env": {},
  "flags": []
}
```
#### Best Practices
- Use strong passwords.
- For remote: Use SFTP/S3 for offsite backups.
- Test access: `restic list snapshots --repo /repos/repo-name`.
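The access test above can be wrapped in a small check that reports success or failure. This is a minimal sketch, assuming the repository password lives in a file the caller can read; `repo_is_reachable` is a hypothetical helper name, not something Backrest or Restic provides.

```bash
# Sketch only: repo_is_reachable is a hypothetical helper name.
# Assumes the repository password is stored in a file readable by the caller.
repo_is_reachable() {
  local repo="$1" password_file="$2"
  # "restic cat config" succeeds only if the repo exists and the password decrypts it.
  if restic --repo "$repo" --password-file "$password_file" cat config >/dev/null 2>&1; then
    echo "OK: $repo"
  else
    echo "FAILED: $repo"
    return 1
  fi
}
```

For example, `repo_is_reachable /repos/repo-name /path/to/password-file` (both paths are placeholders) could run before a scheduled backup to fail early.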
### 3. Creating Backup Plans
Plans define what/when/how to back up.
#### Your DB Plan Example
```json
{
  "id": "jarvis-database-backup",
  "repo": "jarvis-restic-repo",
  "paths": [
    "/docker_volumes/bookstack_mysql/_data",
    "/docker_volumes/gitea_mysql/_data",
    "/docker_volumes/mediawiki_mysql/_data",
    "/docker_volumes/nextcloud_mysql/_data",
    "/docker_volumes/formio_mysql/_data"
  ],
  "excludes": [],
  "iexcludes": [],
  "schedule": {
    "clock": "CLOCK_LOCAL",
    "maxFrequencyDays": 1
  },
  "backup_flags": [],
  "hooks": [
    {
      "actionCommand": {
        "command": "for vol in $(docker volume ls -q | grep '_mysql$'); do docker ps -q --filter volume=$vol | xargs -r docker stop || true; done"
      },
      "conditions": ["CONDITION_SNAPSHOT_START"],
      "onError": "ON_ERROR_CANCEL"
    },
    {
      "actionCommand": {
        "command": "for vol in $(docker volume ls -q | grep '_mysql$'); do docker ps -a -q --filter volume=$vol | xargs -r docker start || true; done"
      },
      "conditions": ["CONDITION_SNAPSHOT_END"],
      "onError": "ON_ERROR_RETRY_1MINUTE"
    }
  ],
  "retention": {
    "policyKeepLastN": 30
  }
}
```
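The two hook one-liners in the plan above are dense, so here is a more readable sketch of the same idea: find every volume whose name ends in `_mysql`, then stop (or start) the containers attached to it. The function names `containers_using_volume` and `toggle_db_containers` are hypothetical, and this version loops explicitly instead of piping to `xargs`.

```bash
#!/usr/bin/env bash
# Sketch only: a readable version of the stop/start hook one-liners.
# containers_using_volume and toggle_db_containers are hypothetical names.

# List containers that mount a volume: running ones by default, or all with "all".
containers_using_volume() {
  local volume="$1" scope="${2:-running}"
  if [ "$scope" = "all" ]; then
    docker ps -a -q --filter "volume=$volume"
  else
    docker ps -q --filter "volume=$volume"
  fi
}

# Stop or start every container attached to a volume whose name ends in the suffix.
toggle_db_containers() {
  local action="$1" suffix="${2:-_mysql}" scope="running"
  # Stopped containers only show up with "ps -a", so starting needs the "all" scope.
  [ "$action" = "start" ] && scope="all"
  local volume id
  for volume in $(docker volume ls -q | grep -- "${suffix}\$"); do
    for id in $(containers_using_volume "$volume" "$scope"); do
      docker "$action" "$id" || true   # keep going if one container fails
    done
  done
}
```

With this in place, the pre-backup hook would amount to `toggle_db_containers stop` and the post-backup hook to `toggle_db_containers start`. Whether you keep the one-liners or a script like this is a matter of taste; the one-liners have the advantage of living entirely inside the plan's JSON.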
#### Create a New Plan
1. Go to **Plans** > **Add Plan**.
2. **Repo**: Select `jarvis-restic-repo`.
3. **Paths**: Add directories (e.g., `/opt/stacks/homepage/config`).
4. **Schedule**: Set frequency (e.g., daily).
5. **Hooks**: Add pre/post commands (e.g., for non-DB backups).
6. **Retention**: Keep last N snapshots.
7. **Save & Run**: Test with **Run Backup Now**.
#### Example: Volume Backup Plan
```json
{
  "id": "volumes-backup",
  "repo": "jarvis-restic-repo",
  "paths": [
    "/docker_volumes/gitea_data/_data",
    "/docker_volumes/nextcloud_html/_data"
  ],
  "schedule": {"maxFrequencyDays": 1},
  "retention": {"policyKeepLastN": 14}
}
```
### 4. Running & Monitoring Backups
- **Manual**: Plans > Select plan > **Run Backup**.
- **Scheduled**: Runs automatically per schedule.
- **Logs**: Check **Logs** tab for output/errors.
- **Status**: View snapshots in repo details.
#### Restic Commands (via Backrest)
Backrest runs Restic under the hood. Examples:
- Backup: `restic backup /path --repo /repo`
- List: `restic snapshots --repo /repo`
- Restore: `restic restore latest --repo /repo --target /restore/path`
- Prune: `restic forget --keep-last 30 --prune --repo /repo` (without `--prune`, `forget` only removes snapshot records and frees no space)
### 5. Restoring Data
1. Go to **Repos** > Select repo > **Snapshots**.
2. Choose snapshot > **Restore**.
3. Select paths/files > Target directory.
4. Run restore.
#### Example Restore Command
```bash
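# --path picks the latest snapshot that contains this path; add --include to restore only part of it
restic restore latest --repo /repos/jarvis-restic-repo --target /tmp/restore --path /docker_volumes/bookstack_mysql/_data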
```
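Restoring into a scratch directory first makes it easier to inspect the data before copying anything back into place. A minimal sketch, assuming the repository password is supplied through restic's `RESTIC_PASSWORD_FILE` (or `RESTIC_PASSWORD`) environment variable; `restore_latest_to_tmp` is a hypothetical helper name.

```bash
# Sketch only: restore_latest_to_tmp is a hypothetical helper.
# Assumes RESTIC_PASSWORD_FILE (or RESTIC_PASSWORD) is already exported.
restore_latest_to_tmp() {
  local repo="$1" path="$2" target
  target=$(mktemp -d) || return 1
  # Send restic's own output to stderr so stdout carries only the directory name.
  restic --repo "$repo" restore latest --path "$path" --target "$target" >&2 || return 1
  echo "$target"
}
```

A call such as `dir=$(restore_latest_to_tmp /repos/jarvis-restic-repo /docker_volumes/bookstack_mysql/_data)` would leave the restored files under `$dir` for inspection.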
### 6. Advanced Features
- **Excludes**: Add glob patterns (e.g., `*.log`) to skip files.
- **Compression**: Enable in backup flags: `--compression max`.
- **Notifications**: Configure webhooks in Backrest settings.
- **Mounting**: `restic mount --repo /repo /mnt/backup` to browse snapshots.
- **Forget/Prune**: Auto-managed via retention, or manual: `restic forget --keep-daily 7 --prune --repo /repo`.
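Because a manual forget/prune deletes data, it can help to preview the effect first. The sketch below wraps restic's `--dry-run` and `--prune` flags; `apply_retention` and its `preview`/`apply` modes are hypothetical names chosen for illustration.

```bash
# Sketch only: apply_retention is a hypothetical wrapper around "restic forget".
# Default mode is a preview; pass "apply" as the third argument to delete data.
apply_retention() {
  local repo="$1" keep_daily="$2" mode="${3:-preview}"
  if [ "$mode" = "apply" ]; then
    # Remove snapshots outside the policy and reclaim their space.
    restic --repo "$repo" forget --keep-daily "$keep_daily" --prune
  else
    # Only report which snapshots would be kept or removed.
    restic --repo "$repo" forget --keep-daily "$keep_daily" --dry-run
  fi
}
```

For example, `apply_retention /repo 7` previews a 7-day policy, and `apply_retention /repo 7 apply` carries it out.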
### 7. Best Practices
- **Security**: Use strong repo passwords. Limit Backrest access.
- **Testing**: Regularly test restores.
- **Retention**: Balance storage (e.g., keep 30 daily).
- **Monitoring**: Check logs for failures.
- **Offsite**: Add a remote repo for disaster recovery.
- **Performance**: Schedule during low-usage times.
- **Updates**: Keep Restic/Backrest updated (current versions noted).
### 8. Troubleshooting
- **Hook Failures**: Check that the Docker socket is accessible and that the Docker CLI is available inside the Backrest container.
- **Permissions**: Bind mounts may need `chown` to match container user.
- **Space**: Monitor repo size; prune old snapshots.
- **Errors**: Logs show Restic output; search for "exit status" codes.
This covers your setup. For more, visit [Restic Docs](https://restic.net/) and [Backrest GitHub](https://github.com/garethgeorge/backrest).

docs/core-stack-readme.md Normal file

@@ -0,0 +1,60 @@
# Core Stack
## The core stack contains the critical infrastructure for a homelab.
>There are always other options.
I chose these for ease of use and file-based configuration (so AI can do it all).
>For a multi-server homelab, only install core services on one server.
* DuckDNS with LetsEncrypt - Free subdomains and wildcard SSL certificates
* Authelia - Single Sign-On (SSO) authentication with optional 2-Factor Authentication
* Traefik - Proxy host routing to access your services through your DuckDNS URL
* Sablier - Both a container and a Traefik plugin; enables on-demand services (saving resources and power)
## DuckDNS with LetsEncrypt
Get your free subdomain at http://www.duckdns.org
>Copy your DuckDNS token; you'll need to add it to the .env file.
DuckDNS service will keep your subdomain pointing to your IP address, even if it changes.
LetsEncrypt service will generate 2 certificates, one for your duckdns subdomain, and one wildcard for the same domain.
The wildcard certificate is used by all the services. No need for an individual certificate per service.
Certificates renew automatically. This is a set-it-and-forget-it service: no web UI, and no changes needed when adding a new service.
>It can take 2-5 minutes for the certificates to be generated and applied. Until then, services will use a self-signed certificate and you will need to accept the browser warnings. Once the certificates are applied, the warnings go away.
## Authelia
Provides Single Sign On (SSO) Authentication for your services.
>You may not want some services behind SSO, such as Jellyfin/Plex if you watch your media through an app on a smart TV, Fire Stick, or phone.
* Optional 2-Factor Authentication
* Easy to enable/disable - To disable, comment out 1 line in the compose file, then bring the stack down and back up (a restart is not enough)
* Configured on a per-service basis
* A public service (like WordPress) can bypass the Authelia login and handle its own authentication
* Admin services (like Dockge/Portainer) can use 2-Factor Authentication for an extra layer of security
## Traefik
Provides proxy routing for your services so you can access them at a URL like wordpress.my-subdomain.duckdns.org
For services on a Remote Host, create a file in the traefik/dynamic folder with a name like my-server-external-host.yml
Create 1 file per remote host.
## Sablier
Provides on-demand container management (start/pause/stop)
>Important: The stack must be up and the service stopped for Sablier to work.
Tip: If you have a lot of services, use a script to get the services to that state.
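As an illustration of the tip above, the following sketch brings a stack up and then stops only the services meant to be started on demand. `prepare_ondemand_stack` is a hypothetical helper written for this example; it is not the repository's own script, and the stack directory and service names you pass are your own.

```bash
# Sketch only: prepare_ondemand_stack is a hypothetical helper.
# Usage: prepare_ondemand_stack <stack-dir> <service> [<service> ...]
prepare_ondemand_stack() {
  local stack_dir="$1"; shift
  # Bring the whole stack up so every container and network exists.
  ( cd "$stack_dir" && docker compose up -d ) || return 1
  # Stop only the lazily loaded services; their containers remain defined.
  local service
  for service in "$@"; do
    ( cd "$stack_dir" && docker compose stop "$service" ) || return 1
  done
}
```

For example, `prepare_ondemand_stack /opt/stacks/media jellyfin sonarr` (stack path and service names are placeholders) would leave the stack up with those two services stopped.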
If a service is down and anything requests the URL for that service, Sablier will start the service and redirect after a short delay (however long the service takes to come up), usually less than 20 seconds.
Once the set inactivity period has elapsed, Sablier will pause or stop the container according to the settings.
This saves resources and electricity, allowing you to have more services installed, configured, and ready to use even if the server is incapable of running all of them simultaneously. Great for single-board computers, mini PCs, low-resource systems, and your power bill.


@@ -100,7 +100,7 @@ AI will suggest `/mnt/` when data may exceed 50GB or grow continuously.
## Traefik and Authelia Integration
### Every Service Needs Traefik Labels
### Every Local (on the same server) Service Needs Traefik Labels
Standard pattern for all services:
@@ -125,13 +125,25 @@ services:
# Enable Let's Encrypt
- "traefik.http.routers.myservice.tls.certresolver=letsencrypt"
# Add Authelia SSO (if needed)
# Add Authelia SSO (if needed) - comment out to disable SSO
- "traefik.http.routers.myservice.middlewares=authelia@docker"
# Specify port (if not default 80)
- "traefik.http.services.myservice.loadbalancer.server.port=8080"
# Optional: Sablier lazy loading (comment out to disable)
# - "sablier.enable=true"
# - "sablier.group=core-myservice"
# - "sablier.start-on-demand=true"
```
### If Traefik is on a Remote Server, configure routes & services on the Remote Server
Add a YAML file to the traefik/dynamic folder for each remote server.
Add a section under `routers:` and a section under `services:` for each service.
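To make the shape of such a file concrete, the sketch below prints one router and one service entry in Traefik's file-provider format. `render_remote_route` is a hypothetical helper, the entry point name `websecure` is an assumption about your Traefik static configuration, and `letsencrypt` is the certificate resolver name used in the labels above.

```bash
# Sketch only: render_remote_route is a hypothetical helper.
# Assumptions: an entry point named "websecure" and a resolver named "letsencrypt".
render_remote_route() {
  local name="$1" domain="$2" backend_url="$3"
  printf 'http:\n'
  printf '  routers:\n'
  printf '    %s:\n' "$name"
  printf '      rule: "Host(`%s.%s`)"\n' "$name" "$domain"
  printf '      entryPoints:\n'
  printf '        - websecure\n'
  printf '      service: %s\n' "$name"
  printf '      tls:\n'
  printf '        certResolver: letsencrypt\n'
  printf '  services:\n'
  printf '    %s:\n' "$name"
  printf '      loadBalancer:\n'
  printf '        servers:\n'
  printf '          - url: "%s"\n' "$backend_url"
}
```

Running `render_remote_route myservice example.duckdns.org http://192.168.1.50:8080` (all three values are placeholders) prints a single-service file; for a host with several services, the extra entries would go under the same `routers:` and `services:` keys rather than in separate files.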
### When to Use Authelia SSO
**Protect with Authelia**:
@@ -414,6 +426,46 @@ mv docker-compose/service.yml.backup docker-compose/service.yml
docker compose -f docker-compose/service.yml up -d
```
### Common Modifications
**Toggle SSO**: Comment/uncomment the Authelia middleware label:
```yaml
# Enable SSO
- "traefik.http.routers.service.middlewares=authelia@docker"
# Disable SSO (comment out)
# - "traefik.http.routers.service.middlewares=authelia@docker"
```
**Toggle Lazy Loading**: Comment/uncomment Sablier labels:
```yaml
# Enable lazy loading
- "sablier.enable=true"
- "sablier.group=core-service"
- "sablier.start-on-demand=true"
# Disable lazy loading (comment out)
# - "sablier.enable=true"
# - "sablier.group=core-service"
# - "sablier.start-on-demand=true"
```
**Change Port**: Update the loadbalancer server port:
```yaml
- "traefik.http.services.service.loadbalancer.server.port=8080"
```
**Add VPN Routing**: Change network mode and update Gluetun ports:
```yaml
network_mode: "service:gluetun"
# Add port mapping in Gluetun service
```
**Update Subdomain**: Modify the Host rule:
```yaml
- "traefik.http.routers.service.rule=Host(`newservice.${DOMAIN}`)"
```
## Naming Conventions
### Service Names


@@ -1,17 +1,17 @@
# Getting Started Guide
Welcome to your AI-powered homelab! This guide will walk you through setting up your production-ready infrastructure with Dockge, Traefik, Authelia, and 60+ services.
Welcome to your AI-powered homelab! This guide will walk you through setting up your production-ready infrastructure with Dockge, Traefik, Authelia, and 70+ services.
## Getting Started Checklist
- [ ] Clone this repository to your home folder
- [ ] Configure `.env` file with your domain and tokens
- [ ] Run setup script (generates Authelia secrets and admin user)
- [ ] Configure `.env` file with your domain and tokens ([see prerequisites](#prerequisites))
- [ ] Run setup script (generates Authelia secrets and admin user) ([setup-homelab.sh](../scripts/setup-homelab.sh))
- [ ] Log out and back in for Docker group permissions
- [ ] Run deployment script (deploys all core, infrastructure & dashboard services)
- [ ] Access Dockge web UI
- [ ] Set up 2FA with Authelia
- [ ] (optional) Deploy additional stacks as needed via Dockge
- [ ] Configure and use VS Code with Github Copilot to manage the server
- [ ] Run deployment script (deploys all core, infrastructure & dashboard services) ([deploy-homelab.sh](../scripts/deploy-homelab.sh))
- [ ] Access Dockge web UI ([https://dockge.yourdomain.duckdns.org](https://dockge.yourdomain.duckdns.org))
- [ ] Set up 2FA with Authelia ([Authelia setup guide](service-docs/authelia.md))
- [ ] (optional) Deploy additional stacks as needed via Dockge ([services overview](services-overview.md))
- [ ] Configure and use VS Code with GitHub Copilot to manage the server ([AI management](.github/copilot-instructions.md))
## Quick Setup (Recommended)
@@ -35,7 +35,7 @@ For most users, the automated setup script handles everything:
3. **Clone the repo**:
```bash
git clone https://github.com/kelinfoxy/AI-Homelab.git
cd AI-Homeb
cd AI-Homelab
4. **Configure environment**:
```bash
cp .env.example .env
@@ -363,7 +363,7 @@ docker compose up -d --build service-name
## Next Steps
1. **Explore services** through Dockge
2. **Configure backups** with Backrest/Duplicati
2. **Set up backups** with Backrest (default Restic-based solution)
3. **Set up monitoring** with Grafana/Prometheus
4. **Add external services** via Traefik proxying
5. **Use AI assistance** for custom configurations


@@ -29,6 +29,7 @@ For detailed information about the deployment scripts, their features, and usage
- `setup-homelab.sh` - First-run system setup and Authelia configuration
- `deploy-homelab.sh` - Deploy all core services and prepare additional stacks
- `reset-test-environment.sh` - Testing/development only - removes all deployed services
- `reset-ondemand-services.sh` - Reload services for Sablier lazy loading
## Common Commands
@@ -182,6 +183,7 @@ docker image prune -a
### Core Infrastructure (core.yml)
- **80/443**: Traefik (reverse proxy)
- **8080**: Traefik dashboard
- **10000**: Sablier (lazy loading service)
### Infrastructure Services (infrastructure.yml)
- **5001**: Dockge (stack manager)
@@ -304,16 +306,17 @@ After deployment, access services at:
Core Infrastructure:
https://traefik.${DOMAIN} - Traefik dashboard
https://auth.${DOMAIN} - Authelia login
http://sablier.${DOMAIN}:10000 - Sablier lazy loading (internal)
Infrastructure:
https://dockge.${DOMAIN} - Stack manager (PRIMARY)
https://portainer.${DOMAIN} - Docker UI (secondary)
http://${SERVER_IP}:8082 - Pi-hole admin
http://pihole.${DOMAIN} - Pi-hole admin
https://dozzle.${DOMAIN} - Log viewer
https://glances.${DOMAIN} - System monitor
Dashboards:
https://home.${DOMAIN} - Homepage dashboard
https://homepage.${DOMAIN} - Homepage dashboard
https://homarr.${DOMAIN} - Homarr dashboard
Media:
@@ -322,17 +325,19 @@ https://jellyfin.${DOMAIN} - Jellyfin (no auth)
https://sonarr.${DOMAIN} - TV automation
https://radarr.${DOMAIN} - Movie automation
https://prowlarr.${DOMAIN} - Indexer manager
https://qbit.${DOMAIN} - Torrent client
https://torrents.${DOMAIN} - Torrent client
Productivity:
https://nextcloud.${DOMAIN} - File sync
https://git.${DOMAIN} - Git service
https://docs.${DOMAIN} - Documentation
https://nextcloud.${DOMAIN} - Cloud Storage
https://gitea.${DOMAIN} - Gitea
Monitoring:
https://grafana.${DOMAIN} - Metrics dashboard
https://prometheus.${DOMAIN} - Metrics collection
https://status.${DOMAIN} - Uptime monitoring
Utilities:
https://backrest.${DOMAIN} - Backup management (Restic)
```
## Troubleshooting


@@ -588,3 +588,15 @@ Authelia is your homelab's security layer. It:
- Integrates seamlessly with Traefik
Setting up Authelia properly is one of the most important security steps for your homelab. Take time to understand access control rules and test your configuration thoroughly. Always keep the `users_database.yml` and `db.sqlite3` backed up, as they contain critical authentication data.
## Related Services
- **[Traefik](traefik.md)** - Reverse proxy that integrates with Authelia for authentication
- **[DuckDNS](duckdns.md)** - Dynamic DNS required for SSL certificates
- **[Sablier](sablier.md)** - Lazy loading that can work with Authelia-protected services
## See Also
- **[SSO Bypass Guide](../docker-guidelines.md#when-to-use-authelia-sso)** - When to disable authentication for services
- **[2FA Setup](../getting-started.md#set-up-2fa-with-authelia)** - Enable two-factor authentication
- **[Access Control Rules](../docker-guidelines.md#authelia-bypass-example-jellyfin)** - Configure bypass rules for specific services


@@ -1,151 +1,244 @@
# Backrest - Backup Solution
# Backrest - Comprehensive Backup Solution
## Table of Contents
- [Overview](#overview)
- [What is Backrest?](#what-is-backrest)
- [Why Use Backrest?](#why-use-backrest)
- [Configuration in AI-Homelab](#configuration-in-ai-homelab)
- [Official Resources](#official-resources)
- [Docker Configuration](#docker-configuration)
**Category:** Backup & Recovery
**Description:** Backrest is a web-based UI for Restic, providing scheduled backups, retention policies, and a beautiful interface for managing backups across multiple repositories and destinations. It serves as the default backup strategy for AI-Homelab.
**Docker Image:** `garethgeorge/backrest:latest`
**Documentation:** [Backrest GitHub](https://github.com/garethgeorge/backrest)
## Overview
**Category:** Backup & Recovery
**Docker Image:** [garethgeorge/backrest](https://hub.docker.com/r/garethgeorge/backrest)
**Default Stack:** `utilities.yml`
**Web UI:** `http://SERVER_IP:9898`
**Backend:** Restic
**Ports:** 9898
### What is Backrest?
Backrest (latest: v1.10.1) is a web-based UI for Restic, built with a Go backend and a React web frontend. It simplifies Restic management:
- **Web Interface**: Create repos, plans, and monitor backups.
- **Automation**: Scheduled backups, hooks (pre/post commands).
- **Integration**: Runs Restic under the hood.
- **Features**: Multi-repo support, retention policies, notifications.
## What is Backrest?
### What is Restic?
Restic (latest: v0.18.1) is a modern, open-source backup program written in Go. It provides:
- **Deduplication**: Efficiently stores only changed data.
- **Encryption**: All data is encrypted with AES-256.
- **Snapshots**: Point-in-time backups with metadata.
- **Cross-Platform**: Works on Linux, macOS, Windows.
- **Backends**: Supports local, SFTP, S3, etc.
- **Features**: Compression, locking, pruning, mounting snapshots.
Backrest is a web UI and orchestration layer for Restic, a powerful backup tool. It provides scheduled backups, retention policies, and a beautiful interface for managing backups across multiple repositories and destinations. Think of it as a user-friendly wrapper around Restic's power.
## Configuration
### Key Features
- **Web Interface:** Manage backups visually
- **Multiple Repos:** Backup to different locations
- **Schedules:** Cron-based automatic backups
- **Retention Policies:** Keep last N backups
- **Compression:** Automatic compression
- **Deduplication:** Save storage space
- **Encryption:** AES-256 encryption
- **Destinations:** Local, S3, B2, SFTP, WebDAV
- **Notifications:** Alerts on failure
- **Browse & Restore:** Visual file restoration
### Environment Variables
## Why Use Backrest?
| Variable | Description | Default |
|----------|-------------|---------|
| `BACKREST_DATA` | Internal data directory | `/data` |
| `BACKREST_CONFIG` | Configuration file path | `/config/config.json` |
| `BACKREST_UI_CONFIG` | UI configuration JSON | `{"baseURL": "https://backrest.${DOMAIN}"}` |
1. **Easy Backups:** Simple web interface
2. **Restic Power:** Proven backup engine
3. **Automated:** Set and forget
4. **Multiple Destinations:** Flexibility
5. **Encrypted:** Secure backups
6. **Deduplicated:** Efficient storage
7. **Free & Open Source:** No cost
### Ports
## Configuration in AI-Homelab
- **9898** - Web UI port
```
/opt/stacks/utilities/backrest/data/ # Backrest config
/opt/stacks/utilities/backrest/cache/ # Restic cache
### Volumes
- `./data:/data` - Backrest internal data and repositories
- `./config:/config` - Configuration files
- `./cache:/cache` - Restic cache for performance
- `/var/lib/docker/volumes:/docker_volumes:ro` - Access to Docker volumes
- `/opt/stacks:/opt/stacks:ro` - Access to service configurations
- `/var/run/docker.sock:/var/run/docker.sock` - Docker API access for hooks
## Usage
### Accessing Backrest
- **URL**: `https://backrest.${DOMAIN}`
- **Authentication**: Via Authelia SSO
- **UI Sections**: Repos, Plans, Logs
### Managing Repositories
Repositories store your backups. Create one for your main backup location.
#### Create Repository
1. Go to **Repos** → **Add Repo**
2. **Name**: `main-backup-repo`
3. **Storage**: Choose backend (Local, SFTP, S3, etc.)
4. **Password**: Set strong encryption password
5. **Initialize**: Backrest runs `restic init`
### Creating Backup Plans
Plans define what, when, and how to back up.
#### Database Backup Plan (Recommended)
```json
{
  "id": "database-backup",
  "repo": "main-backup-repo",
  "paths": [
    "/docker_volumes/*_mysql/_data",
    "/docker_volumes/*_postgres/_data"
  ],
  "schedule": {
    "maxFrequencyDays": 1
  },
  "hooks": [
    {
      "actionCommand": {
        "command": "for vol in $(docker volume ls -q | grep '_mysql$'); do docker ps -q --filter volume=$vol | xargs -r docker stop || true; done"
      },
      "conditions": ["CONDITION_SNAPSHOT_START"]
    },
    {
      "actionCommand": {
        "command": "for vol in $(docker volume ls -q | grep '_mysql$'); do docker ps -a -q --filter volume=$vol | xargs -r docker start || true; done"
      },
      "conditions": ["CONDITION_SNAPSHOT_END"]
    }
  ],
  "retention": {
    "policyKeepLastN": 30
  }
}
```
## Official Resources
#### Service Configuration Backup Plan
```json
{
  "id": "config-backup",
  "repo": "main-backup-repo",
  "paths": [
    "/opt/stacks"
  ],
  "excludes": [
    "**/cache",
    "**/tmp",
    "**/log"
  ],
  "schedule": {
    "maxFrequencyDays": 1
  },
  "retention": {
    "policyKeepLastN": 14
  }
}
```
- **GitHub:** https://github.com/garethgeorge/backrest
- **Restic:** https://restic.net
### Running Backups
- **Manual**: Plans → Select plan → **Run Backup Now**
- **Scheduled**: Runs automatically per plan schedule
- **Monitor**: Check **Logs** tab for status and errors
## Docker Configuration
### Restoring Data
1. Go to **Repos** → Select repo → **Snapshots**
2. Choose snapshot → **Restore**
3. Select paths/files → Set target directory
4. Run restore operation
## Best Practices
### Security
- Use strong repository passwords
- Limit Backrest UI access via Authelia
- Store passwords securely (not in config files)
### Performance
- Schedule backups during low-usage hours
- Use compression for large backups
- Monitor repository size growth
### Retention
- Keep 30 daily snapshots for critical data
- Keep 14 snapshots for configurations
- Regularly prune old snapshots
### Testing
- Test restore procedures regularly
- Verify backup integrity
- Document restore processes
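
One way to make "verify backup integrity" concrete is a periodic `restic check` against the repository. The sketch below is an assumption about how that could be scripted, not something Backrest ships: `verify_repo` is a hypothetical helper, and the default 5% data subset is an arbitrary starting point.

```shell
#!/usr/bin/env bash
# Sketch: verify repository structure plus a sample of the stored data,
# then show the most recent snapshot so its age can be confirmed.
# verify_repo and its default subset size are illustrative assumptions.

verify_repo() {
  repo="$1"
  subset="${2:-5%}"  # share of pack data to read back and verify

  restic -r "$repo" check --read-data-subset="$subset" || return 1
  restic -r "$repo" snapshots --latest 1
}
```

Reading only a subset keeps the check fast on large repositories; passing `100%` reads everything at the cost of time and bandwidth.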
## Integration with AI-Homelab
### Homepage Dashboard
Add Backrest to your Homepage dashboard:
```yaml
backrest:
  image: garethgeorge/backrest:latest
  container_name: backrest
  restart: unless-stopped
  networks:
    - traefik-network
  ports:
    - "9898:9898"
  environment:
    - BACKREST_DATA=/data
    - BACKREST_CONFIG=/config/config.json
    - XDG_CACHE_HOME=/cache
  volumes:
    - /opt/stacks/utilities/backrest/data:/data
    - /opt/stacks/utilities/backrest/config:/config
    - /opt/stacks/utilities/backrest/cache:/cache
    - /opt/stacks:/backup-source:ro  # What to backup
  labels:
    - "traefik.enable=true"
    - "traefik.http.routers.backrest.rule=Host(`backrest.${DOMAIN}`)"
# In homepage/services.yaml
- Infrastructure:
    - Backrest:
        icon: backup.png
        href: https://backrest.${DOMAIN}
        description: Backup management
        widget:
          type: iframe
          url: https://backrest.${DOMAIN}
```
## Setup
### Monitoring
Monitor backup success with Uptime Kuma or Grafana alerts.
1. **Start Container:**
## Troubleshooting
### Common Issues
**Backup Failures**
- Check repository access and credentials
- Verify source paths exist and are readable
- Review hook commands for syntax errors
**Hook Issues**
- Ensure Docker socket is accessible
- Check that containers can be stopped/started
- Verify hook commands work manually
**Performance Problems**
- Check available disk space
- Monitor CPU/memory usage during backups
- Consider excluding large, frequently changing files
**Restore Issues**
- Ensure target directory exists and is writable
- Check file permissions
- Verify snapshot integrity
## Advanced Features
### Multiple Repositories
- **Local**: For fast, local backups
- **Remote**: SFTP/S3 for offsite storage
- **Hybrid**: Local for speed, remote for safety
### Custom Hooks
```bash
docker compose up -d backrest
# Pre-backup: Stop services
docker compose -f /opt/stacks/core/docker-compose.yml stop
# Post-backup: Start services
docker compose -f /opt/stacks/core/docker-compose.yml start
```
2. **Access UI:** `http://SERVER_IP:9898`
### Notifications
Configure webhooks in Backrest settings for backup status alerts.
3. **Create Repository:**
- Add Repository
- Location: Local, S3, B2, etc.
- Encryption password
- Initialize repository
## Migration from Other Solutions
4. **Create Plan:**
- Add backup plan
- Source: `/backup-source` (mounted volume)
- Repository: Select created repo
- Schedule: `0 2 * * *` (2 AM daily)
- Retention: Keep last 7 daily, 4 weekly, 12 monthly
### From Duplicati
1. Export Duplicati configurations
2. Create equivalent Backrest plans
3. Test backups and restores
4. Decommission Duplicati
5. **Run Backup:**
- Manual: Click "Backup Now"
- Or wait for schedule
### From Manual Scripts
1. Identify current backup sources and schedules
2. Create Backrest plans with same parameters
3. Add appropriate hooks for service management
4. Test and validate
6. **Restore:**
- Browse backups
- Select snapshot
- Browse files
- Restore to location
## Related Documentation
## Summary
- **[Backup Strategy Guide](../Restic-BackRest-Backup-Guide.md)** - Comprehensive setup and usage guide
- **[Docker Guidelines](../docker-guidelines.md)** - Volume management and persistence
- **[Quick Reference](../quick-reference.md)** - Command reference and troubleshooting
Backrest provides web-based backup management offering:
- Visual Restic interface
- Scheduled automated backups
- Multiple backup destinations
- Retention policies
- Encryption and deduplication
- Easy restore
- Free and open-source
---
**Perfect for:**
- Homelab backups
- Docker volume backups
- Off-site backup management
- Automated backup schedules
- Visual backup management
**Key Points:**
- Built on Restic
- Web interface
- Supports many backends
- Encryption by default
- Deduplication saves space
- Cron-based scheduling
- Easy restore interface
**Remember:**
- Mount volumes to backup
- Set retention policies
- Test restores regularly
- Off-site backup recommended
- Keep repository password safe
- Monitor backup success
- Schedule during low usage
Backrest makes Restic backups manageable!
**Backrest provides enterprise-grade backup capabilities with an intuitive web interface, making it the perfect default backup solution for AI-Homelab.**

@@ -0,0 +1,532 @@
# Sablier - Lazy Loading Service
## Table of Contents
- [Overview](#overview)
- [What is Sablier?](#what-is-sablier)
- [Why Use Sablier?](#why-use-sablier)
- [How It Works](#how-it-works)
- [Configuration in AI-Homelab](#configuration-in-ai-homelab)
- [Official Resources](#official-resources)
- [Educational Resources](#educational-resources)
- [Docker Configuration](#docker-configuration)
- [Using Sablier](#using-sablier)
- [Advanced Topics](#advanced-topics)
- [Troubleshooting](#troubleshooting)
## Overview
**Category:** Core Infrastructure
**Docker Image:** [sablierapp/sablier](https://hub.docker.com/r/sablierapp/sablier)
**Default Stack:** `core.yml`
**Web UI:** No web UI (API only)
**Authentication:** None required
**Purpose:** On-demand container startup and resource management
## What is Sablier?
Sablier is a lightweight service that enables lazy loading for Docker containers. It automatically starts containers when they're accessed through Traefik and stops them after a period of inactivity, helping to conserve system resources and reduce power consumption.
### Key Features
- **On-Demand Startup:** Containers start automatically when accessed
- **Automatic Shutdown:** Containers stop after configurable inactivity periods
- **Traefik Integration:** Works seamlessly with Traefik reverse proxy
- **Resource Conservation:** Reduces memory and CPU usage for unused services
- **Group Management:** Related services can be managed as groups
- **Health Checks:** Waits for services to be ready before forwarding traffic
- **Minimal Overhead:** Lightweight with low resource requirements
## Why Use Sablier?
1. **Resource Efficiency:** Save memory and CPU by only running services when needed
2. **Power Savings:** Reduce power consumption on always-on systems
3. **Faster Boot:** The host finishes booting sooner because idle services are not started until they are first accessed
4. **Scalability:** Handle more services than would fit in memory simultaneously
5. **Cost Effective:** Lower resource requirements mean smaller/fewer servers needed
6. **Environmental:** Reduce energy consumption and carbon footprint
## How It Works
```
User Request → Traefik → Sablier Check → Container Start → Health Check → Forward Traffic
Container Stop (after timeout)
```
When a request comes in for a service with Sablier enabled:
1. **Route Detection:** The Sablier middleware attached to a Traefik route intercepts requests for configured services
2. **Container Check:** Verifies if the target container is running
3. **Startup Process:** If not running, starts the container via Docker API
4. **Health Verification:** Waits for the service to report healthy
5. **Traffic Forwarding:** Routes traffic to the now-running service
6. **Timeout Monitoring:** Tracks inactivity and stops containers after timeout
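
Steps 2 to 4 above can be imitated by hand with the Docker CLI, which is useful for checking that a container is able to start and become healthy before handing it over to Sablier. This is only a sketch of the sequence, not how Sablier is implemented internally; `ensure_running` and its retry count are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: start a container if needed, then wait until it reports healthy.
# Containers without a healthcheck are treated as ready once started.
# ensure_running is an illustrative helper, not a Sablier command.

ensure_running() {
  name="$1"
  tries="${2:-30}"  # number of one-second polls before giving up

  # Step 2/3: check the running state and start the container if required.
  if [ "$(docker inspect -f '{{.State.Running}}' "$name" 2>/dev/null)" != "true" ]; then
    docker start "$name" >/dev/null || return 1
  fi

  # Step 4: poll the health status until healthy (or no healthcheck defined).
  while [ "$tries" -gt 0 ]; do
    status="$(docker inspect -f '{{if .State.Health}}{{.State.Health.Status}}{{else}}none{{end}}' "$name")"
    case "$status" in
      healthy|none) return 0 ;;
    esac
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}
```

Running `ensure_running myservice 60` and timing it gives a rough idea of the delay a user will see on first access.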
## Configuration in AI-Homelab
Sablier is deployed as part of the core infrastructure stack and requires no additional configuration for basic operation. It automatically discovers services with the appropriate labels.
### Service Integration
Add these labels to any service that should use lazy loading:
```yaml
services:
  myservice:
    # ... other configuration ...
    labels:
      - "sablier.enable=true"
      - "sablier.group=core-myservice"  # Optional: group related services
      - "traefik.enable=true"
      - "traefik.http.routers.myservice.rule=Host(`myservice.${DOMAIN}`)"
      # ... other Traefik labels ...
```
### Advanced Configuration
For services requiring custom timeouts or group management:
```yaml
labels:
- "sablier.enable=true"
- "sablier.group=media-services" # Group name for related services
- "sablier.timeout=300" # 5 minutes inactivity timeout (default: 300)
- "sablier.theme=dark" # Optional: theme for Sablier UI (if used)
```
## Official Resources
- **GitHub Repository:** https://github.com/sablierapp/sablier
- **Docker Hub:** https://hub.docker.com/r/sablierapp/sablier
- **Documentation:** https://sablierapp.github.io/sablier/
## Educational Resources
- **Traefik Integration:** https://doc.traefik.io/traefik/middlewares/http/forwardauth/
- **Docker Lazy Loading:** Search for "docker lazy loading" or "container on-demand"
- **Resource Management:** Linux container resource management best practices
## Docker Configuration
### Environment Variables
| Variable | Description | Default | Required |
|----------|-------------|---------|----------|
| `SABLIER_PROVIDER` | Container runtime provider | `docker` | Yes |
| `SABLIER_DOCKER_API_VERSION` | Docker API version | `1.53` | No |
| `SABLIER_DOCKER_NETWORK` | Docker network for containers | `traefik-network` | Yes |
| `SABLIER_LOG_LEVEL` | Logging level (debug, info, warn, error) | `debug` | No |
| `DOCKER_HOST` | Docker socket endpoint | `tcp://docker-proxy:2375` | Yes |
### Ports
- **10000** - Sablier API endpoint (internal use only)
### Volumes
None required - Sablier communicates with Docker via API
### Networks
- **traefik-network** - Required for communication with Traefik
- **homelab-network** - Required for Docker API access
## Using Sablier
### Basic Usage
1. **Enable on Service:** Add `sablier.enable=true` label to any service
2. **Access Service:** Navigate to the service URL in your browser
3. **Automatic Startup:** Sablier detects the request and starts the container
4. **Wait for Ready:** Service starts and health checks pass
5. **Use Service:** Container is now running and accessible
6. **Automatic Shutdown:** Container stops after 5 minutes of inactivity
### Monitoring Lazy Loading
Check which services are managed by Sablier:
```bash
# View all containers with Sablier labels
docker ps --filter "label=sablier.enable=true" --format "table {{.Names}}\t{{.Status}}"
# Check Sablier logs
docker logs sablier
# View Traefik routes that trigger lazy loading
docker logs traefik | grep sablier
```
### Service Groups
Group related services that should start/stop together:
```yaml
# Database and web app in same group
services:
  myapp:
    labels:
      - "sablier.enable=true"
      - "sablier.group=myapp-stack"
  myapp-db:
    labels:
      - "sablier.enable=true"
      - "sablier.group=myapp-stack"
```
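
To see which containers belong to a group and whether the group is currently idle, the group label can be queried with `docker ps`. The helper names below are hypothetical; only the `sablier.group` label comes from the configuration above.

```shell
#!/usr/bin/env bash
# Sketch: inspect a Sablier group through its label.
# group_members and group_is_idle are illustrative helper names.

# List every container in the group (running or not) with its status.
group_members() {
  docker ps -a --filter "label=sablier.group=$1" --format '{{.Names}}\t{{.Status}}'
}

# Succeed when no container of the group is currently running.
group_is_idle() {
  group="$1"
  running="$(docker ps -q --filter "label=sablier.group=$group" --filter "status=running")"
  [ -z "$running" ]
}
```

For example, `group_is_idle myapp-stack && echo "myapp-stack is stopped"` confirms that both the app and its database were shut down together.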
### Custom Timeouts
Set different inactivity timeouts per service:
```yaml
labels:
- "sablier.enable=true"
- "sablier.timeout=600" # 10 minutes
```
## Advanced Topics
### Performance Considerations
- **Startup Time:** Services take longer to respond on first access
- **Resource Spikes:** Multiple services starting simultaneously can cause load
- **Health Checks:** Ensure services have proper health checks for reliable startup
### Troubleshooting Startup Issues
- **Container Won't Start:** Check Docker logs for the failing container
- **Health Check Fails:** Verify service health endpoints are working
- **Network Issues:** Ensure containers are on the correct Docker network
### Integration with Monitoring
Sablier works with existing monitoring:
- **Prometheus:** Can monitor Sablier API metrics
- **Grafana:** Visualize container start/stop events
- **Dozzle:** View logs from lazy-loaded containers
## Troubleshooting
### Service Won't Start Automatically
**Symptoms:** Accessing service URL shows connection error
**Solutions:**
```bash
# Check if Sablier is running
docker ps | grep sablier
# Verify service has correct labels
docker inspect container-name | grep sablier
# Check Sablier logs
docker logs sablier
# Test manual container start
docker start container-name
```
### Containers Not Stopping
**Symptoms:** Containers remain running after inactivity timeout
**Solutions:**
```bash
# Check timeout configuration
docker inspect container-name | grep sablier.timeout
# Verify Sablier has access to Docker API
docker exec sablier curl -f http://docker-proxy:2375/_ping
# Check for active connections
netstat -tlnp | grep :port
```
### Traefik Routing Issues
**Symptoms:** Service accessible but Sablier not triggering
**Solutions:**
```bash
# Verify Traefik labels
docker inspect container-name | grep traefik
# Check Traefik configuration
docker logs traefik | grep "Creating router"
# Test direct access (bypass Sablier)
curl http://container-name:port/health
```
### Common Issues
**Issue:** Services start but are not accessible
**Fix:** Ensure services are on the `traefik-network`
**Issue:** Sablier can't connect to Docker API
**Fix:** Verify `DOCKER_HOST` environment variable and network connectivity
**Issue:** Containers start but health checks fail
**Fix:** Add proper health checks to service configurations
**Issue:** High resource usage during startup
**Fix:** Stagger service startups or increase system resources
### Performance Tuning
- **Increase Timeouts:** For services that need longer inactivity periods
- **Group Services:** Related services can share startup/shutdown cycles
- **Monitor Resources:** Use Glances or Prometheus to track resource usage
- **Optimize Health Checks:** Ensure health checks are fast and reliable
### Getting Help
- **GitHub Issues:** https://github.com/sablierapp/sablier/issues
- **Community:** Check Traefik and Docker forums for lazy loading discussions
- **Logs:** Enable debug logging with `SABLIER_LOG_LEVEL=debug`
```yaml
- "sablier.start-on-demand=true"  # Enable lazy loading
```
### Traefik Middleware
Configure Sablier middleware in Traefik dynamic configuration:
```yaml
http:
  middlewares:
    sablier-service:
      plugin:
        sablier:
          sablierUrl: http://sablier-service:10000
          group: core-service-name
          sessionDuration: 2m  # How long to keep service running after access
          ignoreUserAgent: curl  # Don't start service for curl requests
          dynamic:
            displayName: Service Name
            theme: ghost
            show-details-by-default: true
```
## Examples
### Basic Service with Lazy Loading
```yaml
services:
  my-service:
    image: my-service:latest
    container_name: my-service
    restart: "no"  # Important: Must be "no" for Sablier to control start/stop
    networks:
      - traefik-network
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.my-service.rule=Host(`my-service.${DOMAIN}`)"
      - "traefik.http.routers.my-service.entrypoints=websecure"
      - "traefik.http.routers.my-service.tls.certresolver=letsencrypt"
      - "traefik.http.routers.my-service.middlewares=authelia@docker"
      - "traefik.http.services.my-service.loadbalancer.server.port=8080"
      # Sablier lazy loading
      - "sablier.enable=true"
      - "sablier.group=core-my-service"
      - "sablier.start-on-demand=true"
```
### Remote Service Proxy
For services on remote servers, configure Traefik routes with Sablier middleware:
```yaml
# In /opt/stacks/core/traefik/dynamic/remote-services.yml
http:
  routers:
    remote-service:
      rule: "Host(`remote-service.${DOMAIN}`)"
      entryPoints:
        - websecure
      service: remote-service
      tls:
        certResolver: letsencrypt
      middlewares:
        - sablier-remote-service@file
  services:
    remote-service:
      loadBalancer:
        servers:
          - url: "http://remote-server-ip:port"
  middlewares:
    sablier-remote-service:
      plugin:
        sablier:
          sablierUrl: http://sablier-service:10000
          group: remote-server-group
          sessionDuration: 5m
          displayName: Remote Service
```
## Troubleshooting
### Service Won't Start
**Check Sablier logs:**
```bash
cd /opt/stacks/core
docker compose logs sablier-service
```
**Verify container permissions:**
```bash
# Check that Sablier's own health endpoint responds (requires curl inside the container)
docker exec sablier-service curl -f http://localhost:10000/health
```
### Services Not Starting on Demand
**Check Traefik middleware configuration:**
```bash
# Verify middleware is loaded
docker logs traefik | grep sablier
```
**Check service labels:**
```bash
# Verify Sablier labels are present
docker inspect service-name | grep sablier
```
### Services Stop Too Quickly
**Increase session duration:**
```yaml
middlewares:
  sablier-service:
    plugin:
      sablier:
        sessionDuration: 10m  # Increase from default
```
### Performance Issues
**Check resource usage:**
```bash
docker stats sablier-service
```
**Monitor Docker API calls:**
```bash
docker logs sablier-service | grep "API call"
```
## Best Practices
### Resource Management
- Use lazy loading for services that aren't accessed frequently
- Set appropriate session durations based on usage patterns
- Monitor resource usage to ensure adequate system capacity
### Configuration
- **Always set `restart: "no"`** for Sablier-managed services to allow full lifecycle control
- Group related services together for coordinated startup
- Use descriptive display names for the loading page
- Configure appropriate timeouts for your use case
### Security
- Sablier runs with Docker API access - ensure proper network isolation
- Use Docker socket proxy for additional security
- Monitor Sablier logs for unauthorized access attempts
## Integration with Other Services
### Homepage Dashboard
Add Sablier status to Homepage:
```yaml
# In homepage config
- Core Infrastructure:
    - Sablier:
        icon: docker.png
        href: http://sablier-service:10000
        description: Lazy loading service
        widget:
          type: iframe
          url: http://sablier-service:10000
```
### Monitoring
Monitor Sablier with Prometheus metrics (if available) or basic health checks:
```yaml
# Health check
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:10000/health"]
  interval: 30s
  timeout: 10s
  retries: 3
```
## Advanced Configuration
### Custom Themes
Sablier supports different loading page themes:
```yaml
dynamic:
  displayName: My Service
  theme: ghost  # Options: ghost, shuffle, hacker-terminal, matrix
  show-details-by-default: true
```
### Group Management
Services can be grouped for coordinated startup:
```yaml
# All services in the same group start together
labels:
- "sablier.group=media-stack"
- "sablier.enable=true"
- "sablier.start-on-demand=true"
```
### API Access
Sablier provides a REST API for programmatic control:
```bash
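# Check that Sablier is up
curl http://sablier-service:10000/health
# Start a service group and wait until it is ready (blocking strategy)
curl "http://sablier-service:10000/api/strategies/blocking?group=media-stack&session_duration=5m&timeout=60s"
# Start a service group and get the waiting page (dynamic strategy)
curl "http://sablier-service:10000/api/strategies/dynamic?group=media-stack&session_duration=5m"
# There is no explicit stop call: the group stops when its session expires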
```
## Migration from Manual Management
When adding Sablier to existing services:
1. **Change restart policy** to `"no"` in the compose file (critical for Sablier control)
2. **Add Sablier labels** to the service compose file
3. **Configure Traefik middleware** for the service
4. **Stop the service** initially (let Sablier manage it)
5. **Test access** - service should start automatically
6. **Monitor logs** to ensure proper operation
> **Important**: Services managed by Sablier must have `restart: "no"` to allow Sablier full control over container lifecycle. Do not use `unless-stopped`, `always`, or `on-failure` restart policies.
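
Since a wrong restart policy silently defeats lazy loading, it can help to audit all Sablier-enabled containers after migration. The following sketch assumes the `sablier.enable=true` label described above; `check_restart_policies` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: report Sablier-managed containers whose restart policy is not "no".
# check_restart_policies is an illustrative helper, not part of this repo.

check_restart_policies() {
  bad=0
  for name in $(docker ps -a --filter "label=sablier.enable=true" --format '{{.Names}}'); do
    policy="$(docker inspect -f '{{.HostConfig.RestartPolicy.Name}}' "$name")"
    case "$policy" in
      no|"") ;;  # "no" (or empty on older Docker versions) is what Sablier needs
      *)
        echo "$name: restart policy is '$policy', expected 'no'"
        bad=1
        ;;
    esac
  done
  return "$bad"
}
```

The function returns a non-zero status when at least one container needs fixing, so it can be used as a guard in a deployment script.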
## Related Documentation
- [Traefik Documentation](traefik.md) - Reverse proxy configuration
- [Authelia Documentation](authelia.md) - SSO authentication
- [On-Demand Remote Services](../Ondemand-Remote-Services.md) - Remote service setup guide

@@ -502,3 +502,16 @@ Traefik is the heart of your homelab's networking infrastructure. It:
- Scales from simple to complex setups
Understanding Traefik is crucial for managing your homelab effectively. Take time to explore the dashboard and understand how routing works - it will make troubleshooting and adding new services much easier.
## Related Services
- **[Authelia](authelia.md)** - SSO authentication that integrates with Traefik
- **[Sablier](sablier.md)** - Lazy loading that works with Traefik routing
- **[DuckDNS](duckdns.md)** - Dynamic DNS for SSL certificate validation
- **[Gluetun](gluetun.md)** - VPN routing that can work alongside Traefik
## See Also
- **[Traefik Labels Guide](../docker-guidelines.md#traefik-label-patterns)** - How to configure services for Traefik
- **[SSL Certificate Setup](../getting-started.md#notes-about-ssl-certificates-from-letsencrypt-with-duckdns)** - How SSL certificates work with Traefik
- **[External Host Proxying](../proxying-external-hosts.md)** - Route non-Docker services through Traefik

@@ -209,7 +209,7 @@ tar -czf vaultwarden-backup-$(date +%Y%m%d).tar.gz \
# Start container
docker start vaultwarden
# Or use Backrest/Duplicati for automatic backups
# Or use Backrest (default) for automatic backups
```
## Summary

@@ -1,6 +1,6 @@
# Services Overview
This document provides a comprehensive overview of all 60+ pre-configured services available in the AI-Homelab repository.
This document provides a comprehensive overview of all 70+ pre-configured services available in the AI-Homelab repository.
## Services Overview

@@ -1,9 +1,11 @@
# AI-Homelab Setup Scripts
This directory contains two scripts for automated AI-Homelab deployment:
This directory contains scripts for automated AI-Homelab deployment and management:
1. **setup-homelab.sh** - System preparation
2. **deploy-homelab.sh** - Core infrastructure deployment
3. **reset-test-environment.sh** - Safe test environment cleanup
4. **reset-ondemand-services.sh** - Reload services for Sablier lazy loading
## setup-homelab.sh
@@ -186,3 +188,135 @@ cd /opt/stacks/infrastructure && docker compose up -d
- Waits up to 60 seconds for Dockge to become ready
- Automatically copies .env to stack directories
- Safe to run multiple times (idempotent)
---
## reset-test-environment.sh
Safe cleanup script for testing environments. Completely removes all deployed services, data, and configurations while preserving the underlying system setup. Intended for development and testing scenarios only.
### What It Does
1. **Stop All Stacks** - Gracefully stops dashboards, infrastructure, and core stacks
2. **Preserve SSL Certificates** - Backs up `acme.json` to the repository folder for reuse
3. **Remove Docker Volumes** - Deletes all homelab-related named volumes (data will be lost)
4. **Clean Stack Directories** - Removes `/opt/stacks/core`, `/opt/stacks/infrastructure`, `/opt/stacks/dashboards`
5. **Clear Dockge Data** - Removes Dockge's persistent data directory
6. **Clean Temporary Files** - Removes temporary files and setup artifacts
7. **Remove Networks** - Deletes homelab-network, traefik-network, dockerproxy-network, media-network
8. **Prune Resources** - Runs Docker system prune to clean up unused resources
### Usage
```bash
cd ~/AI-Homelab
# Make the script executable (if needed)
chmod +x scripts/reset-test-environment.sh
# Run with sudo (required for system cleanup)
sudo ./scripts/reset-test-environment.sh
```
### Safety Features
- **Confirmation Required** - Must type "yes" to confirm reset
- **Root Check** - Ensures running with sudo but not as root user
- **Colored Output** - Clear visual feedback for each step
- **Error Handling** - Continues with warnings if some operations fail
- **Preserves System** - Docker, packages, user groups, and firewall settings remain intact
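
The script itself is not reproduced in this document, so the following is only a guess at how the confirmation prompt and the "sudo but not root" check could be written. Both function names are hypothetical, and the prompt text is invented for illustration.

```shell
#!/usr/bin/env bash
# Sketch of two safety guards for a destructive reset script.
# Function names and messages are assumptions, not taken from the real script.

# Succeed only when running with EUID 0 via sudo from a non-root account.
require_sudo_non_root() {
  euid="$1"
  sudo_user="$2"
  [ "$euid" -eq 0 ] || { echo "Please run with sudo" >&2; return 1; }
  [ -n "$sudo_user" ] && [ "$sudo_user" != "root" ] || {
    echo "Run via sudo from a regular user account, not as root" >&2
    return 1
  }
}

# Succeed only when the user types exactly "yes".
confirm_reset() {
  printf 'This will delete all homelab data. Type "yes" to continue: ' >&2
  read -r answer || return 1
  [ "$answer" = "yes" ]
}

# Typical use at the top of a script:
#   require_sudo_non_root "$(id -u)" "${SUDO_USER:-}" || exit 1
#   confirm_reset || exit 1
```

Passing the user ID and `SUDO_USER` as arguments keeps the check easy to test without actually running as root.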
### After Running
The system will be returned to a clean state ready for re-deployment:
1. Ensure `.env` file is properly configured
2. Run: `sudo ./scripts/setup-homelab.sh` (if needed)
3. Run: `./scripts/deploy-homelab.sh`
### Requirements
- Docker and Docker Compose installed
- Root access (via sudo)
- Existing AI-Homelab deployment
### Warnings
- **DATA LOSS** - All application data, databases, and configurations will be permanently deleted
- **SSL Certificates** - Preserved in repository folder but must be manually restored if needed
- **Production Use** - This script is for testing only - DO NOT use in production environments
### Notes
- Preserves Docker installation and system packages
- Maintains user group memberships and firewall rules
- SSL certificates are backed up to `~/AI-Homelab/acme.json`
- Safe to run multiple times
- Provides clear next steps after completion
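
Restoring the preserved certificate file is a manual step. A minimal sketch follows, assuming Traefik reads `acme.json` from `/opt/stacks/core/traefik/acme.json` (that destination path is an assumption; check your Traefik volume mapping). Traefik expects the file to have `600` permissions, which `install -m 600` sets during the copy.

```shell
#!/usr/bin/env bash
# Sketch: copy a backed-up acme.json into place with the permissions
# Traefik requires. restore_acme and the destination path are assumptions.

restore_acme() {
  src="$1"   # e.g. ~/AI-Homelab/acme.json
  dest="$2"  # e.g. /opt/stacks/core/traefik/acme.json (assumed location)

  [ -f "$src" ] || { echo "No backup found at $src" >&2; return 1; }
  mkdir -p "$(dirname "$dest")" || return 1
  install -m 600 "$src" "$dest"
}
```

Run it before starting the core stack so that Traefik reuses the existing certificates instead of requesting new ones.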
---
## reset-ondemand-services.sh
Service management script for Sablier lazy loading. Restarts stacks to reload configuration changes and stops web services so Sablier can control them on-demand, while keeping databases running.
### What It Does
1. **Restart Stacks** - Brings down and back up various service stacks to reload compose file changes
2. **Stop Web Services** - Stops containers with `sablier.enable=true` label so Sablier can start them on-demand
3. **Preserve Databases** - Leaves database containers running for data persistence
### Supported Stacks
The script manages the following stacks:
- arr-stack (Sonarr, Radarr, Prowlarr)
- backrest (backup management)
- bitwarden (password manager)
- bookstack (documentation)
- code-server (VS Code server)
- dokuwiki (wiki)
- dozzle (log viewer)
- duplicati (alternative backup)
- formio (form builder)
- gitea (git server)
- glances (system monitor)
- mealie (recipe manager)
- mediawiki (wiki)
- nextcloud (cloud storage)
- tdarr (media processing)
- unmanic (media optimization)
- wordpress (blog/CMS)
### Usage
```bash
cd ~/AI-Homelab
# Make the script executable (if needed)
chmod +x scripts/reset-ondemand-services.sh
# Run as regular user (docker group membership required)
./scripts/reset-ondemand-services.sh
```
### When to Use
- After modifying compose files for Sablier lazy loading configuration
- When services need to reload configuration changes
- To ensure Sablier has control over web service startup
- During initial setup of lazy loading for multiple services
### Requirements
- Docker and Docker Compose installed
- User must be in docker group
- Sablier must be running in core stack
- Service stacks must be deployed
### Notes
- Handles different compose file naming conventions (.yml vs .yaml)
- Stops only services with Sablier labels enabled
- Databases remain running to preserve data
- Safe to run multiple times
- Provides clear feedback on operations performed
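
The `.yml` versus `.yaml` handling mentioned in the notes can be done by probing the stack directory instead of keeping a per-stack list of exceptions. This is a sketch of that alternative, not the logic the script currently uses; `find_compose_file` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: locate a stack's compose file regardless of naming convention.
# find_compose_file is an illustrative helper, not part of the script.

find_compose_file() {
  dir="$1"
  for name in docker-compose.yml docker-compose.yaml compose.yml compose.yaml; do
    if [ -f "$dir/$name" ]; then
      printf '%s\n' "$dir/$name"
      return 0
    fi
  done
  return 1
}
```

A loop over stacks could then use `file="$(find_compose_file "/opt/stacks/$stack")" || continue` to skip stacks that are not deployed.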

@@ -1,202 +0,0 @@
#!/bin/bash
# AI-Homelab Resource Limits Application Script
# Applies researched resource limits to all Docker Compose stacks
# Run as: sudo ./apply-resource-limits.sh
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${BLUE}[INFO]${NC} $1"
}
log_success() {
echo -e "${GREEN}[SUCCESS]${NC} $1"
}
log_warning() {
echo -e "${YELLOW}[WARNING]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
# Check if running as root
if [ "$EUID" -ne 0 ]; then
log_error "Please run as root (sudo ./apply-resource-limits.sh)"
exit 1
fi
# Get actual user
ACTUAL_USER="${SUDO_USER:-$USER}"
log_info "Applying researched resource limits to all stacks..."
echo ""
# Function to add resource limits to a service in docker-compose.yml
add_resource_limits() {
local compose_file="$1"
local service_name="$2"
local template="$3"
# Define resource limits based on template
case $template in
"lightweight")
limits="cpus: '0.25'\n memory: 128M\n pids: 256"
reservations="cpus: '0.10'\n memory: 64M"
;;
"web")
limits="cpus: '0.50'\n memory: 256M\n pids: 512"
reservations="cpus: '0.25'\n memory: 128M"
;;
"database")
limits="cpus: '1.0'\n memory: 1G\n pids: 1024"
reservations="cpus: '0.50'\n memory: 512M"
;;
"media")
limits="cpus: '2.0'\n memory: 2G\n pids: 2048"
reservations="cpus: '1.0'\n memory: 1G"
;;
"downloader")
limits="cpus: '1.0'\n memory: 512M\n pids: 1024"
reservations="cpus: '0.50'\n memory: 256M"
;;
"heavy")
limits="cpus: '1.5'\n memory: 1G\n pids: 2048"
reservations="cpus: '0.75'\n memory: 512M"
;;
"monitoring")
limits="cpus: '0.75'\n memory: 512M\n pids: 1024"
reservations="cpus: '0.25'\n memory: 256M"
;;
*)
log_warning "Unknown template: $template for $service_name"
return
;;
esac
# Check if service already has deploy.resources
if grep -A 10 " $service_name:" "$compose_file" | grep -q "deploy:"; then
log_warning "$service_name in $compose_file already has deploy section - skipping"
return
fi
# Find the service definition and add deploy.resources after the image line
if grep -q "^ $service_name:" "$compose_file"; then
# Create a temporary file with the deploy section
local deploy_section=" deploy:
resources:
limits:
$limits
reservations:
$reservations"
# Use awk to insert the deploy section after the image line
awk -v service="$service_name" -v deploy="$deploy_section" '
/^ '"$service_name"':/ { in_service=1 }
in_service && /^ image:/ {
print $0
print deploy
in_service=0
next
}
{ print }
' "$compose_file" > "${compose_file}.tmp" && mv "${compose_file}.tmp" "$compose_file"
log_success "Added $template limits to $service_name in $(basename "$compose_file")"
else
log_warning "Service $service_name not found in $compose_file"
fi
}
# Process each stack
STACKS_DIR="/opt/stacks"
# Core stack (already has some limits)
log_info "Processing core stack..."
if [ -f "$STACKS_DIR/core/docker-compose.yml" ]; then
# DuckDNS is already done, check if others need limits
if ! grep -A 5 " authelia:" "$STACKS_DIR/core/docker-compose.yml" | grep -q "deploy:"; then
add_resource_limits "$STACKS_DIR/core/docker-compose.yml" "authelia" "lightweight"
fi
fi
# Infrastructure stack
log_info "Processing infrastructure stack..."
if [ -f "$STACKS_DIR/infrastructure/docker-compose.yml" ]; then
add_resource_limits "$STACKS_DIR/infrastructure/docker-compose.yml" "pihole" "lightweight"
add_resource_limits "$STACKS_DIR/infrastructure/docker-compose.yml" "dockge" "web"
add_resource_limits "$STACKS_DIR/infrastructure/docker-compose.yml" "glances" "web"
fi
# Dashboard stack
log_info "Processing dashboard stack..."
if [ -f "$STACKS_DIR/dashboards/docker-compose.yml" ]; then
add_resource_limits "$STACKS_DIR/dashboards/docker-compose.yml" "homepage" "web"
add_resource_limits "$STACKS_DIR/dashboards/docker-compose.yml" "homarr" "web"
fi
# Media stack
log_info "Processing media stack..."
if [ -f "$STACKS_DIR/media/docker-compose.yml" ]; then
add_resource_limits "$STACKS_DIR/media/docker-compose.yml" "jellyfin" "media"
add_resource_limits "$STACKS_DIR/media/docker-compose.yml" "calibre-web" "web"
fi
# Downloaders stack
log_info "Processing downloaders stack..."
if [ -f "$STACKS_DIR/downloaders/docker-compose.yml" ]; then
add_resource_limits "$STACKS_DIR/downloaders/docker-compose.yml" "qbittorrent" "downloader"
fi
# Home Assistant stack
log_info "Processing home assistant stack..."
if [ -f "$STACKS_DIR/homeassistant/docker-compose.yml" ]; then
add_resource_limits "$STACKS_DIR/homeassistant/docker-compose.yml" "homeassistant" "heavy"
add_resource_limits "$STACKS_DIR/homeassistant/docker-compose.yml" "esphome" "web"
add_resource_limits "$STACKS_DIR/homeassistant/docker-compose.yml" "nodered" "web"
fi
# Productivity stack
log_info "Processing productivity stack..."
if [ -f "$STACKS_DIR/productivity/docker-compose.yml" ]; then
add_resource_limits "$STACKS_DIR/productivity/docker-compose.yml" "nextcloud" "heavy"
add_resource_limits "$STACKS_DIR/productivity/docker-compose.yml" "gitea" "web"
fi
# Monitoring stack
log_info "Processing monitoring stack..."
if [ -f "$STACKS_DIR/monitoring/docker-compose.yml" ]; then
add_resource_limits "$STACKS_DIR/monitoring/docker-compose.yml" "prometheus" "monitoring"
add_resource_limits "$STACKS_DIR/monitoring/docker-compose.yml" "grafana" "web"
add_resource_limits "$STACKS_DIR/monitoring/docker-compose.yml" "loki" "monitoring"
add_resource_limits "$STACKS_DIR/monitoring/docker-compose.yml" "uptime-kuma" "web"
fi
# Development stack
log_info "Processing development stack..."
if [ -f "$STACKS_DIR/development/docker-compose.yml" ]; then
add_resource_limits "$STACKS_DIR/development/docker-compose.yml" "code-server" "heavy"
fi
# Fix ownership
chown -R "$ACTUAL_USER:$ACTUAL_USER" "$STACKS_DIR"
echo ""
log_success "Resource limits application complete!"
echo ""
log_info "Next steps:"
echo " 1. Review the applied limits: docker compose config"
echo " 2. Deploy updated stacks: docker compose up -d"
echo " 3. Monitor usage: docker stats"
echo " 4. Adjust limits as needed based on real usage"
echo ""
log_info "Note: These are conservative defaults based on typical usage patterns."
log_info "Monitor actual resource usage and adjust limits accordingly."

@@ -0,0 +1,40 @@
#!/bin/bash
# Script to bring stacks down and back up (to reload compose file changes)
# then stop the webui containers so Sablier can control them
# leaving the databases running
stacks=(
  "arr-stack"
  "backrest"
  "bitwarden"
  "bookstack"
  "code-server"
  "dokuwiki"
  "dozzle"
  "duplicati"
  "formio"
  "gitea"
  "glances"
  "mealie"
  "mediawiki"
  "nextcloud"
  "tdarr"
  "unmanic"
  "wordpress"
)
for stack in "${stacks[@]}"; do
  if [[ "$stack" == "duplicati" || "$stack" == "formio" || "$stack" == "tdarr" ]]; then
    file="/opt/stacks/$stack/docker-compose.yml"
  else
    file="/opt/stacks/$stack/docker-compose.yaml"
  fi
  echo "Starting $stack..."
  docker compose -f "$file" stop && docker compose -f "$file" up -d
done
echo "Stopping web services to enable on-demand starting by Sablier..."
docker ps --filter "label=sablier.enable=true" --format "{{.Names}}" | xargs -r docker stop
echo "All stacks started (DBs running). Web services stopped and will be started on-demand by Sablier."