From afdc99a5a240a7e014195726f24f350d60e91fb2 Mon Sep 17 00:00:00 2001
From: kelin
Date: Tue, 13 Jan 2026 21:46:01 -0500
Subject: [PATCH] Round 9: Complete documentation overhaul
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Major documentation updates reflecting current deployment state:

README.md:
✅ Updated features list with automated setup capabilities
✅ Accurate deployment workflow (12 containers, 7 additional stacks)
✅ Corrected setup script description (automated Authelia secrets)
✅ Removed manual secret generation steps
✅ Added note about optional image pre-pull
✅ Clarified .env requirements (Authelia secrets now automated)

docs/getting-started.md:
✅ Streamlined quick setup workflow (removed manual secrets)
✅ Comprehensive setup script capabilities list
✅ Detailed deployment script behavior
✅ Added interactive Authelia configuration details
✅ Clarified argon2id password hash generation process
✅ Updated login credentials location
✅ Removed outdated .env location warnings

docs/quick-reference.md:
✅ Added deployment scripts reference section
✅ Complete setup-homelab.sh documentation
✅ Complete deploy-homelab.sh documentation
✅ Added reset-test-environment.sh with warnings
✅ Updated stack overview with container counts
✅ Clarified which stacks deploy by default vs available in Dockge
✅ Noted Watchtower temporary disable status

docs/troubleshooting/COMMON-ISSUES.md (NEW):
✅ Installation issues (Docker permissions, timeouts, port conflicts)
✅ Deployment issues (Authelia loops, Watchtower status, Homepage URLs)
✅ Service-specific issues (Gluetun, Pi-hole, Dockge)
✅ Performance troubleshooting
✅ Reset and recovery procedures with warnings
✅ Complete getting help section with commands

Total changes: 456 additions, 71 modifications across 4 files
Documentation now accurately reflects Round 9 deployment capabilities
---
 README.md                             |  66 +++---
 docs/getting-started.md               |  90 ++++----
 docs/quick-reference.md               |  82 +++++--
 docs/troubleshooting/COMMON-ISSUES.md | 293 ++++++++++++++++++++++++++
 4 files changed, 458 insertions(+), 73 deletions(-)
 create mode 100644 docs/troubleshooting/COMMON-ISSUES.md

diff --git a/README.md b/README.md
index aba522a..7f03ec0 100644
--- a/README.md
+++ b/README.md
@@ -21,15 +21,16 @@ The infrastructure uses Traefik for reverse proxy with automatic SSL, Authelia f
 ## Features
 
 - **AI-Powered Management**: GitHub Copilot integration with specialized instructions for Docker service management
-- **Dockge Structure**: All stacks organized in `/opt/stacks/` for easy management via Dockge
+- **Automated Setup & Deployment**: Two-script installation process with intelligent error handling
+- **Dockge Structure**: All stacks organized in `/opt/stacks/` for easy management via Dockge web UI
 - **40+ Pre-configured Services**: Production-ready compose files across infrastructure, media, home automation, productivity, and monitoring
 - **Traefik Reverse Proxy**: Automatic HTTPS with Let's Encrypt via file-based configuration (no web UI needed)
-- **Authelia SSO**: Single Sign-On protection for all admin interfaces with smart bypass rules for media apps
+- **Authelia SSO**: Single Sign-On protection for all admin interfaces with automated password generation
 - **Gluetun VPN**: Surfshark WireGuard integration for secure downloads
-- **Homepage Dashboard**: AI-configurable dashboard with Docker integration and service widgets
+- **Homepage Dashboard**: AI-configurable dashboard with automatic domain variable replacement
 - **External Host Proxying**: Proxy external services (Raspberry Pi, routers, NAS) through Traefik
 - **Stack-Aware Changes**: AI considers the entire infrastructure when making changes
-- **Comprehensive Documentation**: Detailed guidelines including proxying external hosts
+- **Comprehensive Documentation**: Detailed guidelines including proxying external hosts and troubleshooting
 - **File-Based Configuration**: Everything managed via YAML files - no web UI dependencies
 
 ## Documentation
@@ -62,13 +63,23 @@ The infrastructure uses Traefik for reverse proxy with automatic SSL, Authelia f
    cd AI-Homelab
    ```
 
-2. **(Optional) Run first-run setup script:**
+2. **Run first-run setup script:**
 
-   For fresh Debian installations only, this automated script will install Docker Engine + Compose V2, configure user groups, detect NVIDIA GPU, and create directory structure.
+   This automated script handles system preparation and Authelia configuration. Safe to run on partially configured systems - it skips completed steps.
 
    ```bash
    sudo ./scripts/setup-homelab.sh
    ```
+
+   The script will:
+   - Install Docker Engine + Compose V2 (if needed)
+   - Configure user groups and firewall
+   - Detect NVIDIA GPU and offer driver installation
+   - **Generate Authelia secrets automatically**
+   - **Create admin user and password hash**
+   - Set up directory structure and Docker networks
+
+   **Important:** Log out and back in (or run `newgrp docker`) after setup completes.
 
 3. **Create and configure environment file:**
    ```bash
@@ -81,41 +92,46 @@ The infrastructure uses Traefik for reverse proxy with automatic SSL, Authelia f
 
    **Required variables:**
    - `DOMAIN` - Your DuckDNS domain (e.g., yourdomain.duckdns.org)
-   - `DUCKDNS_TOKEN` - Your DuckDNS token
+   - `DUCKDNS_TOKEN` - Your DuckDNS token from [duckdns.org](https://www.duckdns.org/)
    - `ACME_EMAIL` - Your email for Let's Encrypt certificates
-   - `AUTHELIA_JWT_SECRET` - Generate with: `openssl rand -hex 64`
-   - `AUTHELIA_SESSION_SECRET` - Generate with: `openssl rand -hex 64`
-   - `AUTHELIA_STORAGE_ENCRYPTION_KEY` - Generate with: `openssl rand -hex 64`
-   - `SURFSHARK_USERNAME` and `SURFSHARK_PASSWORD` - If using VPN
+   - `SURFSHARK_USERNAME` and `SURFSHARK_PASSWORD` - If using VPN features
+
+   **Note:** Authelia secrets (`AUTHELIA_JWT_SECRET`, `AUTHELIA_SESSION_SECRET`, `AUTHELIA_STORAGE_ENCRYPTION_KEY`) are automatically generated by the setup script. You can pre-generate them if desired using `openssl rand -hex 64`.
 
    > See [Getting Started](docs/getting-started.md) for detailed instructions
 
 4. **Run deployment script:**
 
    This automated script will:
-   - Create Docker networks
-   - Configure Traefik with your email
-   - Generate Authelia admin password (saved to `/opt/stacks/core/authelia/ADMIN_PASSWORD.txt`)
-   - Deploy core stack (DuckDNS, Traefik, Authelia, Gluetun)
-   - Deploy infrastructure stack (Dockge, Pi-hole, monitoring tools)
-   - Deploy dashboards stack (Homepage, Homarr)
+   - Configure Traefik with your email and domain
+   - Deploy admin password from setup script
+   - Deploy core stack (DuckDNS, Traefik, Authelia, Gluetun) - 4 services
+   - Deploy infrastructure stack (Dockge, Pi-hole, monitoring) - 6 services
+   - Deploy dashboards stack (Homepage with configured URLs, Homarr) - 2 services
+   - **Prepare 7 additional stacks in Dockge** (not started, ready to deploy)
+   - Wait for services to be healthy
    - Open Dockge in your browser
 
    ```bash
    ./scripts/deploy-homelab.sh
    ```
 
-   **Login credentials:** Username: `admin` | Password: Check `/opt/stacks/core/authelia/ADMIN_PASSWORD.txt`
+   **Login credentials:** Check script output or `/opt/stacks/core/authelia/ADMIN_PASSWORD.txt`
+
+   **Note:** The script will prompt to optionally pre-pull images for additional stacks. This takes time but speeds up future deployments. Default is no.
 
 5. **Deploy additional stacks through Dockge:**
 
-   Log in to Dockge at `https://dockge.yourdomain.duckdns.org` and deploy additional stacks from the repository's `docker-compose/` directory:
-   - `media.yml` - Plex, Jellyfin, Sonarr, Radarr, etc.
-   - `media-extended.yml` - Readarr, Lidarr, etc.
-   - `homeassistant.yml` - Home Assistant and accessories
-   - `productivity.yml` - Nextcloud, Gitea, wikis
-   - `monitoring.yml` - Grafana, Prometheus, etc.
-   - `utilities.yml` - Backups, code editors, etc.
+   Log in to Dockge at `https://dockge.yourdomain.duckdns.org` - all stacks are already loaded and ready to deploy:
+   - **media** - Plex, Jellyfin, Sonarr, Radarr, Prowlarr, qBittorrent
+   - **media-extended** - Readarr, Lidarr, Mylar, Calibre
+   - **homeassistant** - Home Assistant, Node-RED, Zigbee2MQTT, ESPHome
+   - **productivity** - Nextcloud, Gitea, Bookstack, Outline, Excalidraw
+   - **monitoring** - Grafana, Prometheus, Uptime Kuma, Netdata
+   - **utilities** - Duplicati, Code Server, FreshRSS, Wallabag
+   - **alternatives** - Portainer, Authentik (alternative to Dockge/Authelia)
+
+   Simply click any stack in Dockge and press "Start" to deploy it.
 
 6. **Configure VS Code to control the server via GitHub Copilot**
 
diff --git a/docs/getting-started.md b/docs/getting-started.md
index 8a2876b..55e0681 100644
--- a/docs/getting-started.md
+++ b/docs/getting-started.md
@@ -27,42 +27,41 @@ For most users, the automated setup script handles everything:
 4. **Configure environment**:
    ```bash
    cp .env.example .env
-   nano .env # Edit with your settings and paste the Authelia secrets
+   nano .env # Edit with your domain and tokens
    ```
 
-   **Testing considerations: .env File Location**
-   - The `.env` file should remain in the **repository folder** (`~/AI-Homelab/.env`)
-   - The deploy script will automatically copy it to `/opt/stacks/*/` as needed
-   - Always edit the repo copy, not the deployed copies
-   - Changes to deployed copies will be overwritten on next deployment
-
    **Required variables in .env:**
    - `DOMAIN` - Your DuckDNS domain (e.g., yourdomain.duckdns.org)
-   - `DUCKDNS_TOKEN` - Your DuckDNS token
+   - `DUCKDNS_TOKEN` - Your DuckDNS token from [duckdns.org](https://www.duckdns.org/)
    - `ACME_EMAIL` - Your email for Let's Encrypt certificates
-   - `AUTHELIA_JWT_SECRET` - Generated in step 6
-   - `AUTHELIA_SESSION_SECRET` - Generated in step 6
-   - `AUTHELIA_STORAGE_ENCRYPTION_KEY` - Generated in step 6
    - `SURFSHARK_USERNAME` and `SURFSHARK_PASSWORD` - If using VPN
+
+   **Note:** The `.env` file stays in the repository folder (`~/AI-Homelab/.env`). The deploy script copies it to stack directories automatically. Authelia secrets are generated by the setup script.
 
-5. **Run the setup script**
-   ```bash
-   sudo ./scripts/setup-homelab.sh
+5. **Run the setup script:**
+   ```bash
+   sudo ./scripts/setup-homelab.sh
+   ```
+
+   The script will:
+   - Update system packages
+   - Install Docker Engine + Compose V2 (if needed)
+   - Configure user groups (docker, sudo)
+   - Set up firewall (UFW)
+   - Enable SSH server
+   - **Generate Authelia secrets** (JWT, session, encryption key)
+   - **Prompt for admin username, password, and email**
+   - **Generate argon2id password hash** (30-60 seconds)
+   - Create `/opt/stacks/` directory structure
+   - Set up Docker networks (homelab, traefik, dockerproxy, media)
+   - Detect NVIDIA GPU and offer driver installation
+
+   **Important:** The script generates a secure admin password hash that will be used during deployment. Credentials are displayed at the end and saved to a temporary file.
 
 6. **Log out and back in** (or run `newgrp docker`)
-   >Don't skip this step!
+   > Don't skip this step! Required for Docker group permissions.
 
-7. **Generate Authelia Secrets**:
-   ```bash
-   # Generate three required secrets for Authelia (128 characters each)
-   echo "AUTHELIA_JWT_SECRET=$(openssl rand -hex 64)"
-   echo "AUTHELIA_SESSION_SECRET=$(openssl rand -hex 64)"
-   echo "AUTHELIA_STORAGE_ENCRYPTION_KEY=$(openssl rand -hex 64)"
-
-   # Copy these values and add them to your .env file
-   ```
-
-8. **Deploy homelab**:
+7. **Deploy homelab**:
    ```bash
    ./scripts/deploy-homelab.sh
    ```
@@ -84,17 +83,36 @@ For most users, the automated setup script handles everything:
 
 ## What the Setup Script Does
 
-The `setup-homelab.sh` script automatically:
-- ✅ Updates system packages
-- ✅ Installs Docker (if not present)
-- ✅ Configures user permissions
-- ✅ Sets up firewall (UFW)
-- ✅ Enables SSH server
-- ✅ Installs NVIDIA drivers (if GPU detected)
-- ✅ Creates directory structure
-- ✅ Sets up Docker networks
+The `setup-homelab.sh` script is a comprehensive first-run configuration tool:
 
-It safely skips steps that are already completed, so it's safe to run on partially configured systems.
+**System Preparation:**
+- ✅ Pre-flight checks (internet connectivity, disk space 50GB+)
+- ✅ Updates system packages
+- ✅ Installs required packages (git, curl, etc.)
+- ✅ Installs Docker Engine + Compose V2 (if not present)
+- ✅ Configures user permissions (docker, sudo groups)
+- ✅ Sets up firewall (UFW with SSH, HTTP, HTTPS)
+- ✅ Enables SSH server
+
+**Authelia Configuration (Interactive):**
+- ✅ Generates three cryptographic secrets (JWT, session, encryption)
+- ✅ Prompts for admin username (default: admin)
+- ✅ Prompts for secure password with confirmation
+- ✅ Prompts for admin email address
+- ✅ Generates argon2id password hash using Docker (30-60s process)
+- ✅ Validates Docker is available before password operations
+- ✅ Saves credentials securely for deployment script
+
+**Infrastructure Setup:**
+- ✅ Creates directory structure (`/opt/stacks/`)
+- ✅ Sets up Docker networks (homelab, traefik, dockerproxy, media)
+- ✅ Detects NVIDIA GPU and offers driver installation
+
+**Safety Features:**
+- Skips completed steps (safe to re-run)
+- Timeout handling (60s for Docker operations)
+- Comprehensive error messages with troubleshooting hints
+- Exit on critical failures with clear next steps
 
 ## Manual Setup (Alternative)
 
diff --git a/docs/quick-reference.md b/docs/quick-reference.md
index 5da57ca..8c39c3a 100644
--- a/docs/quick-reference.md
+++ b/docs/quick-reference.md
@@ -4,19 +4,77 @@
 
 Your homelab uses separate stacks for organization:
 
-- **`core.yml`** - Essential infrastructure (Traefik, Authelia, DuckDNS, Gluetun)
-- **`infrastructure.yml`** - Management tools (Dockge, Portainer, Pi-hole, etc.)
-- **`dashboards.yml`** - Dashboard services (Homepage, Homarr)
-- **`media.yml`** - Media services (Plex, Jellyfin, Sonarr, Radarr, etc.)
-- **`media-extended.yml`** - Additional media tools
-- **`homeassistant.yml`** - Home automation
-- **`productivity.yml`** - Productivity apps (Nextcloud, Gitea, etc.)
-- **`monitoring.yml`** - Monitoring stack (Grafana, Prometheus)
-- **`utilities.yml`** - Utility services (backups, code server)
-- **`development.yml`** - Development tools
+**Deployed by default (12 containers):**
+- **`core.yml`** - Essential infrastructure (Traefik, Authelia, DuckDNS, Gluetun) - 4 services
+- **`infrastructure.yml`** - Management tools (Dockge, Pi-hole, Dozzle, Glances, Docker Proxy) - 6 services
+  _Note: Watchtower temporarily disabled due to Docker API compatibility_
+- **`dashboards.yml`** - Dashboard services (Homepage, Homarr) - 2 services
 
-> The default stacks offer a wide selection of services
-And can be modified by the AI suit your preferences.
+**Available in Dockge (deploy as needed):**
+- **`media.yml`** - Media services (Plex, Jellyfin, Sonarr, Radarr, Prowlarr, qBittorrent)
+- **`media-extended.yml`** - Additional media tools (Readarr, Lidarr, Mylar, Calibre)
+- **`homeassistant.yml`** - Home automation (Home Assistant, Node-RED, Zigbee2MQTT, ESPHome)
+- **`productivity.yml`** - Productivity apps (Nextcloud, Gitea, Bookstack, Outline, Excalidraw)
+- **`monitoring.yml`** - Monitoring stack (Grafana, Prometheus, Uptime Kuma, Netdata)
+- **`utilities.yml`** - Utility services (Duplicati, Code Server, FreshRSS, Wallabag)
+- **`alternatives.yml`** - Alternative tools (Portainer, Authentik)
+
+> All stacks can be modified by the AI to suit your preferences.
+
+## Deployment Scripts
+
+### setup-homelab.sh
+
+First-run system preparation and Authelia configuration:
+
+```bash
+sudo ./scripts/setup-homelab.sh
+```
+
+**What it does:**
+- Pre-flight checks (internet, disk space 50GB+)
+- Installs Docker Engine + Compose V2
+- Configures user groups and firewall
+- Generates Authelia secrets automatically
+- Creates admin user with secure password hash
+- Sets up Docker networks and directory structure
+- Detects NVIDIA GPU and offers driver installation
+
+**Safe to re-run** - skips completed steps
+
+### deploy-homelab.sh
+
+Deploys all core services and prepares additional stacks:
+
+```bash
+./scripts/deploy-homelab.sh
+```
+
+**What it does:**
+- Validates Docker installation
+- Deploys core stack (4 containers)
+- Deploys infrastructure stack (6 containers)
+- Deploys dashboards with configured URLs (2 containers)
+- Copies 7 additional stacks to Dockge (not started)
+- Optional: Pre-pull images for additional stacks
+- Opens Dockge in browser
+
+### reset-test-environment.sh
+
+**Testing/development only** - safely removes all deployed services:
+
+```bash
+sudo ./scripts/reset-test-environment.sh
+```
+
+**What it does:**
+- Gracefully stops all Docker Compose stacks
+- Removes Docker volumes (after 3s wait)
+- Cleans stack directories and temp files
+- Removes Docker networks
+- Prunes unused Docker resources
+
+**Warning:** This is destructive! Only use for testing or complete reset.
 
 ## Common Commands
 
diff --git a/docs/troubleshooting/COMMON-ISSUES.md b/docs/troubleshooting/COMMON-ISSUES.md
new file mode 100644
index 0000000..caa0d43
--- /dev/null
+++ b/docs/troubleshooting/COMMON-ISSUES.md
@@ -0,0 +1,293 @@
+# Common Issues and Solutions
+
+## Installation Issues
+
+### Docker Group Permissions
+
+**Symptom:** `permission denied while trying to connect to the Docker daemon socket`
+
+**Solution:**
+```bash
+# After running setup script, you must log out and back in
+exit # or logout
+
+# Or without logging out:
+newgrp docker
+```
+
+### Password Hash Generation Timeout
+
+**Symptom:** Password hash generation takes longer than 60 seconds
+
+**Causes:**
+- High CPU usage from other processes
+- Slow system (argon2 is computationally intensive)
+
+**Solutions:**
+```bash
+# Check system resources
+top
+# or
+htop
+
+# Authelia handles the argon2 parameters itself, so there is nothing to tune - just wait
+# On very slow systems, it may take up to 2 minutes
+```
+
+### Port Conflicts
+
+**Symptom:** `bind: address already in use`
+
+**Solution:**
+```bash
+# Check what's using the port
+sudo lsof -i :80
+sudo lsof -i :443
+
+# Common culprits:
+# - Apache: sudo systemctl stop apache2
+# - Nginx: sudo systemctl stop nginx
+# - Another container: docker ps (find and stop it)
+```
+
+## Deployment Issues
+
+### Authelia Restart Loop
+
+**Symptom:** Authelia container keeps restarting
+
+**Common causes:**
+1. **Password hash corruption** - Fixed in current version
+2. **Encryption key mismatch** - Changed .env after initial deployment
+
+**Solution:**
+```bash
+# Check logs
+sudo docker logs authelia
+
+# If encryption key error, reset Authelia database:
+sudo ./scripts/reset-test-environment.sh
+# Then run setup and deploy again
+```
+
+### Watchtower Issues
+
+**Status:** Temporarily disabled due to Docker API compatibility
+
+**Issue:** Docker 29.x requires API v1.44, but Watchtower versions have compatibility issues
+
+**Current state:** Commented out in infrastructure.yml with documentation
+
+**Manual updates instead:**
+```bash
+# Update all images in a stack
+cd /opt/stacks/stack-name/
+docker compose pull
+docker compose up -d
+```
+
+### Homepage Not Showing Correct URLs
+
+**Symptom:** Homepage shows `{{HOMEPAGE_VAR_DOMAIN}}` instead of actual domain
+
+**Cause:** Old deployment script version
+
+**Solution:**
+```bash
+# Re-run deployment script (safe - won't affect running services)
+./scripts/deploy-homelab.sh
+
+# Or manually fix:
+cd /opt/stacks/dashboards/homepage
+sudo find . -name "*.yaml" -exec sed -i "s/{{HOMEPAGE_VAR_DOMAIN}}/yourdomain.duckdns.org/g" {} \;
+```
+
+### Services Not Accessible via HTTPS
+
+**Symptom:** Can't access services at https://service.yourdomain.duckdns.org
+
+**Solutions:**
+
+1. **Check Traefik is running:**
+```bash
+sudo docker ps | grep traefik
+sudo docker logs traefik
+```
+
+2. **Verify DuckDNS is updating:**
+```bash
+sudo docker logs duckdns
+# Should show "Your IP has been updated"
+```
+
+3. **Check ports are open:**
+```bash
+sudo ufw status
+# Should show 80/tcp and 443/tcp ALLOW
+```
+
+4. **Verify domain resolves:**
+```bash
+nslookup yourdomain.duckdns.org
+# Should return your public IP
+```
+
+## Service-Specific Issues
+
+### Gluetun VPN Not Connecting
+
+**Symptom:** Gluetun shows connection errors
+
+**Solutions:**
+```bash
+# Check credentials in .env
+grep SURFSHARK ~/AI-Homelab/.env
+
+# Check Gluetun logs
+sudo docker logs gluetun
+
+# Common fixes:
+# 1. Wrong server region
+# 2. Invalid credentials
+# 3. WireGuard not supported by provider
+```
+
+### Pi-hole DNS Not Working
+
+**Symptom:** Devices can't resolve DNS through Pi-hole
+
+**Solutions:**
+```bash
+# Check Pi-hole is running
+sudo docker ps | grep pihole
+
+# Verify port 53 is available
+sudo lsof -i :53
+
+# If systemd-resolved is conflicting:
+sudo systemctl disable systemd-resolved
+sudo systemctl stop systemd-resolved
+```
+
+### Dockge Shows Empty
+
+**Symptom:** No stacks visible in Dockge
+
+**Cause:** Stacks not copied to /opt/stacks/
+
+**Solution:**
+```bash
+# Check what exists
+ls -la /opt/stacks/
+
+# Re-run deployment to copy stacks
+./scripts/deploy-homelab.sh
+```
+
+## Performance Issues
+
+### Slow Container Start Times
+
+**Causes:**
+- First-time image pulls
+- Slow disk (not using SSD/NVMe)
+- Insufficient RAM
+
+**Solutions:**
+```bash
+# Pre-pull images
+cd /opt/stacks/stack-name/
+docker compose pull
+
+# Check disk performance
+sudo hdparm -Tt /dev/sda # Replace with your disk
+
+# Check RAM usage
+free -h
+
+# Move /opt/stacks to faster disk if needed
+```
+
+### High CPU Usage from Authelia
+
+**Normal:** Argon2 password hashing is intentionally CPU-intensive for security
+
+**If persistent:**
+```bash
+# Check what's causing load
+sudo docker stats
+
+# If Authelia constantly high:
+sudo docker logs authelia
+# Look for repeated authentication attempts (possible attack)
+```
+
+## Reset and Recovery
+
+### Complete Reset (Testing Only)
+
+**Warning:** This is destructive!
+
+```bash
+# Use the safe reset script
+sudo ./scripts/reset-test-environment.sh
+
+# Then re-run setup and deploy
+sudo ./scripts/setup-homelab.sh
+./scripts/deploy-homelab.sh
+```
+
+### Partial Reset (Single Stack)
+
+```bash
+# Stop and remove specific stack
+cd /opt/stacks/stack-name/
+docker compose down -v # -v removes volumes (data loss!)
+
+# Redeploy
+docker compose up -d
+```
+
+### Backup Before Reset
+
+```bash
+# Backup important data
+sudo tar czf ~/homelab-backup-$(date +%Y%m%d).tar.gz /opt/stacks/
+
+# Backup specific volumes (quote the bind-mount path so directories with spaces work)
+docker run --rm \
+  -v stack_volume:/data \
+  -v "$(pwd)":/backup \
+  busybox tar czf /backup/volume-backup.tar.gz /data
+```
+
+## Getting Help
+
+1. **Check container logs:**
+   ```bash
+   sudo docker logs container-name
+   sudo docker logs -f container-name # Follow logs
+   ```
+
+2. **Use Dozzle for real-time logs:**
+   Access at https://dozzle.yourdomain.duckdns.org
+
+3. **Check the AI assistant:**
+   Ask Copilot in VS Code for specific issues
+
+4. **Verify configuration:**
+   ```bash
+   # Check .env file
+   cat ~/AI-Homelab/.env
+
+   # Check compose file
+   cat /opt/stacks/stack-name/docker-compose.yml
+   ```
+
+5. **Docker system info:**
+   ```bash
+   docker info
+   docker version
+   docker system df # Disk usage
+   ```
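
The "Verify configuration" step above prints `~/AI-Homelab/.env`, which holds the DuckDNS token, the Surfshark credentials, and the Authelia secrets. If that output is going to be pasted into an issue or a chat, it may help to mask those values first. The following is a minimal sketch, not part of the repository's scripts: `redact_env` is a hypothetical helper name, and the rule it applies (mask any variable whose name contains `SECRET`, `TOKEN`, `PASSWORD`, or `KEY`) is an assumption based on the variable names listed in this patch.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with AI-Homelab): reads .env-style text on
# stdin and writes it to stdout with sensitive values masked.
redact_env() {
  # For every NAME=VALUE line whose NAME contains SECRET, TOKEN, PASSWORD,
  # or KEY, replace VALUE with the literal word REDACTED.
  # All other lines (comments, blank lines, non-sensitive variables) pass
  # through unchanged.
  sed -E 's/^([A-Za-z0-9_]*(SECRET|TOKEN|PASSWORD|KEY)[A-Za-z0-9_]*)=.*/\1=REDACTED/'
}

# Usage (path taken from the guide above):
#   redact_env < ~/AI-Homelab/.env
```

Non-sensitive entries such as `DOMAIN` and `ACME_EMAIL` are left readable on purpose, since they are usually what a helper needs to see. Note that `SURFSHARK_USERNAME` is not matched by this pattern, so extend the alternation if you treat it as private.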