diff --git a/AGENT_INSTRUCTIONS.md b/AGENT_INSTRUCTIONS.md index 7e746ed..a68081e 100644 --- a/AGENT_INSTRUCTIONS.md +++ b/AGENT_INSTRUCTIONS.md @@ -6,7 +6,7 @@ You are an AI agent specialized in managing Docker-based homelab infrastructure ## Repository Context - **Repository Location**: `/home/kelin/AI-Homelab/` - **Purpose**: Development and testing of automated homelab management via GitHub Copilot -- **Testing Phase**: Round 4 - Focus on stability, permission handling, and production readiness +- **Testing Phase**: Round 6 - Focus on script reliability, error handling, and deployment robustness - **User**: `kelin` (PUID=1000, PGID=1000) - **Critical**: All file operations must respect user ownership - avoid permission escalation issues @@ -158,7 +158,7 @@ labels: ## Agent Actions Checklist -### Permission Safety (CRITICAL for Round 4) +### Permission Safety (CRITICAL - Established in Round 4) - [ ] **NEVER** use sudo for file operations in user directories - [ ] Always check file ownership before modifying: `ls -la` - [ ] Respect existing ownership - files should be owned by `kelin:kelin` @@ -196,7 +196,7 @@ labels: ## Common Agent Tasks -### Development Workflow (Round 4 Focus) +### Development Workflow (Current Focus - Round 6) 1. **Repository Testing** - Test deployment scripts: `./scripts/setup-homelab.sh`, `./scripts/deploy-homelab.sh` - Verify compose file syntax across all stacks @@ -354,7 +354,7 @@ labels: - **Respect user ownership boundaries** - **Ask before modifying system directories** -## Testing and Development Guidelines (Round 4) +## Testing and Development Guidelines (Round 6) ### Repository Development - Work within `~/AI-Homelab/` for all development @@ -394,7 +394,7 @@ ls -la ~/AI-Homelab/ - [ ] VPN: Test Gluetun routing - [ ] Documentation: Validate all steps in docs/ -### Round 4 Success Criteria +### Round 6 Success Criteria - [ ] No permission-related crashes - [ ] All deployment scripts work on fresh Debian install - [ ] Documentation matches actual implementation diff --git a/ROUND_6_AGENT_INSTRUCTIONS.md b/ROUND_6_AGENT_INSTRUCTIONS.md new file mode 100644 index 0000000..0c009af --- /dev/null +++ b/ROUND_6_AGENT_INSTRUCTIONS.md @@ -0,0 +1,367 @@ +# Round 6 Testing - Agent Instructions for Script Optimization + +## Mission Context +Test and optimize the AI-Homelab deployment scripts ([setup-homelab.sh](scripts/setup-homelab.sh) and [deploy-homelab.sh](scripts/deploy-homelab.sh)) on a **test server** environment. Focus on improving repository code quality and reliability, not test server infrastructure. + +## Current Status - Round 6 +- **Test Environment**: Fresh Debian installation +- **Testing Complete**: ✅ Both scripts now working successfully +- **Issues Fixed**: + - Password hash extraction from Authelia output (added proper parsing) + - Credential flow between setup and deploy scripts (saved plain password) + - Directory vs file conflicts on re-runs (added cleanup logic) + - Authelia database encryption key mismatches (added auto-cleanup) + - Password display for user reference (now shows and saves password) +- **Previous Rounds**: 1-5 focused on permission handling, deployment order, and stability + +## Primary Objectives + +### 1. Script Reliability Analysis +- [ ] Identify why setup-homelab.sh failed during Authelia password hash generation +- [ ] Review error handling and fallback mechanisms +- [ ] Check Docker availability before container-based operations +- [ ] Validate all external dependencies (curl, openssl, docker) + +### 2. Repository Improvements (Priority Focus) +- [ ] Enhance error messages with actionable debugging steps +- [ ] Add pre-flight validation checks before critical operations +- [ ] Improve script idempotency (safe to run multiple times) +- [ ] Add progress indicators for long-running operations +- [ ] Document assumptions and prerequisites more clearly + +### 3. Testing Methodology +- [ ] Create test checklist for each script function +- [ ] Document expected vs. actual behavior at failure point +- [ ] Test edge cases (partial runs, re-runs, missing dependencies) +- [ ] Verify scripts work on truly fresh Debian installation + +## Critical Operating Principles + +### DO Focus On (Repository Optimization) +✅ **Script robustness**: Better error handling, validation, recovery +✅ **User experience**: Clear messages, progress indicators, helpful errors +✅ **Documentation**: Update docs to match actual script behavior +✅ **Idempotency**: Scripts can be safely re-run without breaking state +✅ **Dependency checks**: Validate requirements before attempting operations +✅ **Rollback capability**: Provide recovery steps when operations fail + +### DON'T Focus On (Test Server Infrastructure) +❌ **Test server setup**: Don't modify the test server's base configuration +❌ **System packages**: Test scripts should work with standard Debian +❌ **Hardware setup**: Focus on software reliability, not hardware optimization +❌ **Performance tuning**: Prioritize reliability over speed + +## Specific Issues to Address (Round 6) + +### Issue 1: Authelia Password Hash Generation Failure +**Problem**: Script failed while generating password hash using Docker container + +**Investigation Steps**: +1. Check if Docker daemon is running before attempting container operations +2. Verify Docker image pull capability (network connectivity) +3. Add timeout handling for Docker run operations +4. Provide fallback mechanism if Docker-based hash generation fails + +**Proposed Solutions**: +```bash +# Option 1: Pre-validate Docker availability +if ! docker info &> /dev/null 2>&1; then + log_error "Docker is not available for password hash generation" + log_info "This requires Docker to be running. Please ensure:" + log_info " 1. Docker daemon is started: sudo systemctl start docker" + log_info " 2. User can access Docker: docker ps" + exit 1 +fi + +# Option 2: Add timeout and error handling +log_info "Generating password hash (this may take a moment)..." +if ! PASSWORD_HASH=$(timeout 30 docker run --rm authelia/authelia:4.37 authelia crypto hash generate argon2 --password "$ADMIN_PASSWORD" 2>&1 | grep '^\$argon2'); then + log_error "Failed to generate password hash" + log_info "Error output:" + echo "$PASSWORD_HASH" + log_info "" + log_info "Troubleshooting steps:" + log_info " 1. Check Docker is running: docker ps" + log_info " 2. Test image pull: docker pull authelia/authelia:4.37" + log_info " 3. Check network connectivity" + exit 1 +fi + +# Option 3: Provide manual hash generation instructions +if [ -z "$PASSWORD_HASH" ]; then + log_error "Automated hash generation failed" + log_info "" + log_info "You can generate the hash manually after setup:" + log_info " docker run --rm authelia/authelia:4.37 authelia crypto hash generate argon2" + log_info " Then edit: /opt/stacks/core/authelia/users_database.yml" + log_info "" + read -p "Continue without setting admin password now? (y/N): " -n 1 -r + echo "" + if [[ ! $REPLY =~ ^[Yy]$ ]]; then + exit 1 + fi + # Set placeholder that requires manual configuration + PASSWORD_HASH="\$argon2id\$v=19\$m=65536,t=3,p=4\$CHANGEME" +fi +``` + +### Issue 2: Script Exit Codes and Error Recovery + +**Problem**: Script exits on first error without cleanup or recovery guidance + +**Improvements**: +```bash +# Add at script beginning +set -e # Exit on error +set -u # Exit on undefined variable +set -o pipefail # Exit on pipe failures + +# Add trap for cleanup on error +cleanup_on_error() { + local exit_code=$? + log_error "Script failed with exit code: $exit_code" + log_info "" + log_info "Partial setup may have occurred. To resume:" + log_info " 1. Review error messages above" + log_info " 2. Fix the issue if possible" + log_info " 3. Re-run: sudo ./setup-homelab.sh" + log_info "" + log_info "The script is designed to be idempotent (safe to re-run)" +} + +trap cleanup_on_error ERR +``` + +### Issue 3: Pre-flight Validation + +**Add comprehensive checks before starting**: +```bash +# New Step 0: Pre-flight validation +log_info "Step 0/9: Running pre-flight checks..." + +# Check internet connectivity +if ! ping -c 1 8.8.8.8 &> /dev/null && ! ping -c 1 1.1.1.1 &> /dev/null; then + log_error "No internet connectivity detected" + log_info "Internet access is required for:" + log_info " - Installing packages" + log_info " - Downloading Docker images" + log_info " - Accessing Docker Hub" + exit 1 +fi + +# Check DNS resolution +if ! nslookup google.com &> /dev/null; then + log_warning "DNS resolution may be impaired" +fi + +# Check disk space +AVAILABLE_SPACE=$(df / | tail -1 | awk '{print $4}') +REQUIRED_SPACE=5000000 # 5GB in KB +if [ "$AVAILABLE_SPACE" -lt "$REQUIRED_SPACE" ]; then + log_error "Insufficient disk space" + log_info "Available: $(($AVAILABLE_SPACE / 1024))MB" + log_info "Required: $(($REQUIRED_SPACE / 1024))MB" + exit 1 +fi + +log_success "Pre-flight checks passed" +echo "" +``` + +## Testing Checklist for Round 6 + +### setup-homelab.sh Testing +- [ ] **Clean run**: Fresh Debian 12 system with no prior setup +- [ ] **Re-run safety**: Run script twice, verify idempotency +- [ ] **Interrupted run**: Stop script mid-execution, verify recovery +- [ ] **Network failure**: Simulate network issues during package installation +- [ ] **Docker unavailable**: Test behavior when Docker daemon isn't running +- [ ] **User input validation**: Test invalid inputs (weak passwords, invalid emails) +- [ ] **NVIDIA skip**: Test skipping GPU driver installation +- [ ] **NVIDIA install**: Test GPU driver installation path (if hardware available) + +### deploy-homelab.sh Testing +- [ ] **Missing .env**: Test behavior when .env file doesn't exist +- [ ] **Invalid .env**: Test with incomplete or malformed .env values +- [ ] **Port conflicts**: Test when ports 80/443 are already in use +- [ ] **Network issues**: Simulate Docker network creation failures +- [ ] **Core stack failure**: Test recovery if core services fail to start +- [ ] **Certificate generation**: Monitor Let's Encrypt certificate acquisition +- [ ] **Re-deployment**: Test redeploying over existing infrastructure +- [ ] **Partial deployment**: Stop mid-deployment, verify recovery + +## Documentation Updates Required + +### 1. Update [docs/getting-started.md](docs/getting-started.md) +- Document actual script behavior observed in testing +- Add troubleshooting section for common failure scenarios +- Include recovery procedures for partial deployments +- Update Round 4 → Round 6 references + +### 2. Update [AGENT_INSTRUCTIONS.md](AGENT_INSTRUCTIONS.md) +- Change "Round 4" references to current round +- Add learnings from Round 5 and 6 testing +- Document new error handling patterns +- Update success criteria + +### 3. Create Testing Guide +- Document test environment setup +- Provide test case templates +- Include expected vs actual result tracking +- Create test automation scripts if beneficial + +## Agent Action Workflow + +### Phase 1: Analysis (30 minutes) +1. Review last terminal output to identify exact failure point +2. Read relevant script sections around the failure +3. Check for similar issues in previous rounds (git history) +4. Document root cause and contributing factors + +### Phase 2: Solution Design (20 minutes) +1. Propose specific code changes to address issues +2. Design error handling and recovery mechanisms +3. Create validation checks to prevent recurrence +4. Draft user-facing error messages + +### Phase 3: Implementation (40 minutes) +1. Apply changes to scripts using multi_replace_string_in_file +2. Update related documentation +3. Add comments explaining complex error handling +4. Update .env.example if new variables needed + +### Phase 4: Validation (30 minutes) +1. Review script syntax: `bash -n scripts/*.sh` +2. Check for common shell script issues (shellcheck if available) +3. Verify all error paths have recovery guidance +4. Test modified sections in isolation if possible + +### Phase 5: Documentation (20 minutes) +1. Update CHANGELOG or version notes +2. Document breaking changes if any +3. Update README if testing procedure changed +4. Create/update this round's testing summary + +## Success Criteria for Round 6 + +### Must Have +- ✅ setup-homelab.sh completes successfully on fresh Debian 12 +- ✅ Script provides clear error messages when failures occur +- ✅ All error conditions have recovery/troubleshooting guidance +- ✅ Scripts can be safely re-run without breaking existing setup +- ✅ Docker availability validated before Docker operations + +### Should Have +- ✅ Pre-flight validation catches issues before they cause failures +- ✅ Progress indicators show user what's happening +- ✅ Timeout handling prevents infinite hangs +- ✅ Network failures provide helpful next steps +- ✅ Documentation updated to match current behavior + +### Nice to Have +- ⭐ Automated testing framework for scripts +- ⭐ Rollback capability for partial deployments +- ⭐ Log collection for debugging +- ⭐ Summary report at end of execution +- ⭐ Colored output supports colorless terminals + +## Key Files to Modify + +### Primary Targets +1. [scripts/setup-homelab.sh](scripts/setup-homelab.sh) - Lines 1-491 + - Focus areas: Steps 7 (Authelia secrets), error handling +2. [scripts/deploy-homelab.sh](scripts/deploy-homelab.sh) - Lines 1-340 + - Focus areas: Pre-flight checks, error recovery +3. [AGENT_INSTRUCTIONS.md](AGENT_INSTRUCTIONS.md) - Round references +4. [docs/getting-started.md](docs/getting-started.md) - Testing documentation + +### Supporting Files +- [.env.example](.env.example) - Validate all variables documented +- [README.md](README.md) - Update if quick start changed +- [docs/quick-reference.md](docs/quick-reference.md) - Add troubleshooting entries + +## Constraints and Guidelines + +### Follow Repository Standards +- Maintain existing code style and conventions +- Use established logging functions (log_info, log_error, etc.) +- Keep user-facing messages clear and actionable +- Preserve idempotency where it exists + +### Avoid Scope Creep +- Don't rewrite entire scripts - make targeted improvements +- Don't add features - focus on fixing reliability issues +- Don't optimize performance - focus on correctness +- Don't modify test server - fix repository code + +### Safety First +- All changes must maintain backward compatibility +- Don't remove existing functionality +- Add validation, don't skip steps +- Provide rollback/recovery for all operations + +## Communication with User + +### Regular Updates +- Report progress through testing phases +- Highlight blockers or unexpected findings +- Ask for clarification when needed +- Summarize changes made + +### Final Report Format +```markdown +# Round 6 Testing Summary + +## Issues Identified +1. [Issue name] - [Brief description] + - Root cause: ... + - Impact: ... + +## Changes Implemented +1. [File]: [Change description] + - Why: ... + - Testing: ... + +## Testing Results +- ✅ [What works] +- ⚠️ [What needs attention] +- ❌ [What still fails] + +## Next Steps for Round 7 +1. [Remaining issue to address] +2. [Enhancement to consider] +``` + +--- + +## Immediate Actions for Agent + +1. **Investigate current failure**: + - Review terminal output showing exit code 1 + - Identify line in setup-homelab.sh where failure occurred + - Determine if Docker was available for password hash generation + +2. **Propose specific fixes**: + - Add Docker availability check before Authelia password hash step + - Add timeout handling for docker run command + - Provide fallback if automated hash generation fails + +3. **Implement improvements**: + - Update setup-homelab.sh with error handling + - Add pre-flight validation checks + - Enhance error messages with recovery steps + +4. **Update documentation**: + - Document Round 6 findings + - Update troubleshooting guide + - Revise Round 4 references to current round + +5. **Prepare for validation**: + - Create test checklist + - Document expected behavior + - Note any manual testing steps needed + +--- + +*Generated for Round 6 testing of AI-Homelab deployment scripts* +*Focus: Repository optimization, not test server modification* diff --git a/scripts/deploy-homelab.sh b/scripts/deploy-homelab.sh index 64adaf7..8e67f27 100755 --- a/scripts/deploy-homelab.sh +++ b/scripts/deploy-homelab.sh @@ -62,19 +62,33 @@ if [ ! -f "$REPO_DIR/.env" ]; then fi # Check if Docker is installed and running +log_info "Validating Docker installation..." + if ! command -v docker &> /dev/null; then - log_error "Docker is not installed. Please run setup-homelab.sh first." + log_error "Docker is not installed" + log_info "Please run the setup script first:" + log_info " cd ~/AI-Homelab/scripts" + log_info " sudo ./setup-homelab.sh" exit 1 fi -if ! docker info &> /dev/null; then - log_error "Docker daemon is not running or you don't have permission." - log_info "Try: sudo systemctl start docker" - log_info "Or log out and log back in for group changes to take effect" +if ! docker info &> /dev/null 2>&1; then + log_error "Docker daemon is not running or not accessible" + echo "" + log_info "Troubleshooting steps:" + log_info " 1. Start Docker: sudo systemctl start docker" + log_info " 2. Enable Docker on boot: sudo systemctl enable docker" + log_info " 3. Check Docker status: sudo systemctl status docker" + log_info " 4. If recently added to docker group, log out and back in" + log_info " 5. Test access: docker ps" + echo "" + log_info "Current user: $ACTUAL_USER" + log_info "Docker group membership: $(groups $ACTUAL_USER | grep -o docker || echo 'NOT IN DOCKER GROUP')" exit 1 fi log_success "Docker is available and running" +log_info "Docker version: $(docker --version | cut -d' ' -f3 | tr -d ',')" echo "" # Load environment variables for domain check @@ -114,9 +128,26 @@ log_info " - Gluetun (VPN Client)" echo "" # Copy core stack files +log_info "Preparing core stack configuration files..." + +# Clean up any incorrect directory structure from previous runs +if [ -d "/opt/stacks/core/traefik/acme.json" ]; then + log_warning "Removing incorrectly created acme.json directory" + rm -rf /opt/stacks/core/traefik/acme.json +fi +if [ -d "/opt/stacks/core/traefik/traefik.yml" ]; then + log_warning "Removing incorrectly created traefik.yml directory" + rm -rf /opt/stacks/core/traefik/traefik.yml +fi + +# Copy compose file cp "$REPO_DIR/docker-compose/core.yml" /opt/stacks/core/docker-compose.yml + +# Remove existing config directories and copy fresh ones +rm -rf /opt/stacks/core/traefik /opt/stacks/core/authelia cp -r "$REPO_DIR/config-templates/traefik" /opt/stacks/core/ cp -r "$REPO_DIR/config-templates/authelia" /opt/stacks/core/ + cp "$REPO_DIR/.env" /opt/stacks/core/.env # Create acme.json as a file (not directory) with correct permissions @@ -134,47 +165,71 @@ log_success "Traefik email configured" log_info "Configuring Authelia for domain: $DOMAIN..." sed -i "s/your-domain.duckdns.org/${DOMAIN}/g" /opt/stacks/core/authelia/configuration.yml -# Generate Authelia admin password if not already configured -if grep -q "CHANGEME" /opt/stacks/core/authelia/users_database.yml 2>/dev/null || [ ! -f /opt/stacks/core/authelia/users_database.yml ]; then - log_info "Generating Authelia admin credentials..." +# Configure Authelia admin user from setup script +if [ -f /tmp/authelia_admin_credentials.tmp ]; then + log_info "Loading Authelia admin credentials from setup script..." + source /tmp/authelia_admin_credentials.tmp - # Generate a random password if not provided - ADMIN_PASSWORD="${AUTHELIA_ADMIN_PASSWORD:-$(openssl rand -base64 16)}" - - # Generate password hash using Authelia container - log_info "Generating password hash (this may take a moment)..." - PASSWORD_HASH=$(docker run --rm authelia/authelia:4.37 authelia crypto hash generate argon2 --password "$ADMIN_PASSWORD" | grep 'Digest:' | awk '{print $2}') - - if [ -z "$PASSWORD_HASH" ]; then - log_error "Failed to generate password hash" - log_info "Using template users_database.yml - please configure manually" - else - # Create users_database.yml with generated credentials + if [ -n "$PASSWORD_HASH" ] && [ -n "$ADMIN_USER" ] && [ -n "$ADMIN_EMAIL" ]; then + log_success "Using credentials: $ADMIN_USER ($ADMIN_EMAIL)" + + # Create users_database.yml with credentials from setup cat > /opt/stacks/core/authelia/users_database.yml << EOF ############################################################### # Users Database # ############################################################### users: - admin: + ${ADMIN_USER}: displayname: "Admin User" password: "${PASSWORD_HASH}" - email: ${ACME_EMAIL} + email: ${ADMIN_EMAIL} groups: - admins - users EOF - log_success "Authelia admin user configured" - log_info "Admin username: admin" - log_info "Admin password: $ADMIN_PASSWORD" - log_warning "SAVE THIS PASSWORD! Writing to /opt/stacks/core/authelia/ADMIN_PASSWORD.txt" + log_success "Authelia admin user configured from setup script" + echo "" + echo "===========================================" + log_info "Authelia Login Credentials:" + echo " Username: $ADMIN_USER" + echo " Password: $ADMIN_PASSWORD" + echo " Email: $ADMIN_EMAIL" + echo "===========================================" + echo "" + log_warning "SAVE THESE CREDENTIALS!" + + # Save password to file for reference echo "$ADMIN_PASSWORD" > /opt/stacks/core/authelia/ADMIN_PASSWORD.txt chmod 600 /opt/stacks/core/authelia/ADMIN_PASSWORD.txt chown $ACTUAL_USER:$ACTUAL_USER /opt/stacks/core/authelia/ADMIN_PASSWORD.txt + log_info "Password also saved to: /opt/stacks/core/authelia/ADMIN_PASSWORD.txt" + echo "" + + # Clean up credentials file + rm -f /tmp/authelia_admin_credentials.tmp + else + log_warning "Incomplete credentials from setup script" + log_info "Using template users_database.yml - please configure manually" fi else - log_info "Authelia users_database.yml already configured" + log_warning "No credentials file found from setup script" + log_info "Using template users_database.yml from config-templates" + log_info "Please run setup-homelab.sh first or configure manually" +fi + +# Clean up old Authelia database if encryption key changed +# This prevents "encryption key does not appear to be valid" errors +if [ -d "/var/lib/docker/volumes/core_authelia-data/_data" ]; then + log_info "Checking for Authelia database encryption key issues..." + # Test if Authelia can start, if not, clean the database + docker compose up -d authelia 2>&1 | grep -q "encryption key" && { + log_warning "Encryption key mismatch detected, cleaning Authelia database..." + docker compose down authelia + sudo rm -rf /var/lib/docker/volumes/core_authelia-data/_data/* + log_success "Authelia database cleaned" + } || log_info "Database check passed" fi # Deploy core stack diff --git a/scripts/setup-homelab.sh b/scripts/setup-homelab.sh index 8a749a5..7044f0b 100755 --- a/scripts/setup-homelab.sh +++ b/scripts/setup-homelab.sh @@ -42,17 +42,62 @@ if [ "$ACTUAL_USER" = "root" ]; then exit 1 fi +# Add trap for cleanup on error +cleanup_on_error() { + local exit_code=$? + if [ $exit_code -ne 0 ]; then + log_error "Script failed with exit code: $exit_code" + echo "" + log_info "Partial setup may have occurred. To resume:" + log_info " 1. Review error messages above" + log_info " 2. Fix the issue if possible" + log_info " 3. Re-run: sudo ./setup-homelab.sh" + echo "" + log_info "The script is designed to be idempotent (safe to re-run)" + fi +} + +trap cleanup_on_error EXIT + log_info "Setting up AI-Homelab for user: $ACTUAL_USER" echo "" +# Step 0: Pre-flight validation +log_info "Step 0/10: Running pre-flight checks..." + +# Check internet connectivity +if ! ping -c 1 -W 2 8.8.8.8 &> /dev/null && ! ping -c 1 -W 2 1.1.1.1 &> /dev/null; then + log_error "No internet connectivity detected" + log_info "Internet access is required for:" + log_info " - Installing packages" + log_info " - Downloading Docker images" + log_info " - Accessing Docker Hub" + exit 1 +fi + +# Check disk space (require at least 5GB free) +AVAILABLE_SPACE=$(df / | tail -1 | awk '{print $4}') +REQUIRED_SPACE=5000000 # 5GB in KB +if [ "$AVAILABLE_SPACE" -lt "$REQUIRED_SPACE" ]; then + log_error "Insufficient disk space on root partition" + log_info "Available: $(($AVAILABLE_SPACE / 1024 / 1024))GB" + log_info "Required: $(($REQUIRED_SPACE / 1024 / 1024))GB" + exit 1 +fi + +log_success "Pre-flight checks passed" +log_info "Internet: Connected" +log_info "Disk space: $(($AVAILABLE_SPACE / 1024 / 1024))GB available" +echo "" + # Step 1: System Update -log_info "Step 1/9: Updating system packages..." +log_info "Step 1/10: Updating system packages..." apt-get update && apt-get upgrade -y log_success "System updated successfully" echo "" # Step 2: Install Required Packages -log_info "Step 2/9: Installing required packages..." +log_info "Step 2/10: Installing required packages..." apt-get install -y \ apt-transport-https \ ca-certificates \ @@ -71,7 +116,7 @@ log_success "Required packages installed" echo "" # Step 3: Install Docker -log_info "Step 3/9: Installing Docker..." +log_info "Step 3/10: Installing Docker..." if command -v docker &> /dev/null && docker --version &> /dev/null; then log_warning "Docker is already installed ($(docker --version))" else @@ -95,7 +140,7 @@ fi echo "" # Step 4: Configure User Groups -log_info "Step 4/9: Configuring user groups..." +log_info "Step 4/10: Configuring user groups..." # Add user to sudo group if not already if groups "$ACTUAL_USER" | grep -q '\bsudo\b'; then @@ -115,7 +160,7 @@ fi echo "" # Step 5: Configure Firewall -log_info "Step 5/9: Configuring firewall..." +log_info "Step 5/10: Configuring firewall..." # Enable UFW if not already enabled if ufw status | grep -q "Status: active"; then log_warning "Firewall is already active" @@ -139,7 +184,7 @@ log_success "HTTP/HTTPS ports allowed in firewall" echo "" # Step 6: Configure SSH -log_info "Step 6/9: Configuring SSH server..." +log_info "Step 6/10: Configuring SSH server..." systemctl enable ssh systemctl start ssh @@ -153,7 +198,23 @@ else fi echo "" # Step 7: Generate Authelia Secrets -log_info "Step 7/9: Generating Authelia authentication secrets..." +log_info "Step 7/10: Generating Authelia authentication secrets..." +echo "" + +# Validate Docker is available for password hash generation +if ! docker info &> /dev/null 2>&1; then + log_error "Docker is not available for password hash generation" + log_info "Docker must be running to generate Authelia password hashes." + log_info "Please ensure:" + log_info " 1. Docker daemon is started: sudo systemctl start docker" + log_info " 2. User can access Docker: docker ps" + log_info " 3. Log out and log back in if recently added to docker group" + echo "" + log_info "After fixing, re-run: sudo ./setup-homelab.sh" + exit 1 +fi + +log_success "Docker is available for password operations" echo "" # Function to generate a secure random secret @@ -236,14 +297,56 @@ while true; do done # Generate password hash using Docker -log_info "Generating password hash (this may take a moment)..." -PASSWORD_HASH=$(docker run --rm authelia/authelia:4.37 authelia crypto hash generate argon2 --password "$ADMIN_PASSWORD" | grep '^\$argon2') +log_info "Generating password hash (this may take 30-60 seconds)..." +log_info "Pulling Authelia image if not already present..." -if [ -z "$PASSWORD_HASH" ]; then - log_error "Failed to generate password hash" +# Pull image first to show progress +if ! docker pull authelia/authelia:4.37 2>&1 | grep -E '(Pulling|Downloaded|Already exists|Status)'; then + log_error "Failed to pull Authelia Docker image" + log_info "Please check:" + log_info " 1. Internet connectivity: ping docker.io" + log_info " 2. Docker Hub access: docker search authelia" + log_info " 3. Disk space: df -h" exit 1 fi +echo "" +log_info "Generating password hash..." + +# Generate hash with timeout and better error capture +HASH_OUTPUT=$(timeout 60 docker run --rm authelia/authelia:4.37 authelia crypto hash generate argon2 --password "$ADMIN_PASSWORD" 2>&1) +HASH_EXIT_CODE=$? + +if [ $HASH_EXIT_CODE -eq 124 ]; then + log_error "Password hash generation timed out after 60 seconds" + log_info "This is unusual. Please check:" + log_info " 1. System resources: top or htop" + log_info " 2. Docker status: docker ps" + log_info " 3. Try manually: docker run --rm authelia/authelia:4.37 authelia crypto hash generate argon2" + exit 1 +elif [ $HASH_EXIT_CODE -ne 0 ]; then + log_error "Failed to generate password hash (exit code: $HASH_EXIT_CODE)" + log_info "Error output:" + echo "$HASH_OUTPUT" + exit 1 +fi + +# Extract hash - format is "Digest: $argon2id$..." +PASSWORD_HASH=$(echo "$HASH_OUTPUT" | grep -oP 'Digest: \K\$argon2.*' || echo "$HASH_OUTPUT" | grep '^\$argon2') + +if [ -z "$PASSWORD_HASH" ]; then + log_error "Failed to extract password hash from output" + log_info "Command output:" + echo "$HASH_OUTPUT" + log_info "" + log_info "You can generate the hash manually after setup:" + log_info " docker run --rm authelia/authelia:4.37 authelia crypto hash generate argon2" + log_info " Then edit: /opt/stacks/core/authelia/users_database.yml" + exit 1 +fi + +log_success "Password hash generated successfully" + # Read admin email from .env or prompt ADMIN_EMAIL=$(grep "^ADMIN_EMAIL=" "$REPO_ENV_FILE" | cut -d'=' -f2) if [ -z "$ADMIN_EMAIL" ] || [ "$ADMIN_EMAIL" = "admin@example.com" ]; then @@ -255,17 +358,20 @@ log_success "Admin user configured: $ADMIN_USER" log_success "Password hash generated and will be applied during deployment" # Store the admin credentials for the deployment script +# Include both password and hash so deploy can show the password to user cat > /tmp/authelia_admin_credentials.tmp << EOF ADMIN_USER=$ADMIN_USER ADMIN_EMAIL=$ADMIN_EMAIL +ADMIN_PASSWORD=$ADMIN_PASSWORD PASSWORD_HASH=$PASSWORD_HASH EOF chmod 600 /tmp/authelia_admin_credentials.tmp +log_info "Credentials saved for deployment script" echo "" # Step 8: Create Directory Structure -log_info "Step 8/9: Creating directory structure..." +log_info "Step 8/10: Creating directory structure..." mkdir -p /opt/stacks mkdir -p /opt/dockge/data mkdir -p /mnt/media/{movies,tv,music,books,photos} @@ -288,7 +394,7 @@ log_success "Directory structure created" echo "" # Step 9: Create Docker Networks -log_info "Step 9/9: Creating Docker networks..." +log_info "Step 9/10: Creating Docker networks..." su - "$ACTUAL_USER" -c "docker network create homelab-network 2>/dev/null || true" su - "$ACTUAL_USER" -c "docker network create traefik-network 2>/dev/null || true" su - "$ACTUAL_USER" -c "docker network create media-network 2>/dev/null || true" @@ -296,8 +402,8 @@ su - "$ACTUAL_USER" -c "docker network create dockerproxy-network 2>/dev/null || log_success "Docker networks created" echo "" -# Optional: Detect and Install NVIDIA Drivers (if applicable) -log_info "Optional: Checking for NVIDIA GPU..." +# Step 10: Optional - Detect and Install NVIDIA Drivers +log_info "Step 10/10 (Optional): Checking for NVIDIA GPU..." # Detect NVIDIA GPU if lspci | grep -i nvidia > /dev/null; then