Round 6: Fix deployment script reliability and credential handling

- Add pre-flight validation checks (internet, disk space, Docker availability)
- Fix Authelia password hash extraction (handle 'Digest:' prefix format)
- Improve credential flow between setup and deploy scripts
- Save plain password for user reference in ADMIN_PASSWORD.txt
- Add cleanup for directory/file conflicts on re-runs
- Add automatic Authelia database cleanup for encryption key mismatches
- Add error recovery guidance with cleanup trap
- Display credentials prominently after deployment
- Update step numbering (now 10 steps with pre-flight)
- Update documentation to Round 6

Tested on fresh Debian 12 installation - both scripts now complete successfully.
2026-01-13 19:57:45 -05:00
parent ac0e39d091
commit 12df3a1ae2
4 changed files with 575 additions and 47 deletions


@@ -6,7 +6,7 @@ You are an AI agent specialized in managing Docker-based homelab infrastructure
 ## Repository Context
 - **Repository Location**: `/home/kelin/AI-Homelab/`
 - **Purpose**: Development and testing of automated homelab management via GitHub Copilot
-- **Testing Phase**: Round 4 - Focus on stability, permission handling, and production readiness
+- **Testing Phase**: Round 6 - Focus on script reliability, error handling, and deployment robustness
 - **User**: `kelin` (PUID=1000, PGID=1000)
 - **Critical**: All file operations must respect user ownership - avoid permission escalation issues
@@ -158,7 +158,7 @@ labels:
 ## Agent Actions Checklist
-### Permission Safety (CRITICAL for Round 4)
+### Permission Safety (CRITICAL - Established in Round 4)
 - [ ] **NEVER** use sudo for file operations in user directories
 - [ ] Always check file ownership before modifying: `ls -la`
 - [ ] Respect existing ownership - files should be owned by `kelin:kelin`
@@ -196,7 +196,7 @@ labels:
 ## Common Agent Tasks
-### Development Workflow (Round 4 Focus)
+### Development Workflow (Current Focus - Round 6)
 1. **Repository Testing**
 - Test deployment scripts: `./scripts/setup-homelab.sh`, `./scripts/deploy-homelab.sh`
 - Verify compose file syntax across all stacks
@@ -354,7 +354,7 @@ labels:
 - **Respect user ownership boundaries**
 - **Ask before modifying system directories**
-## Testing and Development Guidelines (Round 4)
+## Testing and Development Guidelines (Round 6)
 ### Repository Development
 - Work within `~/AI-Homelab/` for all development
@@ -394,7 +394,7 @@ ls -la ~/AI-Homelab/
 - [ ] VPN: Test Gluetun routing
 - [ ] Documentation: Validate all steps in docs/
-### Round 4 Success Criteria
+### Round 6 Success Criteria
 - [ ] No permission-related crashes
 - [ ] All deployment scripts work on fresh Debian install
 - [ ] Documentation matches actual implementation


@@ -0,0 +1,367 @@
# Round 6 Testing - Agent Instructions for Script Optimization
## Mission Context
Test and optimize the AI-Homelab deployment scripts ([setup-homelab.sh](scripts/setup-homelab.sh) and [deploy-homelab.sh](scripts/deploy-homelab.sh)) on a **test server** environment. Focus on improving repository code quality and reliability, not test server infrastructure.
## Current Status - Round 6
- **Test Environment**: Fresh Debian installation
- **Testing Complete**: ✅ Both scripts now working successfully
- **Issues Fixed**:
- Password hash extraction from Authelia output (added proper parsing)
- Credential flow between setup and deploy scripts (saved plain password)
- Directory vs file conflicts on re-runs (added cleanup logic)
- Authelia database encryption key mismatches (added auto-cleanup)
- Password display for user reference (now shows and saves password)
- **Previous Rounds**: 1-5 focused on permission handling, deployment order, and stability
## Primary Objectives
### 1. Script Reliability Analysis
- [ ] Identify why setup-homelab.sh failed during Authelia password hash generation
- [ ] Review error handling and fallback mechanisms
- [ ] Check Docker availability before container-based operations
- [ ] Validate all external dependencies (curl, openssl, docker)
### 2. Repository Improvements (Priority Focus)
- [ ] Enhance error messages with actionable debugging steps
- [ ] Add pre-flight validation checks before critical operations
- [ ] Improve script idempotency (safe to run multiple times)
- [ ] Add progress indicators for long-running operations
- [ ] Document assumptions and prerequisites more clearly
### 3. Testing Methodology
- [ ] Create test checklist for each script function
- [ ] Document expected vs. actual behavior at failure point
- [ ] Test edge cases (partial runs, re-runs, missing dependencies)
- [ ] Verify scripts work on truly fresh Debian installation
## Critical Operating Principles
### DO Focus On (Repository Optimization)
- **Script robustness**: Better error handling, validation, recovery
- **User experience**: Clear messages, progress indicators, helpful errors
- **Documentation**: Update docs to match actual script behavior
- **Idempotency**: Scripts can be safely re-run without breaking state
- **Dependency checks**: Validate requirements before attempting operations
- **Rollback capability**: Provide recovery steps when operations fail
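The idempotency principle above could be sketched as a check-then-create helper. This is only an illustration: `ensure_network` is a hypothetical name that does not exist in the scripts, and `homelab-network` is reused from the setup script as an example argument.

```bash
# Hypothetical helper (not part of the repository scripts):
# create a Docker network only if it does not already exist,
# so the step is safe to repeat on re-runs.
ensure_network() {
    local name="$1"
    if docker network inspect "$name" > /dev/null 2>&1; then
        echo "Network already exists: $name"
    else
        docker network create "$name" > /dev/null
        echo "Network created: $name"
    fi
}

# Example call (requires a running Docker daemon):
#   ensure_network homelab-network
```

Running the helper twice should leave the system in the same state as running it once, which is the property the re-run tests are meant to confirm.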
### DON'T Focus On (Test Server Infrastructure)
- **Test server setup**: Don't modify the test server's base configuration
- **System packages**: Test scripts should work with standard Debian
- **Hardware setup**: Focus on software reliability, not hardware optimization
- **Performance tuning**: Prioritize reliability over speed
## Specific Issues to Address (Round 6)
### Issue 1: Authelia Password Hash Generation Failure
**Problem**: Script failed while generating password hash using Docker container
**Investigation Steps**:
1. Check if Docker daemon is running before attempting container operations
2. Verify Docker image pull capability (network connectivity)
3. Add timeout handling for Docker run operations
4. Provide fallback mechanism if Docker-based hash generation fails
**Proposed Solutions**:
```bash
# Option 1: Pre-validate Docker availability
if ! docker info &> /dev/null; then
    log_error "Docker is not available for password hash generation"
    log_info "This requires Docker to be running. Please ensure:"
    log_info " 1. Docker daemon is started: sudo systemctl start docker"
    log_info " 2. User can access Docker: docker ps"
    exit 1
fi
# Option 2: Add timeout and error handling
log_info "Generating password hash (this may take a moment)..."
# Capture all output first so it can still be shown if extraction fails
HASH_OUTPUT=$(timeout 30 docker run --rm authelia/authelia:4.37 authelia crypto hash generate argon2 --password "$ADMIN_PASSWORD" 2>&1) || true
# The hash is printed as "Digest: $argon2id$..."; strip the prefix
PASSWORD_HASH=$(echo "$HASH_OUTPUT" | sed -n 's/^Digest: //p')
if [ -z "$PASSWORD_HASH" ]; then
    log_error "Failed to generate password hash"
    log_info "Error output:"
    echo "$HASH_OUTPUT"
    log_info ""
    log_info "Troubleshooting steps:"
    log_info " 1. Check Docker is running: docker ps"
    log_info " 2. Test image pull: docker pull authelia/authelia:4.37"
    log_info " 3. Check network connectivity"
    exit 1
fi
# Option 3: Provide manual hash generation instructions
if [ -z "$PASSWORD_HASH" ]; then
    log_error "Automated hash generation failed"
    log_info ""
    log_info "You can generate the hash manually after setup:"
    log_info " docker run --rm authelia/authelia:4.37 authelia crypto hash generate argon2"
    log_info " Then edit: /opt/stacks/core/authelia/users_database.yml"
    log_info ""
    read -p "Continue without setting admin password now? (y/N): " -n 1 -r
    echo ""
    if [[ ! $REPLY =~ ^[Yy]$ ]]; then
        exit 1
    fi
    # Set placeholder that requires manual configuration
    PASSWORD_HASH="\$argon2id\$v=19\$m=65536,t=3,p=4\$CHANGEME"
fi
```
### Issue 2: Script Exit Codes and Error Recovery
**Problem**: Script exits on first error without cleanup or recovery guidance
**Improvements**:
```bash
# Add at script beginning
set -e # Exit on error
set -u # Exit on undefined variable
set -o pipefail # Exit on pipe failures
# Add trap for cleanup on error
cleanup_on_error() {
    local exit_code=$?
    log_error "Script failed with exit code: $exit_code"
    log_info ""
    log_info "Partial setup may have occurred. To resume:"
    log_info " 1. Review error messages above"
    log_info " 2. Fix the issue if possible"
    log_info " 3. Re-run: sudo ./setup-homelab.sh"
    log_info ""
    log_info "The script is designed to be idempotent (safe to re-run)"
}
trap cleanup_on_error ERR
```
### Issue 3: Pre-flight Validation
**Add comprehensive checks before starting**:
```bash
# New Step 0: Pre-flight validation
log_info "Step 0/10: Running pre-flight checks..."
# Check internet connectivity
if ! ping -c 1 8.8.8.8 &> /dev/null && ! ping -c 1 1.1.1.1 &> /dev/null; then
    log_error "No internet connectivity detected"
    log_info "Internet access is required for:"
    log_info " - Installing packages"
    log_info " - Downloading Docker images"
    log_info " - Accessing Docker Hub"
    exit 1
fi
# Check DNS resolution
if ! nslookup google.com &> /dev/null; then
    log_warning "DNS resolution may be impaired"
fi
# Check disk space
AVAILABLE_SPACE=$(df / | tail -1 | awk '{print $4}')
REQUIRED_SPACE=5000000 # 5GB in KB
if [ "$AVAILABLE_SPACE" -lt "$REQUIRED_SPACE" ]; then
    log_error "Insufficient disk space"
    log_info "Available: $(($AVAILABLE_SPACE / 1024))MB"
    log_info "Required: $(($REQUIRED_SPACE / 1024))MB"
    exit 1
fi
log_success "Pre-flight checks passed"
echo ""
```
## Testing Checklist for Round 6
### setup-homelab.sh Testing
- [ ] **Clean run**: Fresh Debian 12 system with no prior setup
- [ ] **Re-run safety**: Run script twice, verify idempotency
- [ ] **Interrupted run**: Stop script mid-execution, verify recovery
- [ ] **Network failure**: Simulate network issues during package installation
- [ ] **Docker unavailable**: Test behavior when Docker daemon isn't running
- [ ] **User input validation**: Test invalid inputs (weak passwords, invalid emails)
- [ ] **NVIDIA skip**: Test skipping GPU driver installation
- [ ] **NVIDIA install**: Test GPU driver installation path (if hardware available)
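For the user input validation item, a minimal sketch of the kind of checks the test could exercise. Both function names are hypothetical, and the 12-character minimum is an assumed policy chosen for illustration, not a rule taken from the scripts.

```bash
# Hypothetical validators (not part of setup-homelab.sh).

# Loose email check: one "@", a dot in the domain part, no whitespace.
is_valid_email() {
    printf '%s' "$1" | grep -Eq '^[^@[:space:]]+@[^@[:space:]]+\.[^@[:space:]]+$'
}

# Assumed policy for illustration: at least 12 characters.
is_strong_password() {
    [ "${#1}" -ge 12 ]
}

# Example use inside a prompt loop:
#   if ! is_valid_email "$ADMIN_EMAIL"; then
#       echo "Invalid email address, please try again" >&2
#   fi
```

The email pattern is deliberately permissive; it is meant to catch obvious typos rather than to enforce full address syntax.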
### deploy-homelab.sh Testing
- [ ] **Missing .env**: Test behavior when .env file doesn't exist
- [ ] **Invalid .env**: Test with incomplete or malformed .env values
- [ ] **Port conflicts**: Test when ports 80/443 are already in use
- [ ] **Network issues**: Simulate Docker network creation failures
- [ ] **Core stack failure**: Test recovery if core services fail to start
- [ ] **Certificate generation**: Monitor Let's Encrypt certificate acquisition
- [ ] **Re-deployment**: Test redeploying over existing infrastructure
- [ ] **Partial deployment**: Stop mid-deployment, verify recovery
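The port conflict case above could be detected before deployment with a check along these lines. This is a sketch under assumptions: `port_in_use` and `check_required_ports` are hypothetical names, and the `ss` utility from iproute2 is assumed to be installed (it is on a standard Debian system).

```bash
# Hypothetical helpers (not part of deploy-homelab.sh).

# Succeeds if something is already listening on the given TCP port.
port_in_use() {
    local port="$1"
    # -H: no header, -l: listening sockets, -t: TCP, -n: numeric ports
    ss -Hltn "sport = :${port}" 2> /dev/null | grep -q .
}

# Reports every requested port that is taken; fails if any is.
check_required_ports() {
    local port conflicts=0
    for port in "$@"; do
        if port_in_use "$port"; then
            echo "Port ${port} is already in use" >&2
            conflicts=1
        fi
    done
    return "$conflicts"
}

# Example call before starting the core stack:
#   check_required_ports 80 443 || exit 1
```

Checking all ports before returning lets the user see every conflict in one run instead of fixing them one at a time.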
## Documentation Updates Required
### 1. Update [docs/getting-started.md](docs/getting-started.md)
- Document actual script behavior observed in testing
- Add troubleshooting section for common failure scenarios
- Include recovery procedures for partial deployments
- Update Round 4 → Round 6 references
### 2. Update [AGENT_INSTRUCTIONS.md](AGENT_INSTRUCTIONS.md)
- Change "Round 4" references to current round
- Add learnings from Round 5 and 6 testing
- Document new error handling patterns
- Update success criteria
### 3. Create Testing Guide
- Document test environment setup
- Provide test case templates
- Include expected vs actual result tracking
- Create test automation scripts if beneficial
## Agent Action Workflow
### Phase 1: Analysis (30 minutes)
1. Review last terminal output to identify exact failure point
2. Read relevant script sections around the failure
3. Check for similar issues in previous rounds (git history)
4. Document root cause and contributing factors
### Phase 2: Solution Design (20 minutes)
1. Propose specific code changes to address issues
2. Design error handling and recovery mechanisms
3. Create validation checks to prevent recurrence
4. Draft user-facing error messages
### Phase 3: Implementation (40 minutes)
1. Apply changes to scripts using multi_replace_string_in_file
2. Update related documentation
3. Add comments explaining complex error handling
4. Update .env.example if new variables needed
### Phase 4: Validation (30 minutes)
1. Review script syntax: `for f in scripts/*.sh; do bash -n "$f"; done` (`bash -n` checks only the first file it is given)
2. Check for common shell script issues (shellcheck if available)
3. Verify all error paths have recovery guidance
4. Test modified sections in isolation if possible
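The syntax and lint steps of this phase could be combined in a small wrapper. This is a sketch only: `check_scripts` is a hypothetical name, and `shellcheck` is treated as optional because it may not be installed on the test server.

```bash
# Hypothetical helper (not part of the repository).
# Syntax-checks each script separately and lints it when shellcheck exists.
check_scripts() {
    local script failed=0
    for script in "$@"; do
        # bash -n parses the script without executing it
        if ! bash -n "$script"; then
            echo "Syntax error in: $script" >&2
            failed=1
        fi
        # Run shellcheck only when it is installed
        if command -v shellcheck > /dev/null 2>&1; then
            shellcheck "$script" || failed=1
        fi
    done
    return "$failed"
}

# Example call from the repository root:
#   check_scripts scripts/*.sh
```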
### Phase 5: Documentation (20 minutes)
1. Update CHANGELOG or version notes
2. Document breaking changes if any
3. Update README if testing procedure changed
4. Create/update this round's testing summary
## Success Criteria for Round 6
### Must Have
- ✅ setup-homelab.sh completes successfully on fresh Debian 12
- ✅ Script provides clear error messages when failures occur
- ✅ All error conditions have recovery/troubleshooting guidance
- ✅ Scripts can be safely re-run without breaking existing setup
- ✅ Docker availability validated before Docker operations
### Should Have
- ✅ Pre-flight validation catches issues before they cause failures
- ✅ Progress indicators show user what's happening
- ✅ Timeout handling prevents infinite hangs
- ✅ Network failures provide helpful next steps
- ✅ Documentation updated to match current behavior
### Nice to Have
- ⭐ Automated testing framework for scripts
- ⭐ Rollback capability for partial deployments
- ⭐ Log collection for debugging
- ⭐ Summary report at end of execution
- ⭐ Colored output degrades gracefully on terminals without color support
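The color item could be handled in the logging helpers themselves. The sketch below is a hypothetical re-implementation: the real `log_info` and `log_error` definitions are not shown in this commit and may differ, and honoring the `NO_COLOR` environment variable is an assumption rather than existing behavior.

```bash
# Hypothetical versions of the logging helpers; the definitions in the
# actual scripts are not shown here and may differ.
setup_colors() {
    # Use colors only when stdout is a terminal and NO_COLOR is unset
    if [ -t 1 ] && [ -z "${NO_COLOR:-}" ]; then
        RED='\033[0;31m'
        BLUE='\033[0;34m'
        NC='\033[0m'
    else
        RED=''
        BLUE=''
        NC=''
    fi
}

log_info() { printf '%b[INFO]%b %s\n' "$BLUE" "$NC" "$1"; }
log_error() { printf '%b[ERROR]%b %s\n' "$RED" "$NC" "$1" >&2; }

setup_colors
```

With this arrangement, output that is piped to a file or captured for log collection contains no escape sequences.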
## Key Files to Modify
### Primary Targets
1. [scripts/setup-homelab.sh](scripts/setup-homelab.sh) - Lines 1-491
- Focus areas: Step 7 (Authelia secrets), error handling
2. [scripts/deploy-homelab.sh](scripts/deploy-homelab.sh) - Lines 1-340
- Focus areas: Pre-flight checks, error recovery
3. [AGENT_INSTRUCTIONS.md](AGENT_INSTRUCTIONS.md) - Round references
4. [docs/getting-started.md](docs/getting-started.md) - Testing documentation
### Supporting Files
- [.env.example](.env.example) - Validate all variables documented
- [README.md](README.md) - Update if quick start changed
- [docs/quick-reference.md](docs/quick-reference.md) - Add troubleshooting entries
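Validating `.env` against `.env.example` could be partly automated by comparing variable names. This is a sketch: `env_keys` and `missing_env_keys` are hypothetical helpers, and only simple `NAME=value` lines are assumed.

```bash
# Hypothetical helpers (not part of the repository).

# List the variable names defined in a dotenv-style file.
env_keys() {
    # Keep lines of the form NAME=value; comments and blank lines are skipped
    grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$1" | cut -d'=' -f1 | sort -u
}

# Print names defined in the example file but absent from the actual file.
missing_env_keys() {
    local example="$1" actual="$2" key
    env_keys "$example" | while read -r key; do
        grep -q "^${key}=" "$actual" || echo "$key"
    done
}

# Example call from the repository root:
#   missing_env_keys .env.example .env
```

An empty result means every variable listed in the example file is defined in the actual file; it does not check whether the values are sensible.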
## Constraints and Guidelines
### Follow Repository Standards
- Maintain existing code style and conventions
- Use established logging functions (log_info, log_error, etc.)
- Keep user-facing messages clear and actionable
- Preserve idempotency where it exists
### Avoid Scope Creep
- Don't rewrite entire scripts - make targeted improvements
- Don't add features - focus on fixing reliability issues
- Don't optimize performance - focus on correctness
- Don't modify test server - fix repository code
### Safety First
- All changes must maintain backward compatibility
- Don't remove existing functionality
- Add validation, don't skip steps
- Provide rollback/recovery for all operations
## Communication with User
### Regular Updates
- Report progress through testing phases
- Highlight blockers or unexpected findings
- Ask for clarification when needed
- Summarize changes made
### Final Report Format
```markdown
# Round 6 Testing Summary
## Issues Identified
1. [Issue name] - [Brief description]
- Root cause: ...
- Impact: ...
## Changes Implemented
1. [File]: [Change description]
- Why: ...
- Testing: ...
## Testing Results
- ✅ [What works]
- ⚠️ [What needs attention]
- ❌ [What still fails]
## Next Steps for Round 7
1. [Remaining issue to address]
2. [Enhancement to consider]
```
---
## Immediate Actions for Agent
1. **Investigate current failure**:
- Review terminal output showing exit code 1
- Identify line in setup-homelab.sh where failure occurred
- Determine if Docker was available for password hash generation
2. **Propose specific fixes**:
- Add Docker availability check before Authelia password hash step
- Add timeout handling for docker run command
- Provide fallback if automated hash generation fails
3. **Implement improvements**:
- Update setup-homelab.sh with error handling
- Add pre-flight validation checks
- Enhance error messages with recovery steps
4. **Update documentation**:
- Document Round 6 findings
- Update troubleshooting guide
- Revise Round 4 references to current round
5. **Prepare for validation**:
- Create test checklist
- Document expected behavior
- Note any manual testing steps needed
---
*Generated for Round 6 testing of AI-Homelab deployment scripts*
*Focus: Repository optimization, not test server modification*


@@ -62,19 +62,33 @@ if [ ! -f "$REPO_DIR/.env" ]; then
 fi
 # Check if Docker is installed and running
+log_info "Validating Docker installation..."
 if ! command -v docker &> /dev/null; then
-    log_error "Docker is not installed. Please run setup-homelab.sh first."
+    log_error "Docker is not installed"
+    log_info "Please run the setup script first:"
+    log_info " cd ~/AI-Homelab/scripts"
+    log_info " sudo ./setup-homelab.sh"
     exit 1
 fi
-if ! docker info &> /dev/null; then
-    log_error "Docker daemon is not running or you don't have permission."
-    log_info "Try: sudo systemctl start docker"
-    log_info "Or log out and log back in for group changes to take effect"
+if ! docker info &> /dev/null 2>&1; then
+    log_error "Docker daemon is not running or not accessible"
+    echo ""
+    log_info "Troubleshooting steps:"
+    log_info " 1. Start Docker: sudo systemctl start docker"
+    log_info " 2. Enable Docker on boot: sudo systemctl enable docker"
+    log_info " 3. Check Docker status: sudo systemctl status docker"
+    log_info " 4. If recently added to docker group, log out and back in"
+    log_info " 5. Test access: docker ps"
+    echo ""
+    log_info "Current user: $ACTUAL_USER"
+    log_info "Docker group membership: $(groups $ACTUAL_USER | grep -o docker || echo 'NOT IN DOCKER GROUP')"
     exit 1
 fi
 log_success "Docker is available and running"
+log_info "Docker version: $(docker --version | cut -d' ' -f3 | tr -d ',')"
 echo ""
 # Load environment variables for domain check
@@ -114,9 +128,26 @@ log_info " - Gluetun (VPN Client)"
 echo ""
 # Copy core stack files
+log_info "Preparing core stack configuration files..."
+# Clean up any incorrect directory structure from previous runs
+if [ -d "/opt/stacks/core/traefik/acme.json" ]; then
+    log_warning "Removing incorrectly created acme.json directory"
+    rm -rf /opt/stacks/core/traefik/acme.json
+fi
+if [ -d "/opt/stacks/core/traefik/traefik.yml" ]; then
+    log_warning "Removing incorrectly created traefik.yml directory"
+    rm -rf /opt/stacks/core/traefik/traefik.yml
+fi
+# Copy compose file
 cp "$REPO_DIR/docker-compose/core.yml" /opt/stacks/core/docker-compose.yml
+# Remove existing config directories and copy fresh ones
+rm -rf /opt/stacks/core/traefik /opt/stacks/core/authelia
 cp -r "$REPO_DIR/config-templates/traefik" /opt/stacks/core/
 cp -r "$REPO_DIR/config-templates/authelia" /opt/stacks/core/
 cp "$REPO_DIR/.env" /opt/stacks/core/.env
 # Create acme.json as a file (not directory) with correct permissions
@@ -134,47 +165,71 @@ log_success "Traefik email configured"
 log_info "Configuring Authelia for domain: $DOMAIN..."
 sed -i "s/your-domain.duckdns.org/${DOMAIN}/g" /opt/stacks/core/authelia/configuration.yml
-# Generate Authelia admin password if not already configured
-if grep -q "CHANGEME" /opt/stacks/core/authelia/users_database.yml 2>/dev/null || [ ! -f /opt/stacks/core/authelia/users_database.yml ]; then
-    log_info "Generating Authelia admin credentials..."
-    # Generate a random password if not provided
-    ADMIN_PASSWORD="${AUTHELIA_ADMIN_PASSWORD:-$(openssl rand -base64 16)}"
-    # Generate password hash using Authelia container
-    log_info "Generating password hash (this may take a moment)..."
-    PASSWORD_HASH=$(docker run --rm authelia/authelia:4.37 authelia crypto hash generate argon2 --password "$ADMIN_PASSWORD" | grep 'Digest:' | awk '{print $2}')
-    if [ -z "$PASSWORD_HASH" ]; then
-        log_error "Failed to generate password hash"
-        log_info "Using template users_database.yml - please configure manually"
-    else
-        # Create users_database.yml with generated credentials
+# Configure Authelia admin user from setup script
+if [ -f /tmp/authelia_admin_credentials.tmp ]; then
+    log_info "Loading Authelia admin credentials from setup script..."
+    source /tmp/authelia_admin_credentials.tmp
+    if [ -n "$PASSWORD_HASH" ] && [ -n "$ADMIN_USER" ] && [ -n "$ADMIN_EMAIL" ]; then
+        log_success "Using credentials: $ADMIN_USER ($ADMIN_EMAIL)"
+        # Create users_database.yml with credentials from setup
         cat > /opt/stacks/core/authelia/users_database.yml << EOF
 ###############################################################
 # Users Database #
 ###############################################################
 users:
-  admin:
+  ${ADMIN_USER}:
     displayname: "Admin User"
     password: "${PASSWORD_HASH}"
-    email: ${ACME_EMAIL}
+    email: ${ADMIN_EMAIL}
     groups:
       - admins
       - users
 EOF
-        log_success "Authelia admin user configured"
-        log_info "Admin username: admin"
-        log_info "Admin password: $ADMIN_PASSWORD"
-        log_warning "SAVE THIS PASSWORD! Writing to /opt/stacks/core/authelia/ADMIN_PASSWORD.txt"
+        log_success "Authelia admin user configured from setup script"
+        echo ""
+        echo "==========================================="
+        log_info "Authelia Login Credentials:"
+        echo " Username: $ADMIN_USER"
+        echo " Password: $ADMIN_PASSWORD"
+        echo " Email: $ADMIN_EMAIL"
+        echo "==========================================="
+        echo ""
+        log_warning "SAVE THESE CREDENTIALS!"
+        # Save password to file for reference
         echo "$ADMIN_PASSWORD" > /opt/stacks/core/authelia/ADMIN_PASSWORD.txt
         chmod 600 /opt/stacks/core/authelia/ADMIN_PASSWORD.txt
         chown $ACTUAL_USER:$ACTUAL_USER /opt/stacks/core/authelia/ADMIN_PASSWORD.txt
+        log_info "Password also saved to: /opt/stacks/core/authelia/ADMIN_PASSWORD.txt"
+        echo ""
+        # Clean up credentials file
+        rm -f /tmp/authelia_admin_credentials.tmp
+    else
+        log_warning "Incomplete credentials from setup script"
+        log_info "Using template users_database.yml - please configure manually"
     fi
 else
-    log_info "Authelia users_database.yml already configured"
+    log_warning "No credentials file found from setup script"
+    log_info "Using template users_database.yml from config-templates"
+    log_info "Please run setup-homelab.sh first or configure manually"
+fi
+# Clean up old Authelia database if encryption key changed
+# This prevents "encryption key does not appear to be valid" errors
+if [ -d "/var/lib/docker/volumes/core_authelia-data/_data" ]; then
+    log_info "Checking for Authelia database encryption key issues..."
+    # Test if Authelia can start, if not, clean the database
+    docker compose up -d authelia 2>&1 | grep -q "encryption key" && {
+        log_warning "Encryption key mismatch detected, cleaning Authelia database..."
+        docker compose down authelia
+        sudo rm -rf /var/lib/docker/volumes/core_authelia-data/_data/*
+        log_success "Authelia database cleaned"
+    } || log_info "Database check passed"
 fi
 # Deploy core stack


@@ -42,17 +42,62 @@ if [ "$ACTUAL_USER" = "root" ]; then
     exit 1
 fi
+# Add trap for cleanup on error
+cleanup_on_error() {
+    local exit_code=$?
+    if [ $exit_code -ne 0 ]; then
+        log_error "Script failed with exit code: $exit_code"
+        echo ""
+        log_info "Partial setup may have occurred. To resume:"
+        log_info " 1. Review error messages above"
+        log_info " 2. Fix the issue if possible"
+        log_info " 3. Re-run: sudo ./setup-homelab.sh"
+        echo ""
+        log_info "The script is designed to be idempotent (safe to re-run)"
+    fi
+}
+trap cleanup_on_error EXIT
 log_info "Setting up AI-Homelab for user: $ACTUAL_USER"
 echo ""
+# Step 0: Pre-flight validation
+log_info "Step 0/10: Running pre-flight checks..."
+# Check internet connectivity
+if ! ping -c 1 -W 2 8.8.8.8 &> /dev/null && ! ping -c 1 -W 2 1.1.1.1 &> /dev/null; then
+    log_error "No internet connectivity detected"
+    log_info "Internet access is required for:"
+    log_info " - Installing packages"
+    log_info " - Downloading Docker images"
+    log_info " - Accessing Docker Hub"
+    exit 1
+fi
+# Check disk space (require at least 5GB free)
+AVAILABLE_SPACE=$(df / | tail -1 | awk '{print $4}')
+REQUIRED_SPACE=5000000 # 5GB in KB
+if [ "$AVAILABLE_SPACE" -lt "$REQUIRED_SPACE" ]; then
+    log_error "Insufficient disk space on root partition"
+    log_info "Available: $(($AVAILABLE_SPACE / 1024 / 1024))GB"
+    log_info "Required: $(($REQUIRED_SPACE / 1024 / 1024))GB"
+    exit 1
+fi
+log_success "Pre-flight checks passed"
+log_info "Internet: Connected"
+log_info "Disk space: $(($AVAILABLE_SPACE / 1024 / 1024))GB available"
+echo ""
 # Step 1: System Update
-log_info "Step 1/9: Updating system packages..."
+log_info "Step 1/10: Updating system packages..."
 apt-get update && apt-get upgrade -y
 log_success "System updated successfully"
 echo ""
 # Step 2: Install Required Packages
-log_info "Step 2/9: Installing required packages..."
+log_info "Step 2/10: Installing required packages..."
 apt-get install -y \
     apt-transport-https \
     ca-certificates \
@@ -71,7 +116,7 @@ log_success "Required packages installed"
 echo ""
 # Step 3: Install Docker
-log_info "Step 3/9: Installing Docker..."
+log_info "Step 3/10: Installing Docker..."
 if command -v docker &> /dev/null && docker --version &> /dev/null; then
     log_warning "Docker is already installed ($(docker --version))"
 else
@@ -95,7 +140,7 @@ fi
 echo ""
 # Step 4: Configure User Groups
-log_info "Step 4/9: Configuring user groups..."
+log_info "Step 4/10: Configuring user groups..."
 # Add user to sudo group if not already
 if groups "$ACTUAL_USER" | grep -q '\bsudo\b'; then
@@ -115,7 +160,7 @@ fi
 echo ""
 # Step 5: Configure Firewall
-log_info "Step 5/9: Configuring firewall..."
+log_info "Step 5/10: Configuring firewall..."
 # Enable UFW if not already enabled
 if ufw status | grep -q "Status: active"; then
     log_warning "Firewall is already active"
@@ -139,7 +184,7 @@ log_success "HTTP/HTTPS ports allowed in firewall"
 echo ""
 # Step 6: Configure SSH
-log_info "Step 6/9: Configuring SSH server..."
+log_info "Step 6/10: Configuring SSH server..."
 systemctl enable ssh
 systemctl start ssh
@@ -153,7 +198,23 @@ else
 fi
 echo ""
 # Step 7: Generate Authelia Secrets
-log_info "Step 7/9: Generating Authelia authentication secrets..."
+log_info "Step 7/10: Generating Authelia authentication secrets..."
+echo ""
+# Validate Docker is available for password hash generation
+if ! docker info &> /dev/null 2>&1; then
+    log_error "Docker is not available for password hash generation"
+    log_info "Docker must be running to generate Authelia password hashes."
+    log_info "Please ensure:"
+    log_info " 1. Docker daemon is started: sudo systemctl start docker"
+    log_info " 2. User can access Docker: docker ps"
+    log_info " 3. Log out and log back in if recently added to docker group"
+    echo ""
+    log_info "After fixing, re-run: sudo ./setup-homelab.sh"
+    exit 1
+fi
+log_success "Docker is available for password operations"
 echo ""
 # Function to generate a secure random secret
@@ -236,14 +297,56 @@ while true; do
 done
 # Generate password hash using Docker
-log_info "Generating password hash (this may take a moment)..."
-PASSWORD_HASH=$(docker run --rm authelia/authelia:4.37 authelia crypto hash generate argon2 --password "$ADMIN_PASSWORD" | grep '^\$argon2')
-if [ -z "$PASSWORD_HASH" ]; then
-    log_error "Failed to generate password hash"
+log_info "Generating password hash (this may take 30-60 seconds)..."
+log_info "Pulling Authelia image if not already present..."
+# Pull image first to show progress
+if ! docker pull authelia/authelia:4.37 2>&1 | grep -E '(Pulling|Downloaded|Already exists|Status)'; then
+    log_error "Failed to pull Authelia Docker image"
+    log_info "Please check:"
+    log_info " 1. Internet connectivity: ping docker.io"
+    log_info " 2. Docker Hub access: docker search authelia"
+    log_info " 3. Disk space: df -h"
     exit 1
 fi
+echo ""
+log_info "Generating password hash..."
+# Generate hash with timeout and better error capture
+HASH_OUTPUT=$(timeout 60 docker run --rm authelia/authelia:4.37 authelia crypto hash generate argon2 --password "$ADMIN_PASSWORD" 2>&1)
+HASH_EXIT_CODE=$?
+if [ $HASH_EXIT_CODE -eq 124 ]; then
+    log_error "Password hash generation timed out after 60 seconds"
+    log_info "This is unusual. Please check:"
+    log_info " 1. System resources: top or htop"
+    log_info " 2. Docker status: docker ps"
+    log_info " 3. Try manually: docker run --rm authelia/authelia:4.37 authelia crypto hash generate argon2"
+    exit 1
+elif [ $HASH_EXIT_CODE -ne 0 ]; then
+    log_error "Failed to generate password hash (exit code: $HASH_EXIT_CODE)"
+    log_info "Error output:"
+    echo "$HASH_OUTPUT"
+    exit 1
+fi
+# Extract hash - format is "Digest: $argon2id$..."
+PASSWORD_HASH=$(echo "$HASH_OUTPUT" | grep -oP 'Digest: \K\$argon2.*' || echo "$HASH_OUTPUT" | grep '^\$argon2')
+if [ -z "$PASSWORD_HASH" ]; then
+    log_error "Failed to extract password hash from output"
+    log_info "Command output:"
+    echo "$HASH_OUTPUT"
+    log_info ""
+    log_info "You can generate the hash manually after setup:"
+    log_info " docker run --rm authelia/authelia:4.37 authelia crypto hash generate argon2"
+    log_info " Then edit: /opt/stacks/core/authelia/users_database.yml"
+    exit 1
+fi
+log_success "Password hash generated successfully"
 # Read admin email from .env or prompt
 ADMIN_EMAIL=$(grep "^ADMIN_EMAIL=" "$REPO_ENV_FILE" | cut -d'=' -f2)
 if [ -z "$ADMIN_EMAIL" ] || [ "$ADMIN_EMAIL" = "admin@example.com" ]; then
@@ -255,17 +358,20 @@ log_success "Admin user configured: $ADMIN_USER"
 log_success "Password hash generated and will be applied during deployment"
 # Store the admin credentials for the deployment script
+# Include both password and hash so deploy can show the password to user
 cat > /tmp/authelia_admin_credentials.tmp << EOF
 ADMIN_USER=$ADMIN_USER
 ADMIN_EMAIL=$ADMIN_EMAIL
+ADMIN_PASSWORD=$ADMIN_PASSWORD
 PASSWORD_HASH=$PASSWORD_HASH
 EOF
 chmod 600 /tmp/authelia_admin_credentials.tmp
+log_info "Credentials saved for deployment script"
 echo ""
 # Step 8: Create Directory Structure
-log_info "Step 8/9: Creating directory structure..."
+log_info "Step 8/10: Creating directory structure..."
 mkdir -p /opt/stacks
 mkdir -p /opt/dockge/data
 mkdir -p /mnt/media/{movies,tv,music,books,photos}
@@ -288,7 +394,7 @@ log_success "Directory structure created"
 echo ""
 # Step 9: Create Docker Networks
-log_info "Step 9/9: Creating Docker networks..."
+log_info "Step 9/10: Creating Docker networks..."
 su - "$ACTUAL_USER" -c "docker network create homelab-network 2>/dev/null || true"
 su - "$ACTUAL_USER" -c "docker network create traefik-network 2>/dev/null || true"
 su - "$ACTUAL_USER" -c "docker network create media-network 2>/dev/null || true"
@@ -296,8 +402,8 @@ su - "$ACTUAL_USER" -c "docker network create dockerproxy-network 2>/dev/null ||
 log_success "Docker networks created"
 echo ""
-# Optional: Detect and Install NVIDIA Drivers (if applicable)
-log_info "Optional: Checking for NVIDIA GPU..."
+# Step 10: Optional - Detect and Install NVIDIA Drivers
+log_info "Step 10/10 (Optional): Checking for NVIDIA GPU..."
 # Detect NVIDIA GPU
 if lspci | grep -i nvidia > /dev/null; then