IaaS Project Concept: Self-Hosted Docker Environment

1. Core Goal

The primary objective is to create a resilient and flexible Docker hosting environment that is:

  • Host Independent: Easily migrate applications or recover the setup without depending on specific physical or virtual host servers.
  • Disaster Recovery Friendly: Simplify the process of restoring services and data in case of a host failure.

2. Key Technologies & Components

  • Host Operating System: any Linux server or AWS cloud instance (as the base for Docker hosts).
  • Containerization: Docker (to run applications in isolated environments).
  • Container Management: Komo.do (UI/tool for managing Docker containers and deployments).
  • Configuration Management (Host Setup): Ansible (to automate the preparation of new Linux host servers).
  • Secure Internal Networking: Tailscale (to create a private, secure network connecting all host servers and potentially management nodes).
  • Shared Application Data Store: Google Drive accessed via rclone (mounted on hosts to provide a common, cloud-backed storage location for persistent Docker application data).
    • An initial note mentioned Google Cloud Datastore; clarification is needed on whether that is still under consideration or has been replaced by Google Drive/rclone.
  • Monitoring: Netdata (deployed as a Docker container for real-time host and container metrics).
  • Secure External Access / Reverse Proxy: Cloudflare Tunnel, automated via the custom DockFlare project. https://github.com/ChrispyBacon-dev/DockFlare
  • Version Control: Private GitHub Repository (to store and manage all control files, including Ansible playbooks, Docker configurations, scripts, etc.).
  • Basic Access: SSH (for direct command-line access to host servers when needed).

3. Workflow & Architecture Overview

  1. Provision Host: Start with a clean Linux server or cloud instance.
  2. Configure Host via Ansible: Run Ansible playbooks against the new host (see the playbook sketch after this list) to:
    • Ensure SSH is configured securely.
    • Install Docker engine.
    • Install and configure the Tailscale client, joining it to the designated Tailnet.
    • Install rclone and configure it to access the shared Google Drive. Potentially set up systemd units to mount the drive automatically.
  3. Deploy Management & Core Services:
    • Use Komo.do (or manual Docker commands/compose initially) to deploy core infrastructure containers:
      • Netdata container for monitoring.
      • The DockFlare container/service.
  4. Deploy Applications:
    • Use Komo.do to deploy application containers.
    • Configure application containers requiring persistent or shared data to use volumes mapped to the rclone Google Drive mount point.
  5. Expose Services:
    • Configure the DockFlare service to automatically detect specified running containers and create/manage Cloudflare Tunnels to expose them securely to the internet (handling HTTPS and reverse proxying).
  6. Maintain Configuration: All Ansible playbooks, rclone setup scripts, potentially Docker Compose files (if used alongside Komo.do), and DockFlare configurations are stored and version-controlled in the private GitHub repository.
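
A rough sketch of what the step-2 provisioning playbook could look like. The module choices, package names, and the tailscale_authkey variable are illustrative assumptions, not this project's actual playbook:

```yaml
# provision-host.yml: minimal host preparation sketch
- hosts: new_docker_hosts
  become: true
  tasks:
    - name: Disable SSH password authentication
      ansible.builtin.lineinfile:
        path: /etc/ssh/sshd_config
        regexp: '^#?PasswordAuthentication'
        line: 'PasswordAuthentication no'
      notify: Restart sshd

    - name: Install Docker and rclone from the distribution repositories
      ansible.builtin.apt:
        name: [docker.io, rclone]
        state: present
        update_cache: true

    - name: Install Tailscale via the official install script
      ansible.builtin.shell: curl -fsSL https://tailscale.com/install.sh | sh
      args:
        creates: /usr/bin/tailscale   # skip once installed (idempotence)

    - name: Join the designated Tailnet
      ansible.builtin.command: tailscale up --authkey={{ tailscale_authkey }}
      no_log: true                    # the auth key should come from Ansible Vault

  handlers:
    - name: Restart sshd
      ansible.builtin.service:
        name: ssh
        state: restarted
```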

4. Key Implementations & Tools

  • Cloudflare Tunnel Automation: Utilize the custom project DockFlare available at:
    • https://github.com/ChrispyBacon-dev/DockFlare
    • Purpose: Automate the creation and management of Cloudflare Tunnels based on running Docker containers, simplifying secure public exposure.
  • Shared Storage Mechanism: Rely on rclone to bridge the gap between Docker hosts and Google Drive, providing a centralized location for data that needs to survive container/host restarts or replacements (a mount sketch follows).
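
A sketch of the mount mechanics, as tasks that could slot into the provisioning playbook. The rclone remote name gdrive and the mount point /mnt/gdrive are placeholders; installing a systemd unit keeps the mount alive across reboots:

```yaml
- name: Install a systemd unit for the rclone mount
  ansible.builtin.copy:
    dest: /etc/systemd/system/rclone-gdrive.service
    content: |
      [Unit]
      Description=rclone mount of the shared Google Drive
      After=network-online.target

      [Service]
      # --allow-other needs user_allow_other enabled in /etc/fuse.conf
      ExecStart=/usr/bin/rclone mount gdrive:appdata /mnt/gdrive \
        --allow-other --vfs-cache-mode writes
      ExecStop=/bin/fusermount -u /mnt/gdrive
      Restart=on-failure

      [Install]
      WantedBy=multi-user.target

- name: Enable and start the mount
  ansible.builtin.systemd:
    name: rclone-gdrive.service
    enabled: true
    state: started
    daemon_reload: true
```

Application containers then map volumes beneath the mount point (e.g., /mnt/gdrive/myapp:/data), matching step 4 of the workflow above.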

5. Benefits Achieved

  • Decoupling: Applications and their data are decoupled from individual hosts thanks to the shared rclone/Google Drive storage.
  • Simplified DR: To recover, provision a new host using Ansible, connect it to Tailscale, mount the rclone drive, and redeploy containers via Komo.do. Data remains intact on Google Drive.
  • Automation: Ansible automates host setup; DockFlare automates external access.
  • Security: Tailscale secures internal traffic; Cloudflare Tunnels secure external access without opening firewall ports.
  • Centralized Control: Configuration lives in GitHub. Komo.do provides a management interface.

On Keeping and Storing Secrets

Managing secrets (API keys, passwords, tokens, certificates, etc.) effectively requires careful planning. The goal is to prevent accidental exposure while making them accessible only to the authorized applications or services that need them, when they need them.

Core Principles:

  1. Never Commit Secrets to Version Control (GitHub): This is the cardinal sin. Your GitHub repo may be private, but secrets don't belong there in plaintext, and Git history is forever.
  2. Encrypt Secrets At Rest: Secrets stored on disk, even temporarily, should be encrypted.
  3. Encrypt Secrets In Transit: Ensure secrets are transmitted over secure channels (HTTPS, SSH, VPNs like your Tailscale).
  4. Least Privilege: Grant access to secrets only to the entities (users, applications) that absolutely require them.
  5. Auditability: Be able to track who or what accessed secrets (more advanced, but good to keep in mind).
  6. Rotation: Have a plan to change (rotate) secrets regularly or when you suspect a compromise.

Strategies & Tools Relevant to Your Stack:

  1. Ansible Vault:

    • What it is: Ansible's built-in mechanism for encrypting sensitive data (variables, files) within your Ansible project.
    • How to use: You can encrypt entire variable files (ansible-vault encrypt vars/secrets.yml) or individual string values (ansible-vault encrypt_string) within your playbooks/roles; see the sketch after this list.
    • Pros: Integrated with Ansible, relatively easy to use, allows you to safely commit encrypted files (secrets.yml.vault) to your private GitHub repo.
    • Cons: You still have to manage the vault password itself securely, and everyone who needs to run the playbook needs it. Can become cumbersome with many secrets or complex access needs.
    • Recommendation: Use this for secrets needed during the Ansible provisioning phase (e.g., initial user passwords, keys needed for setup). Store the encrypted vault files in your private GitHub repo.
  2. Docker Secrets (for Docker Swarm):

    • What it is: Docker's native way to manage secrets for services running in Swarm mode. Secrets are mounted into containers as files in /run/secrets/.
    • Pros: Securely managed by Docker, not stored in container images or logs, encrypted in transit and at rest within the Swarm.
    • Cons: Only works with Docker Swarm mode. If you're using plain docker run or docker-compose (without Swarm), this isn't directly applicable. Komo.do might have its own way to handle secrets injection; check its documentation.
    • Recommendation: If you ever move to Docker Swarm, this is the preferred method. If not using Swarm, you need other approaches.
  3. Injecting Secrets via Environment Variables (Use with Caution):

    • How it works: Pass secrets to containers as environment variables (e.g., via docker run -e SECRET_KEY=..., docker-compose.yml environment section, or Komo.do's interface).
    • Pros: Very common, easy to implement initially.
    • Cons:
      • HIGH RISK OF EXPOSURE: Environment variables can often be inspected by other processes on the host (ps auxe), potentially logged by applications, or exposed via container inspection commands.
      • Secrets might end up in plaintext in docker-compose.yml files (which you should NOT commit to Git if they contain secrets).
      • Less secure than Docker Secrets or volume-mounted secret files.
    • Recommendation: Avoid if possible for highly sensitive secrets. If you must use them:
      • Inject them at runtime from a secure source (like Ansible reading from Vault, or a script reading from an encrypted file) rather than hardcoding them in committed files.
      • Use Komo.do's secret management features if it provides a secure way to inject environment variables without storing them plainly in its config.
      • Consider using .env files for docker-compose, add the .env file to your .gitignore, and manage the creation of the .env file securely (e.g., via Ansible Vault); see the compose sketch after this list.
  4. Mounting Secret Files into Containers:

    • How it works: Store secrets in files on the host, then mount these files (or a directory containing them) into the container as volumes (e.g., docker run -v /path/on/host/secret.key:/path/in/container/secret.key:ro).
    • Pros: Keeps secrets out of environment variables and container images. Can restrict permissions on the host file.
    • Cons: Need to manage the creation and security of the secret files on the host before the container starts.
    • Recommendation: A viable alternative to environment variables for non-Swarm setups.
      • Use Ansible Vault to create/manage the encrypted secret files on the host within a secure directory (/etc/app-secrets/ perhaps, with tight permissions).
      • Ansible can decrypt the vault file during the playbook run and write the plaintext secret to a temporary location or directly to the final destination file on the host, ensuring the plaintext only exists when needed and on the target machine.
      • Mount these host files into your containers (ideally read-only :ro).
  5. Dedicated Secrets Management Tools:

    • Examples: HashiCorp Vault (very powerful, can be complex), Cloud provider secrets managers (GCP Secret Manager, AWS Secrets Manager), SOPS (Mozilla tool for encrypting YAML/JSON/env files using KMS, GPG, etc.).
    • Pros: Centralized, purpose-built, often offer features like dynamic secrets, rotation, auditing, fine-grained access control.
    • Cons: Adds another component to manage (potentially complex).
    • Recommendation: Likely overkill for your current setup unless your needs become very complex. SOPS combined with GPG or a cloud KMS could be a middle-ground if Ansible Vault feels too limited later.
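
A minimal sketch of the Ansible Vault workflow from strategy 1; the file names, variable names, and example token are placeholders:

```yaml
# One-time: encrypt the variable file, then commit only the ciphertext:
#   ansible-vault encrypt vars/secrets.yml
# Single values can be encrypted inline instead:
#   ansible-vault encrypt_string 'cf-example-token' --name 'cloudflare_api_token'
# Run playbooks with the vault password supplied:
#   ansible-playbook site.yml --ask-vault-pass

# site.yml: vaulted variables are referenced like any others
- hosts: docker_hosts
  become: true
  vars_files:
    - vars/secrets.yml              # committed only in encrypted form
  tasks:
    - name: Hand DockFlare its Cloudflare API token via a root-only file
      ansible.builtin.copy:
        content: "{{ cloudflare_api_token }}"
        dest: /etc/docker-secrets/dockflare/cf_token
        owner: root
        mode: "0600"
      no_log: true                  # keep the token out of run output
```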
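
And if environment variables are unavoidable (strategy 3), a compose file can substitute the value from an uncommitted .env file; the service and variable names here are placeholders:

```yaml
# docker-compose.yml: the value is read from ./.env at deploy time
services:
  app:
    image: myapp:latest
    environment:
      - SECRET_KEY=${SECRET_KEY}   # defined in ./.env, which is listed in .gitignore
```

The .env file itself can be generated on the host by Ansible from a vaulted variable, so the plaintext never enters the repository.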

Recommendations for This Project:

  1. Use Ansible Vault: Encrypt sensitive variables needed by Ansible itself (API keys for provisioning, initial setup credentials). Store these encrypted vault files safely in your private GitHub repo.
  2. Secure Komo.do Secrets: Check how Komo.do recommends managing secrets for the containers it deploys. Does it integrate with Docker Secrets (if using Swarm)? Does it have its own secure store? Can it read from files or environment variables? Prioritize its most secure method.
  3. Prefer File Mounts over Environment Variables: For secrets needed by your running Docker applications (database passwords, Cloudflare API tokens for DockFlare, etc.):
    • Use Ansible (with Vault) to place the plaintext secret into a file on the host during deployment (e.g., in /etc/docker-secrets/app-name/secret.txt). Ensure this file has strict permissions (e.g., readable only by root or a specific user).
    • Mount this file directly into the container using Docker volumes (-v /etc/docker-secrets/app-name/secret.txt:/run/secrets/my_secret:ro). Configure your application inside the container to read the secret from /run/secrets/my_secret (a sketch follows this list).
  4. Strict .gitignore: Ensure your .gitignore file explicitly lists any potential plaintext secret files, .env files, Ansible Vault password files, etc.
  5. Scan for Secrets: Consider adding a pre-commit hook or using a tool like git-secrets or trufflehog to scan your codebase before commits to catch accidental secret exposures.
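
A sketch of recommendation 3 end to end; the app name, variable name, and image are placeholders, and the docker_container task assumes the community.docker collection is installed:

```yaml
- hosts: docker_hosts
  become: true
  vars_files:
    - vars/secrets.yml              # encrypted with Ansible Vault
  tasks:
    - name: Create the secrets directory with tight permissions
      ansible.builtin.file:
        path: /etc/docker-secrets/myapp
        state: directory
        owner: root
        mode: "0700"

    - name: Write the secret to the host (plaintext exists only here)
      ansible.builtin.copy:
        content: "{{ myapp_db_password }}"
        dest: /etc/docker-secrets/myapp/db_password
        owner: root
        mode: "0600"
      no_log: true                  # keep the value out of Ansible output

    - name: Run the container with the secret mounted read-only
      community.docker.docker_container:
        name: myapp
        image: myapp:latest
        volumes:
          - /etc/docker-secrets/myapp/db_password:/run/secrets/db_password:ro
```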
