# Cluster Topology
This page describes the ArgoCD deployment architecture across Titanbay’s Kubernetes clusters.
## Hub-and-Spoke Architecture
Titanbay operates a hub-and-spoke model: a central ArgoCD instance on the Infra Management Cluster manages deployments to all four clusters, including itself and the three tb-platform environment clusters.
## The Four Clusters
| Cluster | API Endpoint | Role |
|---|---|---|
| Infra Management | kubernetes.default.svc | Hub - runs primary ArgoCD |
| TB Platform Dev | 10.64.128.34 | Spoke - development environment |
| TB Platform QA | 10.64.128.50 | Spoke - QA/staging environment |
| TB Platform Prod | 10.64.128.66 | Spoke - production environment |
## Infra Management Cluster
The infra management cluster runs the primary ArgoCD instance in High Availability (HA) mode. This ArgoCD instance is responsible for:
- Managing itself - The ArgoCD application syncs its own configuration
- Managing infrastructure services - Atlantis, Netbox, monitoring stack, etc.
- Managing tb-platform clusters - Via ApplicationSets that deploy to remote clusters
- Managing ArgoCD on tb-platform clusters - Each spoke cluster gets its own ArgoCD via Helm
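As a concrete illustration, the self-managing `argocd` application is just a regular `Application` pointed back at the hub's own API server. The manifest below is a sketch: the repository URL and source path are placeholders, not the live configuration.

```yaml
# Sketch only: repoURL and path are placeholders, not the actual
# Titanbay repository layout.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: argocd
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/titanbay/infra.git   # placeholder
    targetRevision: main
    path: k8s/infra-services/argocd                  # assumed path
  destination:
    server: https://kubernetes.default.svc   # the hub deploys to itself in-cluster
    namespace: argocd
```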
### ArgoCD HA Deployment
The management cluster runs ArgoCD using a standalone HA manifest (`argocd-ha-3.2.1.yaml`), not a Helm chart. This provides:
- Multiple replicas of critical components
- Redis HA for caching and state
- Pod Disruption Budgets for safe updates
- Custom resource limits tuned for production workloads
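For illustration, the kind of Pod Disruption Budget such a manifest bundles for the repo server might look like the following. Component labels follow upstream ArgoCD conventions; the exact budgets in `argocd-ha-3.2.1.yaml` may differ.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: argocd-repo-server
  namespace: argocd
spec:
  minAvailable: 1          # keep at least one repo server up during node drains
  selector:
    matchLabels:
      app.kubernetes.io/name: argocd-repo-server
```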
### Applications Managed
The hub ArgoCD manages more than 20 applications on the infra management cluster itself:
| Application | Purpose |
|---|---|
| `argocd` | Self-referencing: manages ArgoCD itself |
| `atlantis` | Terraform PR automation |
| `dex` | OIDC identity provider |
| `grafana-loki` | Log aggregation |
| `grafana-alloy` | Telemetry collection |
| `grafana-tempo` | Distributed tracing |
| `netbox` | Infrastructure documentation |
| `config-connector` | GCP resource management |
| `envoy-gateway` | API gateway |
| `tailscale-operator` | VPN connectivity |
| … | and more |
## TB Platform Clusters
Each tb-platform cluster (dev, qa, prod) runs its own ArgoCD instance deployed via Helm chart. These instances are managed by the hub ArgoCD, creating a cascading management structure.
### How It Works
- The hub ArgoCD deploys an `Application` that references the Helm chart configuration for each spoke (see the sketch after this list)
- The Helm chart installs ArgoCD on the tb-platform cluster
- The spoke ArgoCD manages workloads local to that cluster
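A sketch of what the hub-side `Application` for the dev spoke could look like, assuming the upstream `argo-cd` Helm chart. The chart version and inline values are illustrative; the real values live in the repository paths described under Spoke ArgoCD Configuration below.

```yaml
# Sketch only: chart version and values are assumptions, not the live config.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: argocd-tb-platform-dev
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://argoproj.github.io/argo-helm   # upstream argo-cd chart repo
    chart: argo-cd
    targetRevision: 7.7.0                            # placeholder version
    helm:
      values: |
        global:
          domain: argocd-dev.nessie-chimera.ts.net
  destination:
    server: https://10.64.128.34   # TB Platform Dev
    namespace: argocd
```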
### ApplicationSets for TB Platform
The hub uses ApplicationSet resources to dynamically generate applications for each environment:
```yaml
# Example: tb-platform-init ApplicationSet
generators:
  - list:
      elements:
        - environment: dev
          cluster: https://10.64.128.34
        - environment: qa
          cluster: https://10.64.128.50
        - environment: prod
          cluster: https://10.64.128.66
```
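Each generated application fills a shared template with these fields; a sketch of how that template might consume them (the project and source path are illustrative):

```yaml
# Continuation sketch: how the template could consume the generator
# fields above. repoURL and path are placeholders.
template:
  metadata:
    name: 'tb-platform-init-{{environment}}'
  spec:
    project: default
    source:
      repoURL: https://github.com/titanbay/infra.git   # placeholder
      targetRevision: main
      path: 'k8s/tb-platform/{{environment}}'          # placeholder path
    destination:
      server: '{{cluster}}'
      namespace: argocd
```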
Key ApplicationSets include:
| ApplicationSet | Purpose |
|---|---|
| `tb-platform-init-resources` | Bootstrap cluster resources (namespaces, RBAC) |
| `tb-platform-init-services` | Deploy initial services |
| `tb-platform-environments` | GCP Config Connector resources per environment |
| `tb-platform-external-secrets` | External Secrets Operator per cluster |
| `tb-platform-grafana-alloy` | Monitoring per cluster |
| `tb-platform-onepassword-operator` | 1Password integration |
### Spoke ArgoCD Configuration
The tb-platform ArgoCD instances are deployed via Helm with configuration in `k8s/infra-services/argocd/tb-platform/`:
- Base configuration (`base/argocd-helm.yaml`): common Helm values including OIDC SSO and RBAC policies
- Environment overlays: domain-specific patches for each environment
  - Dev: `argocd-dev.nessie-chimera.ts.net`
  - QA: `argocd-qa.nessie-chimera.ts.net`
  - Prod: `argocd-prod.nessie-chimera.ts.net`
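As an example, a dev overlay might patch nothing but the externally visible domain on top of the base values. The `global.domain` key follows the upstream `argo-cd` chart's values schema; the actual overlay contents may differ.

```yaml
# Hypothetical dev overlay: only the domain differs per environment.
global:
  domain: argocd-dev.nessie-chimera.ts.net
```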
## Cross-Cluster Authentication
ArgoCD on the management cluster authenticates to remote clusters using:
- Cluster CA certificates stored as Secrets (e.g., `tb-platform-cluster-ca.yaml`)
- Service Account tokens for Kubernetes API access
- Workload Identity for GCP resource access
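ArgoCD registers remote clusters declaratively via Secrets carrying the `argocd.argoproj.io/secret-type: cluster` label. A sketch for the dev spoke, with the token and CA material elided:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: tb-platform-dev-cluster
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster   # marks this as a cluster registration
type: Opaque
stringData:
  name: tb-platform-dev
  server: https://10.64.128.34
  config: |
    {
      "bearerToken": "<service-account-token>",
      "tlsClientConfig": {
        "caData": "<base64-encoded-cluster-ca>"
      }
    }
```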
## Network Connectivity
All clusters communicate over private networking:
- Clusters are in the same GCP VPC or connected via VPC peering
- ArgoCD uses internal cluster IPs (10.64.x.x range)
- Tailscale provides secure access for human operators
## Sync Strategies
Different sync strategies are applied based on the application type:
| Application Type | Auto Sync | Auto Prune | Self Heal |
|---|---|---|---|
| Infrastructure (`argocd`) | Yes | No | No |
| Platform services | Yes | Yes | - |
| Config Connector resources | Yes | - | - |
The conservative approach for the `argocd` application itself prevents accidental destructive changes.
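In `Application` terms, the two ends of that spectrum look roughly like this (a sketch, not the live manifests):

```yaml
# Conservative (the argocd application): auto-sync only; prune and
# self-heal stay disabled, so destructive changes need a human decision.
syncPolicy:
  automated: {}
---
# Platform services: auto-sync plus pruning of resources removed from Git.
syncPolicy:
  automated:
    prune: true
```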