Self-Management
The ArgoCD instance on the Infra Management Cluster is self-managing - it uses an ArgoCD Application to deploy and update its own configuration. This creates a powerful but carefully controlled GitOps loop.
Self-Referencing Architecture
graph TD
A["ArgoCD Manifest<br/>(Initial YAML)"] -->|1. Manual Deploy| B["ArgoCD Instance<br/>(Running in K8s)"]
B -->|2. Create| C["ArgoCD App<br/>(argocd.yaml)"]
C -->|3. References<br/>source.path| A
C -->|4. Manages| D["ArgoCD manages<br/>itself + other<br/>applications"]
Bootstrap Process
- Initial Manual Deploy: ArgoCD is initially deployed manually using the HA manifest
- Application Creation: An ArgoCD Application resource is created that points to the ArgoCD configuration in Git
- Self-Reference: The Application's source.path points to k8s/infra-services/argocd/overlays/infra-platform-cluster
- Ongoing Management: From this point, ArgoCD manages its own configuration, including updates
The ArgoCD Application
The self-referencing Application is defined in k8s/infra-services/argocd/overlays/infra-platform-cluster/apps/argocd.yaml:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: argocd
  labels:
    cluster: 'infra-platform-mgmt'
    environment: 'prod'
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  syncPolicy:
    automated:
      prune: false # Manual prune for safety
      selfHeal: false # Manual heal for stability
  destination:
    namespace: argocd
    server: https://kubernetes.default.svc
  project: infra-services
  source:
    path: k8s/infra-services/argocd/overlays/infra-platform-cluster
    repoURL: https://github.com/Titanbay/infra-services
    targetRevision: 'main'
Conservative Sync Policy
The ArgoCD application uses deliberately conservative sync settings:
| Setting | Value | Reason |
|---|---|---|
| automated.prune | false | Prevents accidental deletion of resources |
| automated.selfHeal | false | Allows manual intervention for drift |
This approach prioritises stability over automation for the core infrastructure.
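For contrast, the fragment below sketches what a fully automated policy would look like. It is an illustration only and is not the policy used by the argocd Application; the two fields are the same ones listed in the table above.

```yaml
# Illustration only: NOT the policy used by the argocd Application.
spec:
  syncPolicy:
    automated:
      prune: true     # would delete cluster resources that were removed from Git
      selfHeal: true  # would revert manual changes made directly in the cluster
```

With both flags left at false, a resource removed from Git stays in the cluster until someone prunes it, and a manual change in the cluster is reported as drift rather than reverted.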
App-of-Apps Pattern
The ArgoCD Application is also an app-of-apps - it manages not only ArgoCD itself but also all other Applications defined in the apps/ and application-sets/ directories.
What Gets Managed
When ArgoCD syncs itself, it also syncs:
- ArgoCD core components (from the base HA manifest)
- ArgoCD configuration (ConfigMaps, Secrets, Ingress)
- All other Applications defined in apps/
- All ApplicationSets in application-sets/
- AppProjects in projects/
This means a single sync of the argocd Application can bootstrap the entire infrastructure.
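A minimal sketch of how the overlay's kustomization.yaml could pull these pieces together. The directory names come from the list above; the relative path to the base, the namespace setting, and the presence of a kustomization.yaml inside each subdirectory are assumptions, so the real file may look different.

```yaml
# k8s/infra-services/argocd/overlays/infra-platform-cluster/kustomization.yaml
# Hypothetical sketch; the real file may list its resources differently.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: argocd        # assumed: matches the Application's destination namespace
resources:
  - ../../base           # ArgoCD core components (HA manifest)
  - apps                 # Application resources, including apps/argocd.yaml
  - application-sets     # ApplicationSet resources
  - projects             # AppProject resources
```

Because everything is rendered from one Kustomize build, the argocd Application sees ArgoCD itself and every child Application as a single set of manifests to sync.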
Updating ArgoCD
Upgrading the Version
To upgrade ArgoCD:
- Download the new HA manifest from the ArgoCD releases
- Add it to k8s/infra-services/argocd/base/ (e.g., argocd-ha-3.3.0.yaml)
- Update base/kustomization.yaml to reference the new file
- Commit and push to main
- ArgoCD will detect the change and show OutOfSync status
- Manually sync or wait for the next sync cycle
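The kustomization change in the steps above might look like the sketch below. Only the file name argocd-ha-3.3.0.yaml comes from the example in the list; the previous file name and the rest of the file's contents are assumptions.

```yaml
# k8s/infra-services/argocd/base/kustomization.yaml
# Hypothetical sketch: swap the previous manifest for the new one.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # - argocd-ha-3.2.1.yaml   # previous version (assumed name), removed
  - argocd-ha-3.3.0.yaml     # newly downloaded HA manifest
```

Since automated.prune is false, any resource that the new manifest no longer contains remains in the cluster until it is pruned manually.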
Configuration Changes
For configuration updates:
- Modify files in the overlay or base
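- Commit and push to main
- ArgoCD auto-syncs the changes (automated sync is enabled; only prune and selfHeal are turned off, so removed resources still need a manual prune)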
Patch Files
The overlay uses Kustomize patches for customisation:
| Patch | Purpose |
|---|---|
| argo-cd-cm.yaml | ConfigMap settings |
| argocd-cmd-params-cm.yaml | Command parameters |
| argocd-rbac-cm.yaml | RBAC policies |
| argocd-server-resources.yaml | Server resource limits |
| argocd-app-controller-resources.yaml | Controller resources |
| argocd-repo-server-resources.yaml | Repo server resources |
| dex-env-vars.yaml | Dex environment variables |
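As an illustration of what one of these patches could contain, here is a sketch of a strategic merge patch for the server resources, followed by the kustomization entry that would apply it. Only the file name comes from the table; the resource values and the patches entry are assumptions, not the repository's actual contents.

```yaml
# argocd-server-resources.yaml
# Hypothetical values; only the file name comes from the table above.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: argocd-server
spec:
  template:
    spec:
      containers:
        - name: argocd-server   # must match the container name in the base manifest
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              memory: 512Mi
---
# Fragment of the overlay kustomization.yaml that would reference the patch (assumed).
patches:
  - path: argocd-server-resources.yaml
```

Kustomize merges the patch into the Deployment of the same name from the base, so the upstream HA manifest itself stays unmodified.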
Safety Considerations
Cascade Delete Protection
The Application uses a finalizer:
finalizers:
  - resources-finalizer.argocd.argoproj.io
Warning: Deleting the argocd Application from the cluster will trigger a cascading delete of all managed resources, including ArgoCD itself and all other Applications.
Recovery Procedure
If ArgoCD becomes unavailable:
- The HA manifest can be reapplied manually:
  kubectl apply -n argocd -f argocd-ha-3.2.1.yaml
- The Application will resync from Git once ArgoCD is running
- All managed Applications will be restored
Notifications
The ArgoCD Application is configured with Slack notifications:
annotations:
  notifications.argoproj.io/subscribe.on-app-synced.slack: platform-infra-notifications
  notifications.argoproj.io/subscribe.on-app-outofsync.slack: platform-infra-notifications
  notifications.argoproj.io/subscribe.on-app-sync-failed.slack: platform-infra-notifications
  notifications.argoproj.io/subscribe.on-app-degraded.slack: platform-infra-notifications
This ensures the platform team is alerted to any issues with the core infrastructure.
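Each annotation follows the pattern subscribe.<trigger>.<service>: <recipient>, so the triggers named there must be defined in the notifications ConfigMap. The sketch below shows roughly what one such definition could look like. The trigger name is taken from the annotations; the template name, message text, and secret key are assumptions, and the real configuration may differ.

```yaml
# Hypothetical sketch of argocd-notifications-cm; real triggers and templates may differ.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-notifications-cm
  namespace: argocd
data:
  # Slack service; $slack-token refers to a key in argocd-notifications-secret (assumed key name).
  service.slack: |
    token: $slack-token
  # Trigger name matches the subscribe.on-app-sync-failed.slack annotation.
  trigger.on-app-sync-failed: |
    - when: app.status.operationState.phase in ['Error', 'Failed']
      send: [app-sync-failed]
  # Template referenced by the trigger above (name and text are assumptions).
  template.app-sync-failed: |
    message: |
      Application {{.app.metadata.name}} failed to sync.
```

The recipient in the annotations, platform-infra-notifications, is the Slack channel that receives the message.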
Managing TB Platform ArgoCD Instances
The hub ArgoCD also manages the ArgoCD installations on tb-platform clusters via Helm. These are defined in k8s/infra-services/argocd/tb-platform/:
tb-platform/
├── base/
│ └── argocd-helm.yaml # Helm chart Application
└── overlays/
├── tb-platform-dev/ # Dev domain patch
├── tb-platform-qa/ # QA domain patch
└── tb-platform-prod/ # Prod domain patch
The hub creates Applications that deploy the ArgoCD Helm chart to each tb-platform cluster, with environment-specific domain configurations:
| Environment | ArgoCD Domain |
|---|---|
| Dev | argocd-dev.nessie-chimera.ts.net |
| QA | argocd-qa.nessie-chimera.ts.net |
| Prod | argocd-prod.nessie-chimera.ts.net |
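A rough sketch of what the base Application in argocd-helm.yaml could look like, with the domain value that each overlay would patch. The Application name, project, destination cluster name, chart version placeholder, and the use of the chart's global.domain value are all assumptions; only the file name and the domains come from this page.

```yaml
# Hypothetical sketch of tb-platform/base/argocd-helm.yaml.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: argocd-tb-platform          # assumed name
  namespace: argocd
spec:
  project: infra-services           # assumed: same project as the hub Application
  source:
    repoURL: https://argoproj.github.io/argo-helm   # community Argo Helm repository
    chart: argo-cd
    targetRevision: '<chart-version>'               # placeholder: pin a real chart version
    helm:
      valuesObject:
        global:
          domain: argocd-dev.nessie-chimera.ts.net  # replaced per overlay (dev, qa, prod)
  destination:
    name: tb-platform-dev           # assumed cluster name as registered in the hub
    namespace: argocd
```

Under this layout, each overlay would only need to patch the domain and the destination, keeping the chart source and version in one place.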