Selected work
Case studies
A few representative pieces of work: identity, reliability, observability, and cost, at globally distributed scale.
Security & Identity
Secrets and workload identity at platform scale
A globally distributed, 10,000-server platform needed machine identity without static secrets, and one trust model across CI/CD and every service.
My team owned and operated self-hosted HashiCorp Vault on Kubernetes. We built zero-trust authentication for CI/CD pipelines and workloads (Kubernetes auth, no static secrets), enforced RBAC and least privilege, and extended Vault-based patterns across developer tooling and platform access control.
Impact: A single, auditable identity layer for pipelines and workloads that replaced scattered static credentials and was carried forward through years of platform evolution.
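For readers unfamiliar with the pattern, here is a minimal sketch of Kubernetes auth from the workload's side: the pod presents its service account JWT to Vault's Kubernetes auth login endpoint and receives a short-lived token in return, so no static secret is ever stored. This is an illustration, not the production implementation; the Vault address, the role name `ci-runner`, and the default `kubernetes` mount path are assumptions.

```python
import json
import urllib.request
from pathlib import Path

# Standard location of the service account token inside a Kubernetes pod.
SA_TOKEN_PATH = Path("/var/run/secrets/kubernetes.io/serviceaccount/token")


def build_login_request(vault_addr: str, role: str, jwt: str,
                        mount: str = "kubernetes") -> urllib.request.Request:
    """Build the POST to Vault's Kubernetes auth login endpoint.

    `mount` is the auth mount path; "kubernetes" is assumed here.
    """
    body = json.dumps({"role": role, "jwt": jwt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{vault_addr.rstrip('/')}/v1/auth/{mount}/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def login(vault_addr: str, role: str) -> str:
    """Exchange the pod's service account JWT for a short-lived Vault token."""
    jwt = SA_TOKEN_PATH.read_text().strip()
    req = build_login_request(vault_addr, role, jwt)
    with urllib.request.urlopen(req, timeout=5) as resp:
        payload = json.load(resp)
    # Vault returns the client token under auth.client_token.
    return payload["auth"]["client_token"]


# Hypothetical usage from inside a pod (address and role are placeholders):
# token = login("https://vault.example.internal:8200", "ci-runner")
```

The role named in the request is what ties the workload to a least-privilege policy: Vault verifies the JWT with the cluster, then issues a token scoped only to the policies bound to that role.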
Reliability
Rebuilding incident response and reliability
SLA credit costs were high and incidents took too long to resolve across time zones.
Rebuilt incident-response workflows, expanded observability coverage, and redesigned on-call rotations to match engineer availability across time zones, hiring deliberately into under-covered windows.
Impact: Eliminated over $1M/year in SLA credit costs, cut MTTR from about 30 minutes to under 5, and reduced P1/P2 incident frequency by 20%, all while holding 99.995% availability.
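To put the availability and MTTR figures side by side, here is a back-of-the-envelope sketch of the downtime budget that 99.995% implies. It makes a simplifying assumption that is not from the work itself: every incident is treated as a full outage lasting exactly the MTTR, which overstates the impact of partial degradations.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability_pct: float,
                            period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Minutes of downtime allowed in the period at the given availability."""
    return period_minutes * (1 - availability_pct / 100)


def incidents_within_budget(availability_pct: float, mttr_minutes: float) -> int:
    """Whole full-outage incidents of length MTTR that fit in a yearly budget."""
    return int(downtime_budget_minutes(availability_pct) // mttr_minutes)


budget = downtime_budget_minutes(99.995)          # about 26.3 minutes per year
at_old_mttr = incidents_within_budget(99.995, 30)  # 30-minute MTTR
at_new_mttr = incidents_within_budget(99.995, 5)   # 5-minute MTTR
```

Under that assumption, the yearly budget is about 26.3 minutes: a single 30-minute outage exceeds it, while five 5-minute outages still fit. That is why faster resolution matters as much as fewer incidents at this availability target.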
Observability
Org-wide observability migration, zero downtime
SignalFx was the sole metrics platform and had to be replaced without losing data or visibility.
Led a year-long migration to Prometheus, Grafana, and OpenTelemetry across 50+ engineers. Ran parallel dual-write validation for data consistency, trained the whole org, and authored DR plans and degradation levels that leadership reviewed and accepted before the old stack was decommissioned.
Impact: $700k/year saved, full observability preserved, and 99.995% availability held throughout.
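The dual-write validation idea can be sketched as follows: while both stacks receive the same metrics, samples for a given series are compared timestamp by timestamp, and anything missing or outside a tolerance is flagged. This is a simplified illustration rather than the tooling used in the migration; the function name, the dict-based input shape, and the 1% tolerance are all assumptions.

```python
import math


def compare_series(old: dict, new: dict,
                   rel_tol: float = 0.01, abs_tol: float = 1e-9) -> dict:
    """Compare one metric series written to both backends.

    `old` and `new` map timestamp -> value. Returns the timestamps that are
    absent from the new backend and those whose values diverge beyond the
    tolerance.
    """
    missing = sorted(set(old) - set(new))
    mismatched = sorted(
        ts for ts in set(old) & set(new)
        if not math.isclose(old[ts], new[ts], rel_tol=rel_tol, abs_tol=abs_tol)
    )
    return {
        "missing": missing,
        "mismatched": mismatched,
        "consistent": not missing and not mismatched,
    }


# Illustrative data only: timestamps in seconds, arbitrary values.
old_samples = {0: 100.0, 60: 101.0, 120: 99.0}
new_samples = {0: 100.2, 60: 110.0}
report = compare_series(old_samples, new_samples)
```

Here the sample at 0 is within tolerance, the sample at 60 diverges, and the sample at 120 never reached the new backend. A tolerance is used instead of strict equality because two pipelines rarely aggregate or round identically.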
Cloud & FinOps
Cutting cloud spend by tens of millions
Compute spend was climbing on a large, globally distributed estate.
Led the EC2-to-Kubernetes/EKS migration, a Graviton rearchitecture, and vendor contract renegotiations, backed by cost attribution and governance to keep the savings durable.
Impact: $1.8M/year (EKS), $3.6M/year (Graviton), and $2M/year (vendor renegotiations), part of ~$9.6M+/year in confirmed savings.