About the Role
Amplitude's Cloud Platform team builds the systems that every Amplitude engineer relies on every day to ship code — and we're rebuilding them for the AI era. As a Senior Platform Engineer, you'll own medium-to-high-complexity platform projects end-to-end and help shape a platform where AI agents are first-class users alongside humans: kicking off deploys, opening pull requests against infrastructure, and triaging incidents, so a single engineer can get the throughput of a team.
You'll partner with Staff engineers and product teams to make Kubernetes effortless across the engineering org, building self-service automation and scalable AWS infrastructure that let product teams ship faster, more safely, and with less cognitive load. If you're excited about building the systems that other engineers will rely on every day, this role is for you.
Key Responsibilities
- Lead high-impact platform projects — design and ship capabilities that move the needle on developer experience, reliability, or security, and set the bar for quality, testing, and safe deployment practices.
- Build the AI-augmented platform. Design tooling and workflows that help engineers get more out of AI-assisted development — think infra primitives that are easy to reason about, automated review, and policy-as-code that keeps the guardrails strong as AI shifts how code gets written.
- Own Infrastructure-as-Code for Kubernetes, AWS, and GCP using Terraform, Helm, Kustomize, and emerging tooling, and make it consumable enough that an LLM can safely open pull requests against it.
- Evolve our CI/CD backbone (Argo CD / Workflows / Rollouts, GitHub Actions) to make deploys faster, safer, and easier to reason about.
- Instrument and operate. Drive observability with Datadog and Amplitude, own dashboards and SLOs, and use the data to push reliability forward.
- Participate in on-call, lead incident response when needed, and turn postmortems into durable platform improvements.
- Reduce toil and tech debt with pragmatic remediation plans that account for future requirements and operational cost.
- Translate product needs into platform features by partnering closely with product engineering teams.
- Mentor and multiply. Coach junior and mid-level engineers, support their growth, and onboard new teammates — including helping the team get more leverage out of AI-assisted development.
- Shape the roadmap by spotting high-leverage opportunities and scoping them to maximize impact.
What We're Looking For
- 5+ years of experience in software engineering, DevOps, or Site Reliability Engineering, with serious hands-on time in cloud infrastructure.
- Education: B.S. in Computer Science or an equivalent technical field.
- Production experience operating Kubernetes (EKS, GKE, AKS, or on-prem) and containerized applications at meaningful scale.
- Proficiency in at least one programming language (Golang or Python preferred) and IaC tooling (Terraform).
- Working knowledge of AWS core services (EC2, EKS, IAM, VPC, ALB, S3) and networking/security fundamentals.
- Familiarity with GitOps workflows and the CNCF ecosystem (Argo, Helm, Backstage, Envoy, and friends).
- A track record of delivering projects that measurably improved reliability, performance, or developer productivity.
- Curiosity and conviction about AI as a force multiplier in infrastructure work — whether that's using AI-assisted development tools to ship faster or building platforms and tooling that help your teammates get more leverage from AI in their day-to-day work.
- Strong communication skills: you can break down complex topics for varied audiences and you default to collaborative problem solving.
- Comfort navigating ambiguity by validating assumptions early and using data to de-risk technical decisions.
- A pragmatic, business-aligned engineering mindset and a habit of continuous learning.