Senior Platform Engineer (Cloud Platform)

Amplitude • San Francisco, CA

Why This Job is Featured on The SaaS Jobs

This Senior Platform Engineer role sits at a core leverage point in a SaaS organization: the internal platform that determines how quickly and safely product engineering can ship. The remit spans Kubernetes, AWS, GitOps, and CI/CD, with an explicit angle on adapting platform foundations for AI-assisted development where automated agents participate in deploys and infrastructure changes.

For a long-term SaaS career, this kind of platform scope builds durable instincts around multi-tenant reliability, secure-by-default infrastructure, and developer experience as a product. Ownership of Infrastructure-as-Code, observability, and SLO-driven operations maps directly to how modern SaaS companies manage uptime and change velocity at scale. The emphasis on self-service automation also translates well across SaaS environments that are standardizing internal tooling to reduce cognitive load and operational overhead.

The role is best suited to an engineer who prefers end-to-end systems work, from design through incident learnings and iterative hardening. It will fit someone comfortable partnering with product teams and Staff-level peers, and motivated by creating guardrails that enable others rather than building application features. A pragmatic approach to measuring impact and reducing toil signals strong alignment with platform engineering in SaaS.

The section above is editorial commentary from The SaaS Jobs, provided to help SaaS professionals understand the role in a broader industry context.

Job Description

About the Role

Amplitude's Cloud Platform team builds the systems that every Amplitude engineer relies on every day to ship code — and we're rebuilding them for the AI era. As a Senior Platform Engineer, you'll own medium-to-high-complexity platform projects end-to-end and help shape a platform where AI agents are first-class users alongside humans: kicking off deploys, opening pull requests against infrastructure, and triaging incidents, so a single engineer can get the throughput of a team.

You'll partner with Staff engineers and product teams to make Kubernetes effortless across the engineering org, building self-service automation and scalable AWS infrastructure that let product teams ship faster, more safely, and with less cognitive load. If you're excited about building the systems that other engineers will rely on every day, this role is for you.

Key Responsibilities

Lead high-impact platform projects. Design and ship capabilities that move the needle on developer experience, reliability, or security, and set the bar for quality, testing, and safe deployment practices.
Build the AI-augmented platform. Design tooling and workflows that help engineers get more out of AI-assisted development — think infra primitives that are easy to reason about, automated review, and policy-as-code that keeps the guardrails strong as AI shifts how code gets written.
Own Infrastructure-as-Code for Kubernetes, AWS, and GCP using Terraform, Helm, Kustomize, and emerging tooling, and make it consumable enough that an LLM can safely open pull requests against it.
Evolve our CI/CD backbone (Argo CD, Argo Workflows, Argo Rollouts, and GitHub Actions) to make deploys faster, safer, and easier to reason about.
Instrument and operate. Drive observability with Datadog and Amplitude, own dashboards and SLOs, and use the data to push reliability forward.
Participate in on-call, lead incident response when needed, and turn postmortems into durable platform improvements.
Reduce toil and tech debt with pragmatic remediation plans that account for future requirements and operational cost.
Translate product needs into platform features by partnering closely with product engineering teams.
Mentor and multiply. Coach junior and mid-level engineers, support their growth, and onboard new teammates — including helping the team get more leverage out of AI-assisted development.
Shape the roadmap by spotting high-leverage opportunities and scoping them to maximize impact.

What We're Looking For

5+ years of experience in software engineering, DevOps, or Site Reliability Engineering, with serious hands-on time in cloud infrastructure.
Education: B.S. in Computer Science or an equivalent technical field.
Production experience operating Kubernetes (EKS, GKE, AKS, or on-prem) and containerized applications at meaningful scale.
Proficiency in at least one programming language (Go or Python preferred) and IaC tooling (Terraform).
Working knowledge of AWS core services (EC2, EKS, IAM, VPC, ALB, S3) and networking/security fundamentals.
Familiarity with GitOps workflows and the CNCF ecosystem (Argo, Helm, Backstage, Envoy, and friends).
A track record of delivering projects that measurably improved reliability, performance, or developer productivity.
Curiosity and conviction about AI as a force multiplier in infrastructure work — whether that's using AI-assisted development tools to ship faster or building platforms and tooling that help your teammates get more leverage from AI in their day-to-day work.
Strong communication skills: you can break down complex topics for varied audiences and default to collaborative problem solving.
Comfort navigating ambiguity by validating assumptions early and using data to de-risk technical decisions.
A pragmatic, business-aligned engineering mindset and a habit of continuous learning.