We are seeking a Senior Site Reliability Engineer 4 to join our Release Engineering team. As a senior member of the Release Engineering team, you will drive our release and deployment developer experience, observability, and platform engineering initiatives to enable teams across PagerDuty engineering. In this role you will lead technical decisions and mentor team members while building robust, scalable infrastructure solutions that enhance our developer experience and platform reliability.
Key Responsibilities
- Lead the design and implementation of complex platform engineering solutions
- Drive architectural decisions for our CI/CD infrastructure and Kubernetes platform
- Mentor junior team members and provide technical leadership in platform engineering practices
- Develop and implement strategic initiatives to improve developer experience and platform reliability
- Design and implement scalable solutions for infrastructure automation using Terraform and other IaC tools
- Lead post-incident reviews and drive systematic improvements to prevent recurring issues
- Collaborate with other engineering teams globally to define and implement platform standards
- Champion observability and monitoring best practices across the organization
- Participate in a 24/7 on-call rotation (and yes, we use PagerDuty to manage our on-call schedules)
Basic Qualifications
- 8+ years of experience in Site Reliability Engineering, DevOps, or Platform Engineering roles
- Deep expertise in Kubernetes administration and architecture
- Strong track record of leading CI/CD and platform engineering initiatives
- Demonstrated experience leading technical projects and mentoring engineers
- Advanced experience working on cloud-native infrastructure (e.g. AWS, GCP, Azure)
- Experience with monitoring, observability, and logging platforms (e.g. Datadog, New Relic, Sumo Logic, Splunk, Grafana)
- Advanced experience with Infrastructure as Code (e.g. Terraform, CloudFormation)
- Proficiency in at least one programming language (e.g. Python, Ruby, Go)
Preferred Qualifications
- Experience with GitOps practices and tools like Argo CD
- Experience building and maintaining platform engineering solutions at scale
- Experience implementing and managing observability solutions
- Experience with cost optimization and capacity planning
- Knowledge of emerging trends in platform engineering and DevOps practices
- Strong technical writing skills for documentation and knowledge sharing
- Experience with developer portals and internal platform products
PagerDuty is a flexible, hybrid workplace. We embrace and encourage in-person working as an integral part of our culture. Both our employees and external research tell us that co-located collaboration strengthens connections, drives innovation, and accelerates learning.
This role is expected to come into our Toronto office 1 day per month, so you can thrive in your new role and fully embrace being a Dutonian!
The base salary range for this position is 137,000 - 207,000 CAD. This role may also be eligible for bonus, commission, equity, and/or benefits.
Our base salary ranges are determined by role, level, and location. The range, which is subject to change based on primary work location, reflects the minimum and maximum base salary we expect to pay newly hired employees for the position. Within the range, we determine pay for an individual based on a number of factors including market location, job-related knowledge, skills/competencies and experience.
During the hiring process, your recruiter can share more about the specific offerings for this role, as well as the salary range for your primary work location.