We are seeking a Senior Site Reliability Engineer 3 to join our Release Engineering team in our Lisbon office. As a senior member of our growing tech hub in Portugal, you will drive platform engineering initiatives, lead technical decisions, and mentor team members while building robust, scalable infrastructure solutions that enhance our developer experience and platform reliability.
Key Responsibilities
- Lead the design and implementation of complex platform engineering solutions
- Drive architectural decisions for our CI/CD infrastructure and Kubernetes platform
- Mentor junior team members and provide technical leadership in platform engineering practices
- Develop and implement strategic initiatives to improve developer experience and platform reliability
- Design and implement scalable solutions for infrastructure automation using Terraform and other IaC tools
- Lead post-incident reviews and drive systematic improvements to prevent recurring issues
- Collaborate with other engineering teams globally to define and implement platform standards
- Champion observability and monitoring best practices across the organization
- Participate in a 24/7 on-call rotation (and yes, we use PagerDuty to manage our on-call schedules)
Basic Qualifications
- 5+ years of experience in Site Reliability Engineering, DevOps, or Platform Engineering roles
- Deep expertise in Kubernetes administration and architecture
- Strong track record of leading CI/CD and platform engineering initiatives
- Demonstrated experience leading technical projects and mentoring engineers
- Advanced experience working on cloud-native infrastructure (e.g. AWS, GCP, Azure)
- Experience with monitoring, observability, and logging platforms (e.g. Datadog, New Relic, Sumo Logic, Splunk)
- Advanced experience with Infrastructure as Code (e.g. Terraform, CloudFormation)
- Proficiency in at least one programming language (e.g. Python, Ruby, Go)
Preferred Qualifications
- Experience with GitOps practices and tools like ArgoCD
- Experience building and maintaining platform engineering solutions at scale
- Experience implementing and managing observability solutions
- Experience with cost optimization and capacity planning
- Knowledge of emerging trends in platform engineering and DevOps practices
- Strong technical writing skills for documentation and knowledge sharing
- Experience with developer portals and internal platform products
PagerDuty is a flexible, hybrid workplace. We embrace and encourage in-person working as an integral part of our culture. Both our employees and external research tell us that co-located collaboration strengthens connections, drives innovation, and accelerates learning.
In this role, you are expected to come into our Lisbon office 1 day per month, so you can thrive in your new role and fully embrace being a Dutonian!