Why This Job is Featured on The SaaS Jobs
This Site Reliability Engineer role sits at the infrastructure layer that underpins modern SaaS: keeping an API-driven, cloud-hosted platform dependable while it serves customers across multiple regulated industries. In a privacy-vault product, availability and correctness are tightly coupled with trust, which makes operational rigor and automation especially meaningful in the day-to-day work.
From a SaaS career perspective, the position offers exposure to the playbook most subscription platforms rely on as they mature: codified infrastructure (IaC), repeatable delivery via CI/CD, container orchestration, and observability that turns incidents into measurable reliability improvements. Experience balancing scalability, security, and performance in production becomes highly transferable across B2B SaaS—particularly for engineers who want to grow into platform engineering, reliability leadership, or customer-facing technical ownership.
The listing signals a fit for an engineer who enjoys building systems, not just running them: writing Go or Python, standardising operational patterns, and collaborating across distributed teams. The PST-aligned coverage also suits someone comfortable owning production outcomes across time zones while helping shape SRE practices rather than following an established template.
The section above is editorial commentary from The SaaS Jobs, provided to help SaaS professionals understand the role in a broader industry context.
Job Description
Employment Type
Full time
Department
Engineering, Site Reliability
Skyflow is a data privacy vault company built to radically simplify how companies isolate, protect, and govern their customers’ most sensitive data. With its global network of data privacy vaults, Skyflow is also a comprehensive solution for companies around the world looking to meet complex data localization requirements. Skyflow currently supports a diverse customer base that spans verticals like fintech, retail, travel, and healthtech.
Skyflow is headquartered in Palo Alto, California and was founded in 2019. For more information, visit www.skyflow.com or follow on X and LinkedIn.
About the role:
As a Site Reliability Engineer, you will be responsible for driving the effort to identify, design, and develop the best technical and field solutions to automate our production systems. You will collaborate frequently with internal and external business and engineering teams. You will also have an opportunity to lead efforts to champion and instill a culture of Site Reliability Engineering at Skyflow.
We know great Site Reliability Engineers come from diverse technical backgrounds, so no single individual may have all the desired skills on day one. But if you are the kind of software engineer who would have loved to engineer solutions for the Stripe or Twilio APIs, the Slack or Zendesk app, or the Snowflake or MongoDB platform, we want to talk to you.
You have:
2+ years in infrastructure automation and SRE-driven software delivery.
Strong experience with at least one major cloud platform (AWS preferred; Azure or GCP acceptable).
Programming experience in Go (preferred) or Python.
Hands-on experience with:
Infrastructure as Code: Terraform, CloudFormation
CI/CD tools: Jenkins
Configuration management: Ansible
Containers & orchestration: Docker, Kubernetes
Linux systems engineering and scripting for automation
RDBMS and production-grade infrastructure
Deep understanding of site reliability practices, observability patterns, and operational excellence.
Experience working with large-scale, distributed infrastructures.
Ability to collaborate with distributed global teams, providing technical guidance and leadership.
You will:
Provide operational support aligned with US time zones, ensuring system reliability and availability.
Design, build, and maintain highly available and scalable cloud infrastructure using AWS and modern SRE practices.
Develop and maintain CI/CD pipelines for automated testing, building, and deployment of applications.
Automate infrastructure provisioning, configuration, and deployment using Terraform, CloudFormation, Helm, and Ansible.
Work extensively with Docker and Kubernetes for container orchestration and service management.
Implement and maintain observability solutions including monitoring, logging, alerting, and tracing.
Evaluate and improve reliability, performance, scalability, and security of production systems.
Support migration and modernization initiatives, choosing the right approach based on prior experience.
Collaborate with cross-functional teams and clients to deliver robust, cloud-based solutions and best-in-class customer experiences.
Act as a thought leader within the SRE team, contributing to processes, standards, and the overall SRE culture.
Benefits:
At Skyflow, we believe that diverse teams are the strongest teams. We invite applicants of all genders, races, ethnicities, nationalities, ages, religions, sexual orientations, disability statuses, educational experiences, family situations, and socio-economic backgrounds.