Senior Director, SRE & Cloud Infrastructure

Cyberhaven • Full-time • USA • $250k - $300k / year

Why This Job is Featured on The SaaS Jobs

This Senior Director remit sits at the intersection of two defining SaaS concerns: reliability at scale and cloud economics. For a data security platform, availability, latency, and resilience are not abstract engineering goals but core product expectations, and the role’s scope spans the production runtime as well as the internal platforms that enable delivery. The emphasis on Kubernetes, CI/CD, and automation signals a mature cloud-native operating model rather than a lift-and-shift infrastructure function.

The career upside in SaaS terms is breadth across the full operating system of a subscription business. Ownership of SLOs, incident practices, and error budgets builds durable SRE leadership skills, while direct responsibility for capacity planning and COGS introduces the commercial discipline increasingly expected of infrastructure executives. Close partnership with Product, Security, and Finance also develops the cross-functional fluency needed to translate technical tradeoffs into customer and business outcomes.

This role fits a leader comfortable moving between strategy and hands-on technical judgment, including post-incident learning and architectural decision-making. It will suit someone who enjoys building repeatable operating mechanisms, mentoring managers and senior ICs, and using metrics to guide prioritisation. Because the teams are globally distributed, the role also calls for clear communication and scalable management practices.

The section above is editorial commentary from The SaaS Jobs, provided to help SaaS professionals understand the role in a broader industry context.

Job Description

About the Role

As the Senior Director of SRE & Cloud Infrastructure, you will lead the teams responsible for the reliability, scalability, and cost-efficiency of our data security platform. You will own the infrastructure and operational foundations that power our engineering organization and customer-facing products, operating at massive scale with rigorous performance, reliability, and fault-tolerance requirements.

You’ll set strategy, grow and mentor teams, and still dive deep into architecture, incidents, and hard technical decisions. You’ll partner closely with Engineering, Product, Security, and Finance leadership to scale our infrastructure sustainably, manage COGS, and continuously improve developer and operational experience.

You’ll play a key role in shaping our engineering culture, operational rigor, and AI-driven approach to reliability and efficiency as we scale. This role will report to the SVP of Engineering.

What You'll Do

Lead, grow, and mentor high-performing globally distributed SRE and Infrastructure teams, including managers and senior ICs
Own the reliability, availability, scalability, and performance of our production and developer platforms
Define and execute the SRE and infrastructure strategy, including cloud architecture, Kubernetes platforms, CI/CD, and automation
Drive horizontal scaling and enable teams to operate independently, through decoupling and modularization of both architecture and processes
Drive infrastructure cost (COGS) optimization, capacity planning, and cloud financial management in close partnership with Finance and Engineering leadership
Establish and evolve SLOs, SLIs, error budgets, and operational best practices across the organization
Oversee incident management, postmortems, and continuous improvement, ensuring a strong culture of learning and ownership
Collaborate closely with Security to ensure our infrastructure is secure, compliant, and resilient by design
Contribute to and uphold strong documentation, operational standards, and knowledge sharing across teams

Who You Are

You’ve led SRE and Infrastructure organizations at high-growth SaaS, platform, or security companies
You are a strong technical leader with deep experience in cloud-native systems and a strong SRE mindset
You have a strong background in Kubernetes, cloud platforms (GCP and/or AWS), and infrastructure as code (Terraform or equivalent)
You’ve designed or operated large-scale distributed systems, real-time data pipelines, or high-throughput platforms
You have experience owning COGS, cloud spend, and efficiency metrics, and can clearly communicate tradeoffs to executives
You’re comfortable operating at multiple levels: strategic planning, architectural reviews, and deep technical problem solving
You use data and metrics to drive reliability, performance, cost optimization, and team productivity
You have a proven track record of scaling teams and systems while maintaining high reliability and velocity
You’re an empathetic leader who fosters inclusion, ownership, accountability, and psychological safety
You thrive in fast-moving environments and are comfortable navigating ambiguity and change

Joining Cyberhaven is a chance to revolutionize data security. Traditional tools fall short, but we’ve reimagined protection with AI-enabled data lineage that analyzes billions of workflows to understand data, detect risk, and stop threats. Backed by $250M from leading investors like Khosla and Redpoint, our team includes leaders who built industry-defining technologies at CrowdStrike, Palo Alto Networks, Meta, Google, and more. This role lets you shape the future of data security, alongside experts driven to help customers protect their most valuable information.

Cyberhaven is committed to creating a diverse environment and is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status.