Performance Modeling Lead

OpenAI • Full-time • Remote (San Francisco, California, United States) • $342k - $555k / year • 1m ago

Why This Job is Featured on The SaaS Jobs

This Performance Modeling Lead role stands out in the SaaS ecosystem because it targets the infrastructure layer that increasingly underpins AI-enabled software delivery. The remit is not application features but quantifying system tradeoffs across compute, memory, networking, and storage, then turning that analysis into architectural direction. The scope runs from workload behavior to vendor-facing reference designs, a useful vantage point on how modern SaaS platforms achieve reliability and efficiency at scale.

For a long-term SaaS career, the value lies in building a repeatable decision framework for performance- and cost-sensitive infrastructure choices, a capability that transfers across high-scale SaaS, AI product companies, and platform organizations. Owning a modeling toolchain and validating it against real measurements develops the operationally grounded rigor that informs capacity planning, roadmap prioritization, and cross-team alignment. Leading a small team also adds a management dimension without losing technical depth.

This role is best suited to professionals who prefer ambiguous, forward-looking problems and can translate complex quantitative work into clear guidance for diverse stakeholders. It aligns with candidates who enjoy working across abstraction layers and collaborating with external partners while maintaining strong analytical ownership.

The section above is editorial commentary from The SaaS Jobs, provided to help SaaS professionals understand the role in a broader industry context.

Job Description

About the Team

OpenAI’s Hardware organization develops system and infrastructure solutions designed for the unique demands of advanced AI workloads. We work closely with research, software, and external hardware partners to shape the next generation of AI systems, from silicon through full-scale deployments.

Our team focuses on understanding and optimizing performance across the full system stack—ensuring that architectural decisions are grounded in rigorous, quantitative analysis of real-world workloads.

About the Role

We are seeking a Performance Modeling Lead to build and lead a small, high-impact team responsible for answering forward-looking architectural questions across AI infrastructure systems.

You will develop modeling frameworks and methodologies to evaluate system-level tradeoffs and guide key design decisions. Your work will directly influence reference architectures, vendor designs, and long-term infrastructure strategy.

This role sits at the intersection of AI workloads, system architecture, and quantitative modeling, and requires strong technical judgment, ownership, and the ability to translate complex analysis into clear, actionable guidance.

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance.

Key Responsibilities

Build and own a performance modeling framework/toolchain to evaluate AI systems across multiple levels of abstraction.
Analyze and quantify architectural tradeoffs across compute, memory, networking, storage, and system topology.
Develop performance models to guide decisions on:
- scale-up vs. scale-out architectures
- interconnect and network design
- memory hierarchy and system balance
Translate modeling outputs into clear recommendations for internal teams and external hardware vendors.
Influence reference designs and vendor roadmaps through data-driven insights.
Partner closely with machine learning, systems, and hardware teams to understand workload characteristics and requirements.
Lead and grow a small team (2–3 engineers), setting technical direction and maintaining high standards for modeling rigor.
Continuously improve modeling fidelity by validating against real system behavior and measurements.

Qualifications

You may be a strong fit if you:
Have experience owning or building performance modeling frameworks used to drive real system design decisions.
Have deep knowledge of AI/ML workloads, including training and/or inference at scale.
Understand system-level tradeoffs across compute, memory, and networking in large-scale distributed systems.
Are comfortable working across abstraction layers—from workload behavior to hardware implementation.
Have experience using modeling (analytical or simulation) to inform architectural decisions.
Can operate in ambiguous problem spaces and turn open-ended questions into structured analysis.
Communicate clearly and influence both internal teams and external partners.

Preferred Skills

Experience working with hardware vendors (ODM/JDM, silicon, networking).
Background in data center infrastructure or hyperscale systems.
Familiarity with accelerators (GPUs/ASICs) and interconnects (e.g., NVLink, InfiniBand, Ethernet).
Experience influencing hardware roadmaps or reference architectures.
Prior experience leading or mentoring engineers.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

For additional information, please see OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.

Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.