Why This Job is Featured on The SaaS Jobs
This role stands out in the SaaS landscape because it sits at the infrastructure layer that increasingly differentiates AI-enabled products. Building distributed data systems that support large-scale multimodal training and evaluation reflects the operational reality of modern SaaS, where model iteration, data throughput, and platform reliability directly shape product velocity and customer outcomes.
For a long-term SaaS career, the work maps to durable platform competencies: designing pipelines that scale, hardening systems that become business-critical, and translating research-driven needs into production-grade services. Experience at this intersection tends to transfer across SaaS companies investing in AI features, internal data platforms, and MLOps, where reliability and cost-aware scalability become recurring themes as usage grows.
The position is best suited to engineers who prefer foundational work over surface-level feature delivery, and who take satisfaction in correctness, observability, and operational discipline. It also fits someone comfortable collaborating with research stakeholders while maintaining engineering standards, especially when requirements evolve and systems must remain dependable under changing workloads.
The section above is editorial commentary from The SaaS Jobs, provided to help SaaS professionals understand the role in a broader industry context.
Job Description
About the Team
The OpenAI Robotics team is focused on unlocking general-purpose robotics and pushing towards AGI-level intelligence in dynamic, real-world settings. Working across the entire model stack, we integrate cutting-edge hardware and software to explore a broad range of robotic form factors. We strive to seamlessly blend high-level AI capabilities with the constraints of physical systems to improve people’s lives.
About the Role
As a Research Engineer, Distributed Data Systems, you will design and scale the infrastructure that powers large-scale multimodal training and evaluation at OpenAI. You’ll manage distributed data pipelines, collaborate closely with researchers to translate requirements into robust systems, and harden pipelines that serve as the backbone for OpenAI’s rapid iteration cycles.
We’re looking for engineers who are detail-oriented, have strong experience with distributed systems, and excel at building reliable infrastructure in high-stakes environments.
This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.
In this role, you will:
Design, build, and maintain data infrastructure systems such as distributed compute, data orchestration, distributed storage, streaming infrastructure, and machine learning infrastructure, while ensuring scalability, reliability, and security.
Ensure our data platform can scale by orders of magnitude while remaining reliable and efficient.
Partner with researchers to deeply understand requirements and translate them into production-ready systems.
Harden, optimize, and maintain critical data infrastructure systems that power multimodal training and evaluation.
You might thrive in this role if you:
Have strong experience with distributed systems and large-scale infrastructure, along with a strong interest in data.
Are detail-oriented and bring rigor to building and maintaining reliable systems.
Demonstrate excellent software engineering fundamentals and organizational skills.
Are comfortable with ambiguity and rapid change.
About OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.
We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.
For additional information, please see OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.
Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.
To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.
We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.
OpenAI Global Applicant Privacy Policy
At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.