Why This Job is Featured on The SaaS Jobs
This role sits at a pivotal intersection in modern SaaS: delivering generative AI capabilities inside a data platform where efficiency, latency, and reliability directly shape product viability. Work on LLM training and inference systems, GPU kernels, and agentic frameworks reflects how leading SaaS companies are moving from “AI features” to AI as core infrastructure, with cost and performance becoming first-order product constraints.
From a SaaS career perspective, the emphasis on profiling, benchmarking, and resource utilization builds durable systems expertise that transfers across cloud-native organizations. Optimizing model serving and training pipelines develops a practical understanding of how research ideas become production-grade capabilities, including the tradeoffs between throughput, responsiveness, and operational cost. The expectation to publish and open-source work also builds technical communication experience, which can compound influence across the broader SaaS ecosystem.
This position is best suited to engineers who prefer deep, measurable work on performance bottlenecks and enjoy collaborating across research and engineering. It fits professionals who want their impact to be visible in platform-level metrics and who are motivated by rigorous experimentation rather than feature delivery alone. Remote flexibility also supports candidates who operate effectively with written design and results-driven iteration.
The section above is editorial commentary from The SaaS Jobs, provided to help SaaS professionals understand the role in a broader industry context.
Job Description
At Snowflake, we are powering the era of the agentic enterprise. To usher in this new era, we seek AI-native thinkers across every function who are energized by the opportunity to reinvent how they work. You don’t just use tools; you possess an innate curiosity, treating AI as a high-trust collaborator that is core to how you solve problems and accelerate your impact. We look for low-ego individuals who thrive in dynamic and fast-moving environments and move with an experimental mindset — who rapidly test emerging capabilities to discover simpler, more powerful ways to deliver results. At Snowflake, your role isn't just to execute a function, but to help redefine the future of how work gets done.
We are looking for talented System Developers and Researchers to join the Snowflake AI Research team and contribute to LLM inference and training system development, optimizations, and agentic systems. Our mission is to build the most efficient and scalable generative AI systems.
Recent releases from our team include SwiftKV, an advanced inference optimization, and Arctic LLM, one of the largest open-source MoE foundation models. This is an exciting opportunity to collaborate with a world-class team, including founding members of DeepSpeed, vLLM, and TensorFlow. Together, we will push the boundaries of deep learning systems and drive cutting-edge innovations in AI.
Responsibilities:
Analyze and optimize GPU kernel performance for training and inference of LLMs.
Develop and implement strategies to enhance the efficiency and scalability of deep learning systems.
Profile and benchmark deep learning systems to identify performance bottlenecks.
Design and implement optimizations to reduce latency and improve resource utilization for training and inference.
Stay updated with the latest advancements in GPU kernel optimization, deep learning, and LLM system development.
Contribute to the development of agentic frameworks and applications for LLM-driven workflows, enhancing automation, reasoning, and decision-making capabilities.
Open-source and publish innovations, optimizations, and engineering practices in technical blogs, top-tier conferences and journals.
Requirements:
Bachelor’s degree in Computer Science, Electrical Engineering, or a related field. A Master’s degree or PhD is preferred.
5 years of experience in GPU kernel optimization, deep learning system optimization, or high-performance computing (HPC).
Proficiency in deep learning frameworks such as PyTorch, TensorFlow, or JAX.
Strong understanding of GPU architectures and experience with CUDA or similar frameworks.
Experience with frameworks such as CUTLASS, Triton, or cuDNN.
Experience with profiling tools (e.g., nvprof, Nsight) and performance analysis methodologies.
Solid problem-solving skills and ability to debug complex performance issues.
Excellent communication skills and ability to work effectively in a cross-functional team environment.
Join us in optimizing deep learning systems and pushing the boundaries of AI efficiency. Apply now to be part of our dynamic and pioneering team!
Every Snowflake employee is expected to follow the company’s confidentiality and security standards for handling sensitive data. Snowflake employees must abide by the company’s data security plan as an essential part of their duties. It is every employee's duty to keep customer information secure and confidential.
Snowflake is growing fast, and we’re scaling our team to help enable and accelerate our growth. We are looking for people who share our values, challenge ordinary thinking, and push the pace of innovation while building a future for themselves and Snowflake.
How do you want to make your impact?
For jobs located in the United States, please visit the job posting on the Snowflake Careers Site for salary and benefits information: careers.snowflake.com