Why This Job is Featured on The SaaS Jobs
This Senior Machine Learning Engineer role sits at the intersection of SaaS observability and applied AI, where product value is created by turning high-volume operational telemetry into decisions. Work centered on heterogeneous log streams and real-time understanding reflects a mature SaaS problem space: reliability and security outcomes depend on scalable data foundations and trustworthy automation.
For a long-term SaaS career, the emphasis on agentic systems, evaluation, and deployment reliability builds experience that transfers across modern cloud products. The remit touches the full loop that many SaaS companies struggle to industrialize: designing LLM-driven components, integrating them with data and infrastructure layers, and then instrumenting them so they can be monitored and improved in production. Exposure to context engineering, memory management, and observability of AI behavior aligns with emerging patterns in LLM Ops and AI platform work.
The role is best suited to an engineer who prefers ownership across ambiguous problem boundaries and can collaborate tightly with product and infrastructure counterparts. It will fit someone who enjoys pairing research-informed methods with pragmatic engineering, and who is motivated by measurable system performance, testing discipline, and operational feedback rather than model experimentation in isolation.
The section above is editorial commentary from The SaaS Jobs, provided to help SaaS professionals understand the role in a broader industry context.
Job Description
Senior Machine Learning Engineer
Location: Bangalore or Noida
The proliferation of machine log data has the potential to give organizations unprecedented real-time visibility into their infrastructure and operations. With this opportunity come tremendous technical challenges around ingesting, managing, and understanding high-volume streams of heterogeneous data.
As a Senior Machine Learning Engineer, you’ll build the intelligence behind the next generation of agentic AI systems that reason over massive, heterogeneous log data. You’ll combine machine learning, prompt engineering, and rigorous evaluation to create autonomous AI agents that help organizations understand and act on their data in real time.
You’ll be part of a small, high-impact team shaping how AI agents understand complex machine data. This is an opportunity to work on cutting-edge LLM infrastructure and contribute to defining best practices in context engineering and AI observability.
Responsibilities
- Design, implement, and optimize agentic AI components, including context engineering, memory management, and prompts.
- Collaborate cross-functionally with product, data, and infrastructure teams to deliver end-to-end AI-powered insights.
- Operate autonomously in a fast-paced, ambiguous environment: defining scope, setting milestones, and driving outcomes.
- Ensure reliability, performance, and observability of deployed agents through rigorous testing and continuous improvement.
- Maintain a strong bias for action—delivering incremental, well-tested improvements that directly enhance customer experience.
Required Qualifications
- B.Tech, M.Tech, or Ph.D. in Computer Science, Data Science, or a related field.
- 4-6 years of hands-on industry experience with demonstrable ownership and delivery.
- Strong understanding of machine learning fundamentals, data pipelines, and model evaluation.
- Proficiency in Python and ML/data libraries (NumPy, pandas, scikit-learn); familiarity with JVM languages is a plus.
- Working knowledge of LLM core concepts, prompt design, and agentic design patterns.
- Strong communication skills and a passion for shaping emerging AI paradigms.
Desired Qualifications
- Prior experience building and deploying AI agents or LLM applications in production.
- Familiarity with modern agentic AI frameworks (e.g., LangGraph, LangChain, CrewAI).
- Experience with ML infrastructure and tooling (PyTorch, MLflow, Airflow, Docker, AWS).
- Exposure to LLM Ops: infrastructure optimization, observability, latency, and cost monitoring.
About Us
Sumo Logic, Inc. helps make the digital world secure, fast, and reliable by unifying critical security and operational data through its Intelligent Operations Platform. Built to address the increasing complexity of modern cybersecurity and cloud operations challenges, we empower digital teams to move from reaction to readiness—combining agentic AI-powered SIEM and log analytics into a single platform to detect, investigate, and resolve modern challenges. Customers around the world rely on Sumo Logic for trusted insights to protect against security threats, ensure reliability, and gain powerful insights into their digital environments. For more information, visit www.sumologic.com.
Sumo Logic Privacy Policy. Employees will be responsible for complying with applicable federal privacy laws and regulations, as well as organizational policies related to data protection.