Machine Learning Engineer 
The proliferation of machine log data has the potential to give organizations unprecedented real-time visibility into their infrastructure and operations. With this opportunity come tremendous technical challenges around ingesting, managing, and understanding high-volume streams of heterogeneous data.
As a Machine Learning Engineer, you’ll build the intelligence behind the next generation of agentic AI systems that reason over massive, heterogeneous log data. You’ll combine machine learning, prompt engineering, and rigorous evaluation to create autonomous AI agents that help organizations understand and act on their data in real time.
You’ll be part of a small, high-impact team shaping how AI agents understand complex machine data. This is an opportunity to work on cutting-edge LLM infrastructure and contribute to defining best practices in context engineering and AI observability.
Responsibilities
- Design, implement, and optimize agentic AI components including context engineering, memory management, and prompts.
- Develop and maintain golden datasets by defining sourcing strategies, working with data vendors, and ensuring quality and representativeness at scale.
- Prototype and evaluate novel prompting strategies and reasoning chains to improve model reliability and interpretability.
- Collaborate cross-functionally with product, data, and infrastructure teams to deliver end-to-end AI-powered insights.
- Operate autonomously in a fast-paced, ambiguous environment: defining scope, setting milestones, and driving outcomes.
- Ensure reliability, performance, and observability of deployed agents through rigorous testing and continuous improvement.
- Maintain a strong bias for action, delivering incremental, well-tested improvements that directly enhance customer experience.
Required Qualifications
- B.Tech, M.Tech, or Ph.D. in Computer Science, Data Science, or a related field.
- 2-4 years of hands-on industry experience with demonstrable ownership and delivery.
- Strong understanding of machine learning fundamentals, data pipelines, and model evaluation.
- Proficiency in Python and ML/data libraries (NumPy, pandas, scikit-learn); familiarity with JVM languages is a plus.
- Working knowledge of LLM core concepts, prompt design, and agentic design patterns.
- Strong communication skills and a passion for shaping emerging AI paradigms.
Desired Qualifications
- Prior experience building and deploying AI agents or LLM applications in production.
- Familiarity with modern agentic AI frameworks (e.g., LangGraph, LangChain, CrewAI).
- Experience with ML infrastructure and tooling (PyTorch, MLflow, Airflow, Docker, AWS).
- Exposure to LLMOps: infrastructure optimization, observability, and latency and cost monitoring.
About Us
Sumo Logic, Inc. empowers the people who power modern, digital business. Sumo Logic enables customers to deliver reliable and secure cloud-native applications through its Sumo Logic SaaS Analytics Log Platform, which helps practitioners and developers ensure application reliability, secure and protect against modern security threats, and gain insights into their cloud infrastructures. Customers worldwide rely on Sumo Logic to get powerful real-time analytics and insights across observability and security solutions for their cloud-native applications. For more information, visit www.sumologic.com.