The Machine Learning Engineer will build production-grade NLP models for data privacy, manage the ML lifecycle, and collaborate across teams.
Tonic.ai is looking for a hands-on Machine Learning Engineer to help build production-grade NLP systems that power our data privacy and information extraction products. You'll join a small, experienced team working at the intersection of LLMs, data privacy, and applied AI — developing and fine-tuning models that detect and redact sensitive information across diverse datasets.
What You’ll Do
Build and ship models. Fine-tune and evaluate transformer-based models (e.g., RoBERTa, Gemma, LLaMA) to support PII redaction, entity extraction, and synthetic data generation.
Own the ML lifecycle. From dataset curation and experiment tracking to model deployment and monitoring — you’ll own the full path from prototype to production.
Collaborate cross-functionally. Partner with Product and Design to shape how ML models drive user-facing features, and work with the broader engineering team to integrate them into scalable systems.
Experiment responsibly. Document your experiments, evaluate results rigorously, and help push the frontier of safe and explainable AI for data privacy.
What You’ll Bring
3+ years of professional experience in applied ML or data science with a focus on NLP
Proficiency in Python and deep learning frameworks such as PyTorch and Hugging Face Transformers
Hands-on experience with experiment tracking (e.g., Weights & Biases), distributed training (e.g., Accelerate), and model serving (e.g., vLLM)
Comfort working independently and iterating quickly — you enjoy the mix of research, engineering, and product thinking
Strong communication and collaboration skills
Bonus Points For:
Experience with supervised and reinforcement learning fine-tuning (e.g., TRL)
Familiarity with data privacy, PII redaction, or healthcare data
A public portfolio, blog, or open-source contributions that demonstrate your technical depth and curiosity
Why You’ll Love It Here
High autonomy and meaningful ownership: your models will ship to production, not sit in a notebook
Small, collaborative team with deep expertise in NLP and privacy
Opportunity to work with real-world, high-impact data in domains like healthcare and financial services
Benefits
Competitive salary and company equity
Unlimited PTO and generous parental leave
Medical, dental, and vision insurance
401(k) with employer contribution
Remote-friendly work environment
About Tonic.ai
Tonic.ai creates safe, high-quality synthetic data that helps developers move fast while protecting sensitive information. Thousands of engineers rely on Tonic-generated data daily to power development, testing, and CI/CD pipelines across industries including healthcare, financial services, logistics, and education. We’re growing fast and looking for builders who want to make privacy practical.