Why This Job Is Featured on The SaaS Jobs
Voice AI is becoming a core layer in modern SaaS products, from support automation to multimodal assistants, and that shift is increasing demand for reliable speech and human feedback data. This role sits in the operational engine that supplies those inputs, coordinating annotation and QC work that directly affects model behavior in TTS, STT, and RLHF workflows. The positioning as a specialist data services partner working with a leading voice AI lab makes the work closely tied to production-grade expectations rather than exploratory research.
For a SaaS career path, the value is in learning how AI capabilities are industrialized: translating evolving specifications into repeatable processes, managing quality signals, and shipping datasets on predictable cadences. The emphasis on throughput, error patterns, and reporting builds fluency with metrics-driven delivery, a skill set that transfers to SaaS functions where reliability and iteration loops matter, including AI operations, product operations, and customer-facing technical programs.
This is best suited to someone who prefers structured execution and finds satisfaction in operational detail, from guidelines to dashboards. It also fits professionals who enjoy leading small teams and acting as the coordination point between external stakeholders and internal delivery, especially in environments where requirements change and clarity has to be created through process.
The section above is editorial commentary from The SaaS Jobs, provided to help SaaS professionals understand the role in a broader industry context.
Job Description
About Arctic Engines
Arctic Engines is a specialized AI data services partner working with leading voice and speech AI companies. We deliver high-quality annotated datasets and human feedback workflows that train the next generation of voice models. We are currently expanding our delivery team to support our work with one of the world's foremost voice AI labs.
About the Role
We are looking for a hands-on Project Manager to own end-to-end delivery of voice annotation projects — including RLHF (Reinforcement Learning from Human Feedback), Text-to-Speech (TTS), Speech-to-Text (STT), and custom speech tasks. You will lead a pod of 5–15 annotators and QCs, drive quality and throughput targets, and act as the operational bridge between our delivery team and the client.
This is a delivery-heavy role for someone who has lived in spreadsheets, guideline documents, and QC reports — and enjoys it.
What You'll Do
- Own project delivery for voice annotation workstreams (RLHF, TTS, STT, custom) — daily targets, weekly milestones, and final dataset hand-off.
- Manage a team of 5–15 annotators and QCs — task allocation, shift planning, capacity utilization, performance reviews, and ramp-up of new joiners.
- Drive quality outcomes — partner with QC leads to monitor IAA (Inter-Annotator Agreement), error rates, and rejection patterns; run calibration sessions when scores drift.
- Translate client guidelines into operational SOPs — break down complex annotation specs into clear, repeatable instructions for the floor.
- Handle client coordination — receive guideline changes, rework requests, and clarifications from the client (escalations are co-handled with the founder); communicate impact back internally.
- Track and report — maintain dashboards on volume, quality, productivity, and project P&L; publish weekly status to leadership and client.
- Continuously improve — identify bottlenecks, propose tooling or process changes, and run small experiments to lift throughput or quality.
Must-have
- 3–5 years of total experience, with 2+ years managing voice/speech annotation projects (TTS, STT, ASR, voice RLHF, prosody, phonetic transcription, or similar).
- Direct experience working with annotation platforms (Label Studio, CVAT, Prodigy, internal client tooling, etc.).
- Demonstrated track record of managing annotator pods (10+ people) and hitting quality and volume SLAs.
- Strong working knowledge of QC frameworks — IAA, gold sets, sampling-based audits, rework cycles.
- Comfort with spreadsheets/Google Sheets, basic data analysis, and reporting dashboards.
- Excellent written and spoken English — you'll be reading dense guideline documents and writing clear instructions.
Nice-to-have
- Exposure to RLHF preference labeling or human feedback for generative models.
- Familiarity with audio quality concepts (SNR, prosody, phoneme alignment, speaker diarization).
- Experience working with US/EU AI labs as clients.
- Working knowledge of multiple Indian or global languages (useful for multilingual voice projects).