Why This Job Is Featured on The SaaS Jobs
This Senior Data Engineer role stands out in SaaS because it sits at the intersection of product delivery and platform capability. The remit is centered on building batch and streaming pipelines that feed a centralized data platform, the kind of shared infrastructure that enables consistent analytics, ML use cases, and data-backed customer features across a subscription software business. The tooling mix of PySpark, SQL, and AWS points to a modern cloud data stack rather than a purely warehouse-centric setup.
From a SaaS career perspective, the work offers durable experience in turning ambiguous business asks into reliable data services. Owning complex pipeline delivery, participating in system design discussions, and improving observability and data quality are all competencies that translate across SaaS companies as data becomes more embedded in product and go-to-market decisions. The emphasis on production support and reliability also builds the operational discipline expected of senior engineers working on shared platforms.
The role is best suited to an engineer who prefers end-to-end execution and is comfortable collaborating across product, analytics, and engineering. It will appeal to someone who enjoys designing for scale, debugging performance issues, and treating data pipelines as long-lived systems with testing and monitoring, not one-off integrations.
The section above is editorial commentary from The SaaS Jobs, provided to help SaaS professionals understand the role in a broader industry context.
Job Description
About the Role
We are seeking a strong Senior Data Engineer to build and maintain scalable, high-quality data pipelines powering Checkr’s centralized data platform. As a Senior Engineer, you will independently deliver complex features, contribute to system design, and collaborate with cross-functional partners to support the next generation of our data products.
What You’ll Do
- Independently design and implement complex batch and streaming pipelines using PySpark, SQL, and AWS services.
- Navigate ambiguity with guidance, translating high-level direction into well-scoped, high-quality technical solutions.
- Work cross-functionally with product, design, analysts, and engineers to ship impactful features and improve data workflows.
- Contribute to architectural discussions and system improvements without owning long-term strategy.
- Ensure pipeline reliability and data quality, implementing testing, monitoring, and observability best practices.
- Investigate and resolve production issues for services owned by the team.
- Write performant, maintainable code that aligns with engineering standards.
- Support the team in building foundational datasets that enable analytics, ML, and customer-facing features.
What You Bring
- 6–7+ years of experience in data engineering with strong hands-on execution ability.
- Proficiency with PySpark, Python, and SQL, including debugging and performance optimization.
- Experience building large-scale pipelines (terabyte scale or larger), with exposure to streaming systems such as Kafka.
- Strong knowledge of data modeling, relational databases, and NoSQL stores.
- Experience with AWS services such as EMR, Glue, Athena, Lambda, and S3.
- Exposure to Iceberg or other lakehouse technologies (nice to have).
- Understanding of security and data privacy fundamentals.
- Strong problem-solving skills, attention to detail, and ability to execute independently.
- Knowledge of Databricks, Snowflake, or graph/vector stores is a plus.