About the team/role
We're seeking an experienced Staff Data Engineer to join our Data Platform team. This team plays a crucial role in our mission by developing and maintaining the scalable data platforms that power fair and safe hiring decisions.
As a Staff Data Engineer on the Data Platform team, you'll work on Checkr's centralized data platform, which is critical to the company's vision and sits at the heart of all key customer-facing products. You will take on high-impact projects that directly contribute to our next generation of products.
What you’ll do:
- Architect, design, lead, and build an end-to-end data platform that is performant, reliable, and scalable
- Be an independent individual contributor who solves problems and delivers high-quality solutions with minimal oversight and a high level of ownership
- Mentor, guide, and work with junior engineers to deliver complex, next-generation features
- Bring a customer-centric, product-oriented mindset: collaborate with customers and internal stakeholders to resolve product ambiguities and ship impactful features
- Partner with engineering, product, design, and other stakeholders in designing and architecting new features
- Experimentation mindset: autonomy and empowerment to validate a customer need, get team buy-in, and ship a rapid MVP
- Quality mindset: you insist on quality as a critical pillar of your software deliverables
- Analytical mindset: instrument and deploy new product experiments with a data-driven approach
- Monitor, investigate, triage, and resolve complex production issues as they arise for services owned by the team
- Create and maintain data pipelines and foundational datasets to support product and business needs
- Design and build database architectures for massive, complex data, balancing performance against computational load and cost
- Develop audits for data quality at scale, implementing alerting as necessary (see the sketch after this list)
- Create scalable dashboards and reports to support business objectives and enable data-driven decision-making
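To give a flavor of the data-quality work described above, here is a minimal PySpark sketch of a null-rate audit with a simple alerting hook. The dataset path, table, and column names (e.g. candidate_id) are hypothetical placeholders, not Checkr's actual schema or tooling, and real alerting would go through a paging or messaging system rather than an exception.

```python
# A minimal sketch, assuming a Parquet dataset on S3; all names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-audit-sketch").getOrCreate()

def audit_null_rate(df, column, threshold):
    """Return (passed, null_rate) for a single-column null-rate check."""
    total = df.count()
    nulls = df.filter(F.col(column).isNull()).count()
    rate = nulls / total if total else 0.0
    return rate <= threshold, rate

# Hypothetical input; in practice this would be a governed table.
reports = spark.read.parquet("s3://example-bucket/reports/")

passed, rate = audit_null_rate(reports, "candidate_id", threshold=0.01)
if not passed:
    # Alerting is abstracted here; a real pipeline might page via
    # PagerDuty or Slack instead of raising.
    raise ValueError(f"candidate_id null rate {rate:.2%} exceeds threshold")
```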
What you bring:
- 10+ years of experience designing, implementing, and delivering highly scalable, performant data platforms
- Experience building large-scale (hundreds of terabytes to petabytes) data processing pipelines, both batch and stream
- Experience with ETL/ELT, stream and batch processing of data at scale
- Expert-level proficiency in PySpark, Python, and SQL
- Expertise in data modeling, relational databases, and NoSQL data stores (such as MongoDB)
- Experience with big data technologies such as Kafka, Spark, Iceberg, data lakes, and the AWS stack (EKS, EMR, Serverless, Glue, Athena, S3, etc.); see the streaming sketch after this list
- An understanding of graph and vector data stores (preferred)
- Knowledge of security best practices and data privacy concerns
- Strong problem-solving skills and attention to detail
- Experience with or knowledge of data processing platforms such as Databricks or Snowflake
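As a flavor of the streaming stack named above, here is a minimal sketch of reading a Kafka topic with Spark Structured Streaming and appending to an Iceberg table. The broker address, topic, checkpoint path, and table name are hypothetical, and connector packaging and configuration are elided.

```python
# A minimal sketch, assuming the Kafka and Iceberg connector jars are on
# the classpath; broker, topic, paths, and table names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("stream-sketch").getOrCreate()

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "events")                     # placeholder topic
    .load()
    # Kafka keys/values arrive as binary; cast to strings for downstream use.
    .select(
        F.col("key").cast("string"),
        F.col("value").cast("string"),
        F.col("timestamp"),
    )
)

query = (
    events.writeStream
    .format("iceberg")
    .outputMode("append")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/events/")
    .toTable("catalog.db.events")  # placeholder Iceberg table
)
query.awaitTermination()
```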
What you get:
- A fast-paced and collaborative environment
- Learning and development allowance
- Competitive cash and equity compensation and opportunity for advancement
- 100% medical, dental, and vision coverage
- Up to $25K reimbursement for fertility, adoption, and parental planning services
- Flexible PTO policy
- Monthly wellness stipend, home office stipend
#LI-TD1