As a Software Engineer II on the Site Reliability Engineering team within the Platform Engineering group at Checkr, you will identify reliability challenges impacting engineering teams and platforms and develop innovative solutions to resolve them. You will strive for the right balance between enforcing standardization and accommodating tailored workflows. You will work with a degree of autonomy, help with complex support requests, and have an impact across engineering.
What you’ll do:
- Design, build, ship, and maintain the core observability libraries, tools, and patterns used by all of Checkr’s engineering teams
- Troubleshoot complex production issues across the stack, with respect to performance, availability, and data quality
- Participate in a cross-organization incident response team, driving continuous improvement
- Contribute to architectural discussions within the SRE team and with cross-functional teams
- Influence cross-team projects and the reliability roadmap to enable engineering and help Checkr customers
- Provide consultation and feedback across teams to ensure we are building highly reliable, efficient, and scalable systems
What you bring:
- Bachelor’s degree in Computer Science or related field, or equivalent practical experience
- 2+ years of software engineering experience, including 1+ years focused on reliability, scalability, and efficiency of distributed systems
- Proficiency in Python (preferred), Go, or Ruby within Linux environments, and a strong understanding of microservices, asynchronous systems, and remote APIs
- Experience developing and operating production, customer-facing systems in AWS or Azure using Kubernetes, Docker, and Terraform
- Skill in observability and incident response practices using tools such as Datadog, Splunk, Grafana, Prometheus, and OpenTelemetry, with a focus on continuous improvement
- Strong collaboration, documentation, and communication skills, with experience leading small projects, promoting platform adoption, and fostering a self-service, product-first mindset
- An A-player mindset with a strong bias for action: you raise the bar, move with urgency, stay resilient through ambiguity, and take ownership to deliver meaningful outcomes
What you’ll get:
- A fast-paced and collaborative environment
- Learning and development allowance
- Competitive cash and equity compensation, and opportunity for advancement
- 100% medical, dental, and vision coverage
- Up to $25K reimbursement for fertility, adoption, and parental planning services
- Flexible PTO policy
- Monthly wellness stipend