Your role
As a Sr. Software Engineer in Observability, you’ll be responsible for our metrics and log collection platform. You’ll work closely with other Infrastructure engineers to determine resource usage and requirements. You’ll also help create tooling, libraries, and documentation that enable other engineers to instrument their own projects. In addition, you’ll keep our team aware of trends in the larger observability/monitoring industry.
This position reports to our Engineering Manager, Observability Platform, and can be based in our Bangalore, India office.
What you’ll do
- Develop and improve instrumentation for monitoring and logging the health and availability of services.
- Develop and maintain the observability stack within Dialpad engineering.
- Define best practices and standards for making systems and services measurable, and work with other teams to apply them.
- Create tools and libraries for other engineering teams to enable them to build self-monitoring capabilities.
- Create and own internal documentation used by the other engineering teams.
- Stay up to date with the latest trends in observability, logging, monitoring, and cloud technologies. Introduce and experiment with new tools, solutions, and best practices to improve system observability and reliability.
- Collaborate with different engineering teams to integrate observability practices into their workflows.
- Participate in an on-call rotation within the larger Infrastructure Engineering division.
Skills you’ll bring
- Background in Systems and/or Software Engineering.
- Experience in designing, automating, maintaining, and optimizing observability platforms (logging, metrics, and tracing).
- Experience with configuration management and infrastructure-as-code tools such as Ansible and Terraform.
- Experience with public cloud environments such as GCP and AWS.
- Familiarity with languages such as Python, Go, and Rust.
Bonus skills you may have
- Previous direct experience with Grafana, Loki, and Prometheus.
- Experience with Linux.
- Experience with Kubernetes (including GKE/EKS) and building containerized applications.
- Undergraduate degree in Computer Science or Engineering.