Site Reliability Engineer

Ditto

Benefits

Unlimited PTO

About the Role:

  • Develop and maintain observability solutions using platforms such as Datadog, Prometheus, and Grafana
  • Take a leading role in incident management, including coordinating response efforts and troubleshooting issues
  • Partner with product engineering teams to improve system resilience

What You'll Do:

  • Work with teams to implement and maintain SLOs, monitoring, and alerting strategies
  • Design and implement automation and support tooling to improve system resilience
  • Lead the development and maintenance of runbooks and incident response procedures

What You'll Need:

  • 6+ years of experience in Site Reliability Engineering or similar DevOps roles
  • Expertise with Infrastructure as Code tools such as Terraform and Helm
  • Strong communication and problem-solving skills

Ditto

Ditto is redefining how data moves at the edge, aiming to make resilient, real-time applications seamless for developers, regardless of network conditions. It is a globally distributed, fast-growing startup with over $145 million in funding, and it is committed to building a diverse and inclusive team.
