Remote DevOps Jobs · Datadog

Job listings

As a Senior Site Reliability Engineer, you will partner with the Engineering Department to drive the reliability, scalability, and performance of our production systems. You will define and implement best practices across infrastructure security, observability, release engineering, and developer tooling to meet department-level operational requirements, own our Incident Management process, and automate operational tasks.

$131,325–$201,000/yr

As a founding member of the Site Reliability Engineering (SRE) team, helps define the culture and build the systems that keep regulated, cloud-based production environments reliable. Designs, implements, and operates observability, reliability, and incident management systems. Partners with engineering teams to define SLIs, SLOs, and error budgets, build runbooks and operational playbooks, and develop the monitoring and automation needed to ensure systems are reliable and compliant.

$180,000–$215,000/yr
US · Unlimited PTO

Combine technical excellence and exceptional collaboration skills to deliver impact. You will drive vital initiatives including deployment velocity, observability, system reliability, and developer experience. You'll build the infrastructure and tooling that lets engineers ship faster and sleep better—think GitOps workflows, Kubernetes configuration and optimization, comprehensive Datadog observability coverage, and the kind of automation that makes deployments uneventful.

$200,000–$250,000/yr

As a Staff Site Reliability Engineer at Topstep, you'll play a foundational role in shaping how we approach reliability, observability, and infrastructure at scale. You'll be instrumental in building out our SRE practice, defining our incident response culture, closing observability gaps, and optimizing our AWS infrastructure for both performance and cost.

$170,000–$195,000/yr
US · 12w maternity

Huntress is growing our Platform Engineering team and is looking for an experienced engineer who is passionate about stability, resilience, and scalability. You’ll be joining a high-performing team responsible for proactively building, monitoring, and implementing the infrastructure that is a part of the Huntress Security Platform, providing a first-class development platform to our developers, and tracking and supporting the complete lifecycle of our millions of installed endpoint agents.