Remote DevOps Jobs · SRE

Job listings

$160,000–$182,000/yr

In this role, you will provide strategic and technical leadership for multiple cloud engineering teams. You will drive initiatives that scale cloud-native platforms, enhance resilience, optimize costs, and improve operational excellence. This remote-first role offers the opportunity to influence large-scale infrastructure decisions and advance cloud engineering practices across the organization.

$138,600–$155,000/yr

The Program Manager will play a critical leadership role in accelerating CMS’ modernization of the Medicare Fee-for-Service (FFS) shared systems. This position supports the Operations & Site Reliability Engineering (OSRE) team. The Program Manager ensures OSRE’s work is delivered with a high bar for security, reliability, scalability, and human-centered design, while safeguarding uninterrupted Medicare claims processing.

Colombia 3w PTO

We’re looking for a Lead Site Reliability Engineer to join our platform team, someone who’s confident working hands-on with infrastructure, but also ready to shape how we scale and operate as a global team. You’ll take ownership of key systems, lead cross-functional work, and help evolve the way we build for performance, reliability, and security.

Empower people to achieve more with less friction. The platform simplifies complexity, and the same principle drives how their systems are built and operated. The company is looking for a Staff Site Reliability Engineer who thrives at the intersection of infrastructure, automation, and AI-first thinking, someone who can design infrastructure that scales effortlessly, adapts intelligently, and makes everyone’s job easier.

As a Senior Site Reliability Engineer, you will partner with development teams to manage infrastructure, improve CI/CD pipelines, and support operational excellence across Growth. You will help ensure the reliability, scalability, and performance of the systems that power Kraken’s growth initiatives, bringing your expertise in infrastructure, monitoring, and automation to keep Kraken’s services performant, resilient, and continuously improving.

This role is at the center of incident response, internal tooling, and platform reliability, acting as a Level 3 escalation point for Cybrid's production environment. You’ll work closely with engineering squads, product teams, and internal stakeholders like Compliance and Support to improve how we monitor, respond to, and recover from platform events. This position is remote-first.

1w paternity

We’re looking for an Engineering Manager to provide strategic leadership and guidance for our Infrastructure Platform or SRE teams; this role is less about individual coding and more about enabling and empowering others to be wildly productive. You’ll be managing a handful of staff, collaborating with product partners, and taking accountability for the quality of our entire technology stack.

We're seeking an experienced Director of Platform Infrastructure to lead critical aspects of Docker's infrastructure platform. In this role, you'll oversee the reliability, efficiency, and developer experience of foundational systems that power Docker's products and services. You'll lead teams responsible for site reliability engineering, cloud cost optimization, internal developer tooling, and core infrastructure components.

As a Software Engineer, Customer Infrastructure, you will work closely with customers and the engineering team to enhance, optimize, validate, and automate our cloud-native storage platform. The role is a mix of DevOps and software engineering, ensuring that MinIO delivers a high-quality product with the performance, scalability, and durability needed for seamless data storage and retrieval under demanding customer workloads.

$200,000–$250,000/yr

As a Staff Site Reliability Engineer at Topstep, you'll play a foundational role in shaping how we approach reliability, observability, and infrastructure at scale. You'll be instrumental in building out our SRE practice, defining our incident response culture, closing observability gaps, and optimizing our AWS infrastructure for both performance and cost.