Enhance the reliability and performance of our Brokerage-as-a-Service platform during critical 24/7 operations in this Site Reliability Engineer 2 role. This role demands a proactive approach to managing technical challenges and system optimizations that align with global operational strategies. Support the SRE team in developing and implementing enhancements to support workflows, focusing on automation and efficiency improvements. Handle technical escalations and troubleshoot complex issues.
Remote DevOps Jobs · US
As an Observability Engineer, you’ll be at the forefront of building and evolving the systems that power deep, actionable visibility across our entire stack. Your work will enable hundreds of engineers to proactively detect, diagnose, and resolve issues before they reach our customers. You’ll define and execute observability strategy using modern tools like Datadog, OpenTelemetry, and CloudTrail within an AWS-native ecosystem.
As the Lead Platform Engineer on our team, you’ll be driving forward the technical architecture across platform infrastructure, DevOps and security, establishing best practices as well as contributing hands-on work. You’ll be designing and building new services and workflows as well as enhancing and improving existing ones. Collaborating with Engineering leaders, Product, and your fellow Engineers, you’ll help make decisions and lead projects that will lay the groundwork for Reach to help millions of Americans outsmart debt for good.
The selected candidate will be responsible for designing, implementing, and maintaining automated, scalable, and secure cloud-based infrastructure solutions. This role spans the entire software development lifecycle with a focus on automation, continuous integration, and system reliability. The candidate must reside within the continental US.
Cribl Inc is seeking a Senior Site Reliability Engineer to join our mission of unlocking the value of all observability data. Site reliability engineers are involved from conception through design and development, into production and beyond, providing creative input into all things Cloud, Scaling, Reliability, High Availability and much more.
Drive the technical direction of the data and AI training infrastructure team, shaping the vision and architecture of the AI research platform. You'll enhance and scale the infrastructure to dramatically improve training velocity and research outcomes. As a technical lead, bring expertise in AI infrastructure and software engineering best practices to develop scalable systems and optimize workflows.
Be part of a growing team in the federal sector as a DevOps Engineer at Agile Defense, supporting applications in the AWS cloud. You will be responsible for the development, testing, and maintenance of automation scripts, infrastructure tools, DSLs, and CI/CD pipelines using Ruby and related technologies.
Design and implement secure, scalable cloud solutions. Work hands-on with VPC, Private Service Connect, Google Security Center, and other GCP infrastructure and security services, applying expertise in Terraform, container security, and data encryption. Bring experience in automation, security architecture, and regulatory compliance.
Shape the foundation of Site Reliability Engineering at Boulevard. You will improve systems at scale, influence reliability across engineering, and drive a reliability strategy. You’ll help teams establish SLOs and build repeatable practices for how teams observe, debug, and improve their services.
Seeking a Senior DevSecOps Engineer to lead the design, automation, and operation of secure delivery pipelines for large-scale federal health systems. In this role, you’ll own the secure delivery platform that supports cloud-native applications, ensuring scalability, compliance, and resiliency across the full software lifecycle.