Looking for an experienced MLOps Engineer to build and scale machine learning solutions that address critical challenges in the healthcare revenue cycle. Focus on operationalizing ML models, ensuring reliable deployment pipelines, and maintaining scalable and secure ML infrastructure on AWS. Collaborate with data scientists, software engineers, and product teams to bring ML products from prototype to production, with an emphasis on automation, monitoring, and continuous improvement.
Remote Devops Jobs
317 results
The Senior Site Reliability Engineer role within the Cloud Compute team is pivotal in ensuring the robust and scalable foundation of Affirm's platform. This role manages all of Affirm's Kubernetes clusters. The mission is to provide a highly reliable and available cloud environment that empowers all of Affirm's engineering teams to build and deploy innovative solutions seamlessly. The engineer will drive initiatives to enhance observability capabilities, fortify the reliability of critical infrastructure, and automate key operational workflows.
Overture Maps Foundation seeks a Senior DevOps Engineer for open data pipeline reliability, automation, and observability. Manage geospatial data pipelines across cloud environments, design CI/CD workflows in GitHub, and implement deployment best practices. Be at the center of Overture’s technical ecosystem, ensuring pipelines run smoothly.
Solve unique, challenging problems for our Defense and Homeland Security customers. Help manage large volumes of critical, real-time information about issues from global to local. Evolve and sustain the Continuous Integration / Continuous Delivery (CI/CD) infrastructure for our DOD customer, and develop Infrastructure as Code with tools such as Ansible, PowerShell Desired State Configuration (DSC), and Terraform to solve deployment problems for diverse environments.
Be part of a rapidly growing team that’s responsible for the design, build, and delivery of application stacks for product teams in a cloud-based environment. You will work closely with software and QA engineers to build and maintain the right cloud infrastructure, balancing performance and resilience with cost. As part of a growing team focused on emergent technologies, you will create CI/CD pipelines and work across functions.
As a Site Reliability Engineer at Axiom, you will be pivotal in upholding our promise of superior reliability and performance to our customers. Collaborating with backend engineers and product teams, you will emphasize creating and operating scalable and reliable systems. Axiom's emphasis on SREs revolves around automating, measuring, and continuously improving the reliability and efficiency of our systems.
The Network Engineering team at New Relic is the group of gurus behind the cloud platforms and management systems that New Relic's services are built upon. We develop the software and tools to ensure our network is available and scalable. You will support the operations side of the Network team, including but not limited to cloud network deployment, administration, and troubleshooting.
Architect and scale observability systems by leading the design and evolution of logging, metrics, and tracing pipelines to handle massive data volumes. Evaluate and integrate new technologies that enhance Airtable’s observability posture. Guide and mentor a growing team of infrastructure engineers and partner with teams to embed observability deeply into Airtable’s development lifecycle.
Join the Development Tooling team to help maximize the efficiency and productivity of GitLab engineers. You'll be crucial in building and maintaining the foundational internal tools, frameworks, and infrastructure abstractions that empower all product developers to write, test, and ship reliable code faster. This role is ideal for a hands-on engineer who thrives on solving other engineers' pain points, enjoys working across different technology stacks, and has a passion for improving the software development lifecycle (SDLC).
As the Senior Site Reliability Engineer, you will lead Branch’s effort to achieve greater reliability, scalability, capacity and observability of our platform through automation and tooling. You will also participate in and improve the software development and deployment life cycles as well as develop and implement technical best practices.