Source Job

China

  • Manage cloud infrastructure and deployment pipelines for production systems.
  • Design and improve CI/CD processes to make deployments safer and faster.
  • Improve monitoring, alerting, and system observability across services.

CI/CD, Cloud Platforms, Monitoring, Incident Management, Infrastructure as Code

20 jobs similar to DevOps Engineer

Jobs ranked by similarity.

China

  • Build and maintain a highly reliable platform for BJAK's AI automation systems.
  • Manage cloud infrastructure, deployment pipelines, and CI/CD workflows.
  • Improve system resilience, monitoring, and incident response.

BJAK’s automation systems support customer journeys across quote generation, policy issuance, claims, payments, renewals, and insurer integrations. BJAK is a global engineering team focused on reliability, ownership, and modern engineering culture.

China

  • Own reliability and operational stability of BJAK’s production systems.
  • Design and improve monitoring, alerting, logging and observability across services.
  • Lead incident response, troubleshooting and structured root cause analysis.

BJAK's automation systems power end-to-end insurance journeys across quote generation, policy issuance, claims, and more. BJAK is a global engineering team with a modern engineering culture, offering fully remote work and a high-ownership environment.

$116,000–$128,800/yr
US

  • Own and maintain the reliability, performance, and availability of large-scale production systems.
  • Design, build, and improve CI/CD pipelines using Azure DevOps, GitHub Actions, Jenkins, and Octopus Deploy.
  • Drive cloud cost optimization, scalability, and auto-scaling initiatives across hosted environments.

Encoura empowers students and institutions to create meaningful connections so everyone can make the most informed decisions to achieve their goals. Since 1972, the company has evolved its products and services to better represent the link between students and higher education institutions and to create the highest probability of student success.

US

  • Owning cloud infrastructure on Azure, data pipeline orchestration, CI/CD, and observability to ensure production-grade reliability.
  • Building and maintaining foundational infrastructure that enables fast engineering velocity without breaking things.
  • Applying SRE principles such as SLOs, capacity planning, incident response, and eliminating toil through automation.

Terzo's platform processes enterprise-scale document corpora, powers real-time AI agents, and serves the Financial Intelligence Graph to Fortune 500 customers. As a small, senior team with strong ownership and minimal bureaucracy, we foster a culture of collaboration, mentorship, and continuous improvement.

United States

  • Lead a high-impact CloudOps and infrastructure engineering team powering large-scale, real-time advertising systems under extreme performance and reliability constraints.
  • Own planning and delivery processes including sprint planning, backlog prioritization, execution tracking, and team retrospectives.
  • Drive initiatives to improve system reliability, observability, deployment safety, incident response, and production readiness.

Jobgether uses an AI-powered matching process to review applications quickly, objectively, and fairly against role requirements. Their platform identifies top-fitting candidates and shares shortlists directly with hiring companies.

China

  • Lead design and delivery of core platform systems for AI-driven insurance automation.
  • Translate complex business needs into scalable backend architecture and APIs.
  • Mentor engineers and ensure system reliability, maintainability, and observability.

BJAK uses AI, automation, and backend systems to power end-to-end insurance operations. We are a growing global team with a modern engineering culture focused on reliability, scalability, and excellence.

  • Own reliability, latency, and performance for AI platform services and data infrastructure on AWS.
  • Design and maintain CI/CD pipelines, infrastructure-as-code, and observability frameworks across the stack.
  • Partner with AI and data engineers to ensure secure, cost-optimized, and scalable deployment of platform components.

HHAeXchange is the leading technology platform for home and community-based care, providing an end-to-end homecare solution for people who are aging or have disabilities. Founded in 2008, the company is passionate about transforming healthcare by connecting patients, providers, managed care organizations, and states.

UK

  • Design, build, and maintain CI/CD pipelines and Infrastructure as Code using tools like CloudFormation, Ansible, and Terraform.
  • Monitor and respond to infrastructure and application health, troubleshoot operational issues, and provide on-call support.
  • Maintain operational documentation, communicate proactively with teams, and ensure service delivery meets client expectations.

NICE Ltd. provides software used by 25,000+ global businesses, including 85 of the Fortune 100, to deliver customer experiences, fight financial crime, and ensure public safety. With over 8,500 employees across 30+ countries, NICE is recognized as a market leader in AI, cloud, and digital innovation.

Germany 6w PTO

  • Architect and scale the cloud platform behind a mission-critical SaaS product used globally.
  • Lead Infrastructure as Code maturity and drive automation, reliability, and cost optimisation.
  • Own uptime, SLAs, and incident management practices while mentoring engineers.

Innocraft (trading as Matomo) provides an open-source analytics platform trusted by enterprises and governments for full data ownership. The company values diversity and inclusion, and operates with a stable, mature product and strong engineering team.

US Unlimited PTO

  • Design and build cloud-native infrastructure for reliability, observability, and automation across GCP, GKE, and Cloud Run.
  • Own incident response, root cause analysis, escalation workflows, and runbooks to prevent hard problems from recurring.
  • Develop Infrastructure as Code, CI/CD pipelines, and operational tooling to improve developer velocity and platform efficiency.

CertifyOS is building the data infrastructure that powers modern healthcare, automating provider licensing, enrollment, credentialing, and network monitoring through an API-first platform. The company is backed by leading investors with a team of deep experience in provider data systems, valuing authenticity, accountability, collaboration, results, and openness to feedback.

US

  • Lead design and operation of internal developer platforms and self-service infrastructure.
  • Build and optimize CI/CD pipelines, deployment workflows, and automation across GitHub Actions, Jenkins, ArgoCD.
  • Apply SRE principles to improve developer-facing systems and software delivery performance.

Versant is a media company owning iconic brands in news, sports, and entertainment, including USA Network, Fandango, and Rotten Tomatoes. It is an independent, publicly traded company with a collaborative, inclusive culture and a remote-first work environment.

US

  • Design and implement scalable infrastructure solutions using AWS services and Infrastructure as Code.
  • Build robust CI/CD pipelines with comprehensive testing and automated deployment capabilities.
  • Develop monitoring and alerting systems using CloudWatch, Splunk, and modern observability tools.

U.S. FinTech built and operates the largest and most advanced mortgage securitization platform in the world, supporting Fannie Mae and Freddie Mac. The company supports 70% of the market with a cloud-based platform and a team that combines financial expertise with technological innovation.

Europe

  • Design, build, and maintain scalable cloud infrastructure for an AI-powered platform.
  • Manage and optimize AWS environments, develop Infrastructure as Code using Terraform, and build CI/CD pipelines.
  • Troubleshoot production issues and implement security best practices across infrastructure and deployment pipelines.

US Canada

  • Own and evolve AWS infrastructure using Terraform, managing EKS clusters, databases, and core services.
  • Maintain CI/CD reliability and developer tooling across the full engineering org.
  • Lead incident response, drive post-incident reviews, and improve monitoring and alerting standards.

Babylist is the leading platform for expecting and new families, helping parents feel confident, connected, and cared for at every step. As a modern, AI-forward tech company with over 10 million yearly shoppers, Babylist has expanded into a full ecosystem and generated $750M in revenue in 2025, reshaping the $235B kids and baby market.

US 5w PTO

  • Design and develop CI/CD systems for websites, services, and release workflows, and operate an EKS-based Kubernetes platform.
  • Diagnose and debug production incidents, drive root-cause analysis, and implement improvements to enhance system reliability.
  • Write and maintain infrastructure as code using Pulumi or Terraform/OpenTofu across multiple AWS accounts with security-conscious practices.

Thunderbird is one of the world’s most trusted open-source email applications, empowering more than 20 million people globally. Our small but growing distributed team includes 65+ people across seven countries, and we build privacy-respecting communication tools with a collaborative, inclusive, and user-first spirit.

China

  • Design and build platform components that support AI-driven workflow automation systems.
  • Develop shared infrastructure and services for workflow orchestration, state management and execution tracking.
  • Improve APIs, service frameworks and backend standards used across multiple engineering teams.

BJAK’s automation systems power end-to-end insurance journeys across quote generation, policy issuance, renewals, endorsements, claims, payments and insurer integrations. They have a global engineering team working across multiple countries and offer a fully remote, high-ownership environment with a focus on scalability and reliability.

United States 4w PTO

  • Own and improve infrastructure, deployment systems, and operational foundation for reliability and security.
  • Build safer deployment paths, strengthen observability, and lead infrastructure migrations.
  • Partner with engineers on scaling, error handling, and backend changes to support AI-enabled workflows.

Clever is a venture-backed real estate technology company that builds a leading online education platform and has earned a 4.9 TrustPilot rating. The company has helped consumers save over $210 million in real estate fees and fosters a culture of innovation and transparency.

Europe

  • Build and operate secure agent runtimes with sandboxing, runtime isolation, and RBAC.
  • Design and maintain integration surfaces with MCP-style adapters and gateways across marketplace teams.
  • Implement observability and cost control including traces, telemetry, and cost-per-workflow.

Zartis is a global AI transformation and technology consulting partner that designs, builds, and scales technology solutions for ambitious organizations. With engineering hubs across EMEA and LATAM and long-term partnerships in financial services, healthcare, and energy, they foster an inclusive culture based on trust and innovation.

US Canada Unlimited PTO

  • Identify and eliminate bottlenecks across engineering and the business using DevOps and agile thinking.
  • Build and maintain CI/CD pipelines and infrastructure-as-code, and harden AI-generated apps from non-engineering teams.
  • Strengthen DevSecOps practices including scanning, vulnerability management, and compliance workflows.

Mangomint is a fast-growing SaaS company on a mission to make every salon and spa more profitable. They are a primarily remote, ambitious, and collaborative team with thousands of customers, aiming to become the #1 market leader.

Latin America

  • Build and operate the self-service infrastructure platform for developers and AI agents.
  • Own core platform layers including CI/CD, GitOps, IaC module catalog, and golden-path scaffolding.
  • Build internal tooling, observability, and metrics to make pipelines observable and improvable.

Luxury Presence is building the AI growth platform for real estate. Backed by top investors like Bessemer Venture Partners, we're a Series C company with over $100M in ARR and more than 90,000 real estate professionals using our platform.