Source Job

US

  • Lead design and operation of internal developer platforms and self-service infrastructure.
  • Build and optimize CI/CD pipelines, deployment workflows, and automation across GitHub Actions, Jenkins, ArgoCD.
  • Apply SRE principles to improve developer-facing systems and software delivery performance.

AWS Terraform Python Kubernetes CI/CD

20 jobs similar to Sr. Site Reliability Engineer

Jobs ranked by similarity.

US

  • Design, provision, and manage AWS infrastructure using Terraform and Kubernetes.
  • Build, operate, and improve observability, monitoring, and incident response processes.
  • Collaborate with engineering teams on capacity planning, performance optimization, and resilient system design.

Vynca provides comprehensive care for individuals with complex needs, focusing on quality days at home. The company is a close-knit community guided by core values of Excellence, Compassion, Curiosity, and Integrity.

  • Own reliability, latency, and performance for AI platform services and data infrastructure on AWS.
  • Design and maintain CI/CD pipelines, infrastructure-as-code, and observability frameworks across the stack.
  • Partner with AI and data engineers to ensure secure, cost-optimized, and scalable deployment of platform components.

HHAeXchange is the leading technology platform for home and community-based care, providing an end-to-end homecare solution for people who are aging or have disabilities. Founded in 2008, the company is passionate about transforming healthcare by connecting patients, providers, managed care organizations, and states.

US 5w PTO

  • Design and develop CI/CD systems for websites, services, and release workflows, and operate an EKS-based Kubernetes platform.
  • Diagnose and debug production incidents, drive root-cause analysis, and implement improvements to enhance system reliability.
  • Write and maintain infrastructure as code using Pulumi or Terraform/OpenTofu across multiple AWS accounts with security-conscious practices.

Thunderbird is one of the world’s most trusted open-source email applications, empowering more than 20 million people globally. Our small but growing distributed team includes 65+ people across seven countries, and we build privacy-respecting communication tools with a collaborative, inclusive, and user-first spirit.

US Unlimited PTO

  • Design, scale, and operate resilient, cloud-native infrastructure in AWS with a strong emphasis on EKS, IAM, RBAC, and modern security-first practices.
  • Build and optimize CI/CD pipelines with GitHub Actions and GitHub Advanced Security, enabling velocity without compromising safety.
  • Own observability across the stack using Datadog (metrics, logging, alerting, and tracing).

DexCare optimizes time in healthcare, streamlining patient access, reducing waits, and enhancing overall experiences. Currently serving 57 million patients, including Kaiser Permanente and Providence, DexCare is committed to an inclusive workplace where diversity drives innovation.

US Canada

  • Own and evolve AWS infrastructure using Terraform, managing EKS clusters, databases, and core services.
  • Maintain CI/CD reliability and developer tooling across the full engineering org.
  • Lead incident response, drive post-incident reviews, and improve monitoring and alerting standards.

Babylist is the leading platform for expecting and new families, helping parents feel confident, connected, and cared for at every step. As a modern, AI-forward tech company with over 10 million yearly shoppers, Babylist has expanded into a full ecosystem and generated $750M in revenue in 2025, reshaping the $235B kids and baby market.

Latin America

  • Build and operate the self-service infrastructure platform for developers and AI agents.
  • Own core platform layers including CI/CD, GitOps, IaC module catalog, and golden-path scaffolding.
  • Build internal tooling, observability, and metrics to make pipelines observable and improvable.

Luxury Presence is building the AI growth platform for real estate. Backed by top investors like Bessemer Venture Partners, we're a Series C company with over $100M in ARR and more than 90,000 real estate professionals using our platform.

UK

  • Design, build, and maintain CI/CD pipelines and Infrastructure as Code using tools like CloudFormation, Ansible, and Terraform.
  • Monitor and respond to infrastructure and application health, troubleshoot operational issues, and provide on-call support.
  • Maintain operational documentation, communicate proactively with teams, and ensure service delivery meets client expectations.

NICE Ltd. provides software used by 25,000+ global businesses, including 85 of the Fortune 100, to deliver customer experiences, fight financial crime, and ensure public safety. With over 8,500 employees across 30+ countries, NICE is recognized as a market leader in AI, cloud, and digital innovation.

Europe

  • Design, build, and maintain scalable cloud infrastructure for an AI-powered platform.
  • Manage and optimize AWS environments, develop Infrastructure as Code using Terraform, and build CI/CD pipelines.
  • Troubleshoot production issues and implement security best practices across infrastructure and deployment pipelines.

Canada

  • Build and maintain infrastructure platforms for over 200 backend services running on Kubernetes clusters with 40,000+ cores.
  • Lead and mentor other engineers, own complex infrastructure failures, and participate in a shared on-call rotation.
  • Drive cloud cost efficiency, estimate schedules, and use AI tools as a first-class collaborator in daily workflows.

Life360's mission is to keep people close to the ones they love through location sharing, safe driver reports, and crash detection. The company serves approximately 97.8 million monthly active users across more than 180 countries and has more than 500 remote-first employees.

Global Unlimited PTO 16w maternity 16w paternity

  • Own the operational excellence and infrastructure strategy for Remote Build's platform, ensuring reliability, performance, and security.
  • Lead incident response, build observability systems, and drive continuous improvement in system reliability.
  • Embed security into infrastructure, optimize costs, and automate operational toil to scale efficiently.

Remote solves modern organizations' biggest challenge of navigating global employment compliantly. With a fully distributed team across 6 continents, the company fosters a future-focused culture with core values of innovation and async work.

US

  • Designing and managing cloud-based infrastructure on AWS.
  • Creating and maintaining deployment architectures and continuous delivery pipelines.
  • Automating infrastructure provisioning and management using Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.

Nearform is an independent team of data & AI experts, engineers, and designers who build intelligent digital solutions and capability at pace. Our team of 500 experts in 20+ countries is trusted by leading enterprises.

Germany 6w PTO

  • Architect and scale the cloud platform behind a mission-critical SaaS product used globally.
  • Lead Infrastructure as Code maturity and drive automation, reliability, and cost optimisation.
  • Own uptime, SLAs, and incident management practices while mentoring engineers.

Innocraft (trading as Matomo) provides an open-source analytics platform trusted by enterprises and governments for full data ownership. The company values diversity and inclusion, and operates with a stable, mature product and strong engineering team.

$115,200–$172,800/yr
US 8w paternity

  • Build internal tooling to help other engineers and the rest of the company understand and operate our system.
  • Design and implement security best practices for our team and infrastructure.
  • Reduce toil through automation, including building and maintaining CI/CD infrastructure.

Openly is rebuilding insurance from the ground up by re-envisioning and enhancing every aspect of the customer experience. They are a rapidly growing team of exceptional, curious, empathetic people with a wide range of skill sets, spanning many departments.

  • Maintain and develop secure, reliable, and scalable AWS cloud infrastructure to meet business and development needs.
  • Deploy and operate microservices running on EC2 (Docker Compose + Caddy) and Kubernetes (EKS + Karpenter).
  • Write and maintain Terraform modules and stacks for EC2, RDS, EKS, ECR, S3, IAM, VPC, and Secrets Manager.

INFUSE is a digital marketing company headquartered in the US and operating worldwide, providing services in demand generation. Our team is dispersed across 20 countries, and we are committed to giving each candidate a fair and detailed assessment.

Global

  • Deploy, manage, and maintain AWS infrastructure across development, staging, and production environments.
  • Build and maintain scalable, reusable and secure Infrastructure as Code (IaC) using Terraform Enterprise.
  • Develop, implement and manage CI/CD pipelines for automated application and infrastructure deployments.

Miratech helps visionaries change the world. It is a global IT services and consulting company that brings together enterprise and start-up innovation. It retains nearly 1,000 full-time professionals, and its annual growth rate exceeds 25%.

SRE

Fal
$180,000–$250,000/yr
US

  • Own and operate our Kubernetes infrastructure.
  • Build and maintain CI/CD pipelines and deployment infrastructure.
  • Leverage AI to automate analysis and resolution of production issues.

Fal is the generative media ecosystem powering the next generation of AI products. They build the infrastructure, tools, and model access that teams need to move from idea to production.

$4,313–$5,391/mo
Europe

  • Own 5 AWS accounts across the organisation.
  • Architect and maintain infrastructure as code with Terraform.
  • Set up monitoring, alerting, and incident response.

We're a UK fintech building high-throughput digital infrastructure for the mortgage and property space. We recently acquired Trussle and are taking our platform to the next level. The company values innovation and building high-quality products.

US Unlimited PTO

  • Design and build cloud-native infrastructure for reliability, observability, and automation across GCP, GKE, and Cloud Run.
  • Own incident response, root cause analysis, escalation workflows, and runbooks to prevent hard problems from recurring.
  • Develop Infrastructure as Code, CI/CD pipelines, and operational tooling to improve developer velocity and platform efficiency.

CertifyOS is building the data infrastructure that powers modern healthcare, automating provider licensing, enrollment, credentialing, and network monitoring through an API-first platform. The company is backed by leading investors with a team of deep experience in provider data systems, valuing authenticity, accountability, collaboration, results, and openness to feedback.

$116,000–$128,800/yr
US

  • Own and maintain the reliability, performance, and availability of large-scale production systems.
  • Design, build, and improve CI/CD pipelines using Azure DevOps, GitHub Actions, Jenkins, and Octopus Deploy.
  • Drive cloud cost optimization, scalability, and auto-scaling initiatives across hosted environments.

Encoura empowers students and institutions to create meaningful connections so everyone can make the most informed decisions to achieve their goals. Since 1972, the company has evolved its products and services to better represent the link between students and higher education institutions and to create the highest probability of student success.

6w PTO

  • Design, build, and maintain scalable CI/CD pipelines using GitLab CI/CD.
  • Develop and maintain Infrastructure as Code solutions using Terraform and Ansible.
  • Build and improve internal developer platform tools and deployment services.

Social Discovery Group (SDG) is one of the world's largest groups of social discovery companies, uniting millions of users on dozens of products. Our international team of 1000+ professionals and digital nomads works all over the world.