Source Job

$4,000–$5,000/mo
Latin America

  • Design and evolve production environments, define standards and best practices.
  • Partner with engineering and IT teams to build scalable, reliable systems.
  • Lead incident response practices, and set guardrails around security, reliability, and cost management.

CI/CD Git Azure DevOps Kubernetes Terraform

20 jobs similar to Senior Site Reliability Engineer

Jobs ranked by similarity.

Latin America

  • Design, implement, and manage cloud infrastructure using Infrastructure as Code (IaC) tools.
  • Design, build, and maintain scalable CI/CD pipelines using tools like CircleCI or GitHub Actions.
  • Implement and maintain observability tooling (Prometheus, Grafana, Datadog), and lead incident response to ensure system reliability.

Engine is transforming business travel into something personalized, rewarding, and simple. More than 20,000 companies already rely on Engine to support over 1 million travelers and billions in bookings each year.

US

  • Design, build, and maintain secure, scalable cloud infrastructure.
  • Own CI/CD pipelines and deployment workflows across services and environments.
  • Improve reliability, availability, and performance through monitoring, alerting, and incident response practices.

Jobgether is a company that uses an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. They identify the top-fitting candidates and share this short list directly with the hiring company.

$150,000–$167,000/yr
US

  • Lead reliability-focused design and readiness reviews.
  • Build, operate, and continuously improve our observability stack.
  • Own and evolve incident management practices.

Transcend is building the privacy platform that easily embeds privacy into your entire tech stack. They are growing quickly, backed by top-tier investors and are proud to serve some of the world's most iconic brands.

US Canada Europe

  • Design, build, and maintain highly available, scalable infrastructure.
  • Manage and optimize infrastructure across GCP, AWS, Azure, and other cloud providers.
  • Develop comprehensive monitoring, logging, and alerting systems.

Bobsled is seeking a Site Reliability Engineer to enhance its data-sharing platform's reliability and scalability. They value growth, offering flexible work hours in a fully remote environment and fully sponsored individual coaching for all employees.

US

  • Ensure the smooth operation and high availability of Clarifai's core services
  • Monitor system performance, identify bottlenecks, and implement optimizations to enhance reliability and efficiency
  • Design and implement scalable, secure, and cost-effective infrastructure solutions

Clarifai is a leading AI platform specializing in computer vision and generative AI, empowering organizations to transform unstructured data into actionable insights. Founded in 2013, they have a diverse, globally distributed team with $100M in funding and are committed to building a diverse and inclusive team.

US

  • Designs and maintains CI/CD pipelines using GitLab CI/CD.
  • Implements Infrastructure as Code (IaC) with tools like Terraform.
  • Automates complex workflows and enhances infrastructure scalability.

Everseen is a vision AI solutions provider for global retailers. They have over 900 employees globally, with global and European headquarters in Cork, Ireland, a U.S. headquarters in Miami, and hubs in Romania, Serbia, India, Australia, and Spain.

Nigeria

  • Design, implement, and manage multi-cloud infrastructure.
  • Implement container orchestration strategies for microservices architectures.
  • Develop and maintain infrastructure as code using Terraform.

Moniepoint is an all-in-one financial services platform for emerging markets, offering personal and business banking, payment, credit, and business management tools. It is also the second-fastest-growing company in Africa with over 3 million users.

$109,800–$252,500/yr
US Unlimited PTO 16w maternity 8w paternity

  • Design, implement, and maintain scalable and reliable infrastructure solutions.
  • Automate deployments and maintain a resilient, secure SaaS application platform.
  • Develop comprehensive monitoring and alerting solutions, and respond to incidents.

Veeam is the #1 global market leader in data resilience. They believe businesses should control all their data whenever and wherever they need it, and they provide data resilience through data backup, data recovery, data portability, data security, and data intelligence. Based in Seattle, Veeam protects over 550,000 customers worldwide who trust Veeam to keep their businesses running.

Global

  • Partner with engineers to build dev tools that empower developer workflows and deployment infrastructure.
  • Ensure reliability of multi-cloud Kubernetes clusters and pipelines.
  • Provide metrics, logging, analytics, and alerting for performance and security across all endpoints and applications.

Cresta is on a mission to turn every customer conversation into a competitive advantage by unlocking the true potential of the contact center. Their platform combines the best of AI and human intelligence to help contact centers discover customer insights and behavioral best practices.

US Canada Europe

  • Lead a global team of Site Reliability Engineers.
  • Recruit, hire, onboard and develop engineers.
  • Guide project planning by defining milestones and identifying dependencies.

AuthZed creates and maintains SpiceDB and the authorization infrastructure. They are a Series A company with a fully remote team across the US, Canada, and Europe and a hardworking, close-knit group with a software-driven culture that values integrity, collaboration, and open-mindedness.

Costa Rica

  • Design, implement, and manage CI/CD pipelines to automate the software development lifecycle, and perform platform and application deployments using cloud and on-prem services.
  • Collaborate with agile development teams to ensure code quality and reliability.
  • Implement observability using Dynatrace, AWS CloudWatch, and related tools, and monitor and maintain system performance, availability, and security.

Experian is a global data and technology company, powering opportunities for people and businesses around the world. As a FTSE 100 Index company listed on the London Stock Exchange (EXPN), they have a team of 22,500 people across 32 countries.

$126,000–$184,000/yr
US

  • Own the operational stability and performance of Juul’s hybrid cloud infrastructure.
  • Lead automation efforts and architect for reliability.
  • Act as the final escalation point for critical incidents.

Juul Labs aims to transition the world’s billion adult smokers away from combustible cigarettes and eliminate their use, while also combating underage usage of their products. They are backed by leading technology investors and are committed to hiring great talent and building a diverse team.

Global

  • Lead and manage the DevOps team, prioritizing performance and accountability across cloud functions.
  • Define and enforce DevSecOps standards integrating automation, security, and compliance.
  • Optimize cloud infrastructure across AWS, GovCloud, and Azure for uptime and cost-effectiveness.

Jobgether is a company using an AI-powered matching process to ensure applications are reviewed quickly, objectively, and fairly. This allows them to identify the top-fitting candidates for companies, and this shortlist is then shared directly with the hiring company.

$120,000–$150,000/yr
US

  • Design, build, and maintain automated CI/CD pipelines to enable fast, secure, and reliable deployments.
  • Provision, manage, and optimize core AWS services to support scalable, highly available applications.
  • Implement and maintain IaC frameworks to ensure infrastructure is version-controlled, repeatable, and auditable.

Arine is a healthcare technology and clinical services company dedicated to ensuring individuals receive the safest and most effective treatment. They are backed by leading healthcare investors and collaborate with top healthcare organizations, managing more than 18 million lives across prominent health plans.

Latin America

  • Cloud Engineering experience with AWS, GCP, and/or Azure.
  • Designed and maintained CI/CD process and tools.
  • In-depth experience with orchestration and config management tools.

Bluelight is a leading software consultancy dedicated to designing and developing innovative technology that enhances users' lives. They focus on quality and customer satisfaction, fostering a collaborative and enriching work environment where each team member can grow and thrive.

$140,200–$175,200/yr
US

  • Own the entire Laboratory Operations Software release process execution, ensuring smooth and timely software releases with minimal downtime.
  • Act as an internal consultant and subject matter expert, coaching individual product teams on best-in-class DevOps practices.
  • Continuously improve and automate infrastructure provisioning, configuration management, application deployment, and testing using tools like Terraform, Kubernetes and CI/CD.

Natera is a global leader in cell-free DNA (cfDNA) testing, dedicated to oncology, women’s health, and organ health, aiming to make personalized genetic testing standard. The Natera team consists of statisticians, geneticists, doctors, laboratory scientists, business professionals, software engineers, and many other professionals from world-class institutions, who care deeply for the work and each other.

$110,000–$130,000/yr
US 2w PTO

  • Ensure uptime and performance through monitoring, incident response, and preventive measures.
  • Build and maintain CI/CD pipelines for smooth software releases.
  • Implement security best practices across infrastructure, applications, and data.

ALIS values and promotes diversity. They are an equal opportunity employer.

US

  • Ensure near-zero downtime with monitoring and alerting, self-healing automation, and continuous improvement
  • Create highly automated, available and scalable systems by applying software and infrastructure principles
  • Employ and advise clients on DevOps and SRE principles and practices, covering deployment pipelines, HA, service reliability, technical debt, and operational toil for live services running at scale

66degrees is an AI transformation partner. They guide enterprises from business challenges to quantifiable outcomes, helping businesses reach their inflection point where chaotic data becomes a strategic asset, complexity becomes clarity, and AI becomes an engine for growth. They believe in thriving through challenges and winning together.

US

  • Lead incident response as Incident Commander, coordinating teams, communications, and service restoration
  • Produce executive-level incident reports, run RCAs, and drive continuous improvement
  • Enforce change management and risk assessment for production changes

Truelogic is a leading provider of nearshore staff augmentation services headquartered in New York, delivering top-tier technology solutions to companies of all sizes. Their team of 600+ highly skilled tech professionals, based in Latin America, drives digital disruption by partnering with U.S. companies on their most impactful projects.

Europe South America

  • Design, build, and maintain efficient and reliable software and infrastructure delivery pipelines on AWS
  • Recommend service upgrades as new features on the underlying platform (AWS) are released and working
  • Implement and maintain infrastructure as code (IaC) using tools like Terraform

They build and deploy software and infrastructure delivery pipelines. They optimize and maintain production systems and services, set up and monitor key alerts, and balance service reliability with delivery speed.