Source Job

$172,614–$172,614/yr
US

  • Design infrastructure, networking, and software platform architecture.
  • Build and maintain automation of Continuous Integration and Continuous Deployment pipelines.
  • Troubleshoot infrastructure, internal applications, networking, and security issues.

Linux Python Go JavaScript

20 jobs similar to Senior Site Reliability Engineer

Jobs ranked by similarity.

US Unlimited PTO

  • Design, build, and maintain scalable infrastructure and tooling that improves reliability, performance, and availability across OnePay’s platform
  • Contribute to the evolution of our observability stack, platform libraries, cloud architecture, and CI/CD pipelines
  • Develop automation and monitoring systems to detect, prevent, and remediate incidents before they impact customers

OnePay is a consumer fintech company trusted by millions of Americans to make money better, providing an all-in-one financial services platform. Backed by Walmart and Ribbit Capital, OnePay provides banking, savings, credit cards, lending, investing, and crypto services and embedded financial services to frontline workers.

$141,000–$230,000/yr
US

  • Collaborate with engineering teams to design and implement scalable, secure systems.
  • Establish and manage service level objectives (SLOs) and service level agreements (SLAs).
  • Enhance incident response processes and post-mortem analysis for outages.

ClickHouse, recognized on the 2025 Forbes Cloud 100 list, is one of the most innovative and fast-growing private cloud companies. With more than 3,000 customers and ARR that has grown over 250 percent year over year, ClickHouse leads the market in real-time analytics, data warehousing, observability, and AI workloads.

Nigeria

  • Detect and triage service and reliability issues.
  • Develop automation to eliminate manual and repetitive operational tasks.
  • Investigate and resolve customer complaints escalated beyond L1 and L2 support.

Moniepoint is an all-in-one financial services platform for emerging markets. Since 2019, Moniepoint’s technology has powered over 3 million people, offering personal and business banking, payment, credit and business management tools to help them succeed.

Unlimited PTO

  • Build and operate cutting-edge cloud infrastructure to support Diagrid's core products
  • Define standards, deliver tools, processes, and frameworks to make our products secure, reliable, efficient, and highly available
  • Build and maintain CI/CD pipelines that enable delivering software quickly and securely across clouds

Diagrid believes that open-source software, open standards and APIs are the greatest transformational tools for organizations. They provide developers with APIs and tools that help them focus on their code and not on infrastructure and are founded by the creators of the Dapr and KEDA open-source projects.

$120,000–$140,000/yr
US Unlimited PTO

  • Architect and manage scalable cloud infrastructure within AWS.
  • Implement and maintain infrastructure using Terraform.
  • Develop automation scripts to improve operational efficiency.

Attune empowers insurance agents with their technology solutions. We foster a remote-first culture and value employee development.

$205,000–$270,000/yr
US Unlimited PTO

  • Partner with engineers to build dev tools that empower developer workflows and deployment infrastructure.
  • Ensure reliability of multi-cloud Kubernetes clusters and pipelines.
  • Focus on automation so we can spend energy where it matters.

Cresta is on a mission to turn every customer conversation into a competitive advantage by unlocking the true potential of the contact center. Their platform combines the best of AI and human intelligence to help contact centers discover customer insights and behavioral best practices.

$120,000–$180,000/yr
US

  • Develop automation code to provision and operate infrastructure at scale.
  • Build resilient, scalable, secure, and observable services with cost optimization.
  • Proactively identify and address security concerns across systems and infrastructure.

Globality uses AI to transform enterprise spending into a more efficient and inclusive process. They aim to revolutionize enterprise procurement with AI and have a culture built on trust, collaboration, and innovation, fostering an environment where every individual feels valued and included.

$118,000–$158,000/yr
US

  • Monitor and troubleshoot system performance, reliability, and availability issues using modern observability tools and techniques.
  • Design, implement, and maintain scalable and reliable infrastructure using containers, Kubernetes, and microservices architecture.
  • Manage CI/CD pipelines to facilitate efficient software development and deployment processes.

Ooma empowers people to connect in smarter ways by creating powerful communication experiences through our cloud-based platform to bring people together at work and at home. Ooma's solutions help small business owners stay connected with their customers and manage their businesses from anywhere.

US

  • Architect and deploy on-premise and cloud-based Linux infrastructure.
  • Develop and maintain Infrastructure-as-Code (IaC) frameworks using Terraform and Ansible.
  • Implement system-level security best practices including patching and hardening.

Jobgether uses an AI-powered matching process to ensure applications are reviewed quickly, objectively, and fairly against the role's core requirements. They identify the top-fitting candidates, and this shortlist is then shared directly with the hiring company.

  • Maximize the velocity of our product engineering team.
  • Ensure platform scalability, reliability, and security.
  • Champion best practices and shape the engineering culture.

They are building a robust, scalable trading platform to serve high-traffic, latency-sensitive applications. They leverage state-of-the-art technologies to support real-time trading while providing unparalleled reliability and performance.

LATAM

  • Monitor production systems, dashboards, logs, and alerts to ensure high availability and performance across distributed environments.
  • Assist in incident detection, triage, escalation, and resolution, following structured on-call rotations with mentorship support.
  • Maintain, follow, and continuously improve runbooks, operational procedures, and incident response workflows.

Jobgether is a platform that helps job seekers find the right opportunities. They use an AI-powered matching process to ensure applications are reviewed quickly and fairly.

$106,500–$202,500/yr
US

  • Architect new and existing systems to enhance performance, reliability, and scalability.
  • Build, implement, iterate over CI/CD pipelines.
  • Assist with the Management, Development, Design, and Deployment of microservice and containerized applications.

AbbVie's mission is to discover and deliver innovative medicines and solutions that solve serious health issues today and address the medical challenges of tomorrow. They strive to have a remarkable impact on people's lives across several key therapeutic areas.

$165,000–$200,000/yr
US Unlimited PTO

  • Contribute to building and operating the infrastructure that supports the HackerOne platform.
  • Improve the reliability, security, and scalability of our systems.
  • Design and operate highly available cloud systems and apply best practices for reliability, observability, and security.

HackerOne is a global leader in Continuous Threat Exposure Management (CTEM). The HackerOne Platform unites agentic AI solutions with the ingenuity of the world’s largest community of security researchers to continuously discover, validate, prioritize, and remediate exposures across code, cloud, and AI systems. They combine the ingenuity of the largest security research community with a best-in-class AI-powered platform, trusted by the world’s top organizations.

Global

  • Own the end-to-end lifecycle (design, provisioning, upgrades, and decommissioning) of core platform components.
  • Lead the design and implementation of infrastructure bootstrap orchestration, including: Automated cluster and environment provisioning.
  • Apply and promote SRE practices across the platform, including: Clear ownership and runbooks for platform components.

Pismo provides a comprehensive processing platform for banking, card issuing and financial market infrastructure and helps customers innovate and build the next generation of banking and payment solutions. Pismo’s 500+ employees are located in more than 10 countries around the world.

Global

  • Maintain and continuously improve production uptime, supporting our ≥99.9% target for 2026.
  • Monitor systems proactively and respond effectively to production incidents.
  • Drive improvements in MTTR (Mean Time to Resolution).

Infiterra's B2B SaaS platform simplifies subscription service delivery, helping IT Distributors and Managed Service Providers (MSPs) automate and grow their subscription business. With 100+ customers in 75 countries, Infiterra is known for its collaborative and growth-oriented culture.

Europe

  • Design and maintain scalable, fault-tolerant infrastructure that supports our SaaS platform and keeps pace with business growth.
  • Define, document, and maintain SLIs, SLOs, and SLAs in partnership with product engineering, translating business commitments into technical guardrails.
  • Lead incident response with steady judgment, facilitate blameless postmortems, and drive remediation efforts that prevent recurrence.

Fixify is on a mission to reimagine IT teams support companies. They need a Senior Site Reliability Engineer who finds joy in building systems that fade into the background, empowering product engineers to ship with confidence and their customers to work without interruption.

Canada

  • Designing and implementing SLI/SLO frameworks with error budgets to guide reliability and performance decisions.
  • Building and maintaining AWS-based production infrastructure using Infrastructure as Code (Terraform, CloudFormation), including ECS, EKS/Kubernetes, and microservices orchestration.
  • Developing internal tools, automation frameworks, and reliability services in TypeScript, Python, or similar languages to enhance operational efficiency.

Jobgether uses an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. They identify the top-fitting candidates, and this shortlist is then shared directly with the hiring company.

$156,000–$211,000/yr
US Canada

  • Own and deliver infrastructure projects end-to-end.
  • Build and improve platform primitives for service teams.
  • Improve observability and implement cost and performance improvements.

Afresh is the leading AI company in fresh food, partnering with grocers to order fresh food. They've experienced record-breaking growth and are on a mission to eliminate food waste. They have over 148 million in funding and embody values of proactivity, kindness, candor, and humility.

US

  • Help deploy and configure Dynatrace OneAgent and ActiveGates with automated tooling.
  • Define and instrument user‑centric metrics and objectives in Dynatrace.
  • Combine Davis® AI with Copilot/Claude to identify root causes and reduce MTTR.

AWP Safety's IT Internship Program is a hands‑on, learning experience for early‑career professionals who want to build a future in IT Site Reliability Engineering. They operate at the intersection of Software Engineering and Systems Operations, using Dynatrace to diagnose performance bottlenecks and automate "toil" out of existence.

South America

  • Own the end‑to‑end lifecycle of core platform components, including cloud infrastructure primitives and Kubernetes clusters.
  • Design platform components to be resilient by default, applying SRE principles like fault isolation and capacity planning.
  • Drive Infrastructure‑as‑Code and GitOps‑first practices to ensure platform components are reproducible and auditable.

Pismo, founded in 2016, provides a comprehensive processing platform for banking, card issuing, and financial market infrastructure, helping customers innovate in banking and payments. With over 500 employees across 10+ countries, Pismo joined Visa in 2024, leveraging Visa’s solutions to advance financial technology.