Source Job

4w PTO

  • Work closely with developers to prototype and design new features as part of the infrastructure.
  • Deploy, install, configure, and maintain sophisticated trading/finance and related software.
  • Build & maintain CI/CD pipelines.

Linux UNIX Ansible Terraform Docker

20 jobs similar to Site Reliability Engineer (SRE)

Jobs ranked by similarity.

$104,839–$252,194/yr
US

  • Work independently and gather requirements to design and implement CI/CD with tight collaboration.
  • Implement best practices for software delivery, continuous integration, and continuous deployment (CI/CD).
  • Drive automation efforts across the organization utilizing Infrastructure as Code (IaC).

Mercury has been guided by its purpose to help people reduce risk and overcome unexpected events for more than 60 years. They reward their talented professionals with a competitive salary, bonus potential, and a variety of benefits to help team members reach their health, retirement, and professional goals.

US Unlimited PTO

  • Design, build, and maintain scalable infrastructure and tooling that improves reliability, performance, and availability across OnePay’s platform
  • Contribute to the evolution of our observability stack, platform libraries, cloud architecture, and CI/CD pipelines
  • Develop automation and monitoring systems to detect, prevent, and remediate incidents before they impact customers

OnePay is a consumer fintech company trusted by millions of Americans to make money better, providing an all-in-one financial services platform. Backed by Walmart and Ribbit Capital, OnePay provides banking, savings, credit cards, lending, investing, and crypto services and embedded financial services to frontline workers.

$160,000–$200,000/yr
US

  • Help drive reliability, automation and performance within our cloud-based infrastructure.
  • Become embedded within an Engineering team helping them navigate production excellence and advocate for best practices.
  • Debug production issues across services and levels of the stack as well as practice incident response and blameless postmortems.

Flywire is a global payments enablement and software company that was founded over a decade ago. They have over 1,200 global FlyMates, representing more than 40 nationalities, in 12 offices worldwide, and are looking for people to join the next stage of their journey as they continue to grow.

Canada

  • Designing and implementing SLI/SLO frameworks with error budgets to guide reliability and performance decisions.
  • Building and maintaining AWS-based production infrastructure using Infrastructure as Code (Terraform, CloudFormation), including ECS, EKS/Kubernetes, and microservices orchestration.
  • Developing internal tools, automation frameworks, and reliability services in TypeScript, Python, or similar languages to enhance operational efficiency.

Jobgether uses an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. They identify the top-fitting candidates, and this shortlist is then shared directly with the hiring company.

Unlimited PTO

  • Build and operate cutting-edge cloud infrastructure to support Diagrid's core products
  • Define standards, deliver tools, processes, and frameworks to make our products secure, reliable, efficient, and highly available
  • Build and maintain CI/CD pipelines that enable delivering software quickly and securely across clouds

Diagrid believes that open-source software, open standards and APIs are the greatest transformational tools for organizations. They provide developers with APIs and tools that help them focus on their code and not on infrastructure and are founded by the creators of the Dapr and KEDA open-source projects.

$141,000–$230,000/yr
US

  • Collaborate with engineering teams to design and implement scalable, secure systems.
  • Establish and manage service level objectives (SLOs) and service level agreements (SLAs).
  • Enhance incident response processes and post-mortem analysis for outages.

ClickHouse, recognized on the 2025 Forbes Cloud 100 list, is one of the most innovative and fast-growing private cloud companies. With more than 3,000 customers and ARR that has grown over 250 percent year over year, ClickHouse leads the market in real-time analytics, data warehousing, observability, and AI workloads.

US

  • Collaborate with application engineering teams on platform infrastructure.
  • Enhance observability and spearhead the adoption of SRE best practices.
  • Build and maintain reliable CI/CD pipelines, tooling, and infrastructure.

Rula strives to provide quality, evidence-based, compassionate mental healthcare and aims to create a world where mental health is no longer stigmatized. They are a remote-first company operating in most U.S. states, and are dedicated to having a culture of inclusion that supports their employees.

$120,000–$180,000/yr
US

  • Develop automation code to provision and operate infrastructure at scale.
  • Build resilient, scalable, secure, and observable services with cost optimization.
  • Proactively identify and address security concerns across systems and infrastructure.

Globality uses AI to transform enterprise spending into a more efficient and inclusive process. They aim to revolutionize enterprise procurement with AI and have a culture built on trust, collaboration, and innovation, fostering an environment where every individual feels valued and included.

$172,614/yr
US

  • Design infrastructure, networking, and software platform architecture.
  • Build and maintain automation of Continuous Integration and Continuous Deployment pipelines.
  • Troubleshoot infrastructure, internal applications, networking, and security issues.

Loadsmart is a technology company focused on the logistics and supply chain industry. They leverage data and technology to automate and optimize freight transportation, connecting shippers and carriers to streamline the shipping process. They are a mid-sized company passionate about transforming the future of freight.

US

  • Architect and deploy on-premise and cloud-based Linux infrastructure.
  • Develop and maintain Infrastructure-as-Code (IaC) frameworks using Terraform and Ansible.
  • Implement system-level security best practices including patching and hardening.

Jobgether uses an AI-powered matching process to ensure applications are reviewed quickly, objectively, and fairly against the role's core requirements. They identify the top-fitting candidates, and this shortlist is then shared directly with the hiring company.

$110,000–$175,000/yr
US

  • Become a subject matter expert in applications supporting Ooma customers.
  • Collaborate with Development, QA and other SREs to evaluate, deploy, and debug applications.
  • Improve observability by implementing, refining, and adjusting application monitoring and thresholds.

Ooma empowers people to connect in smarter ways by creating powerful communication experiences through their cloud-based platform. They help small business owners stay connected, provide customized unified communications solutions, and offer smart home security solutions.

$140,000–$180,000/yr
Americas Unlimited PTO 16w maternity

  • Build and scale infrastructure to support billions of messages per day and real-time events
  • Automate deployments, alerting, and incident response
  • Tune MySQL and other datastore performance and improve reliability across distributed systems

Customer.io's platform enables over 8,000 companies, from scrappy startups to global brands, to send billions of automated emails, push notifications, in-app messages, and SMS every day. They foster a culture that values empathy, transparency, and responsibility.

South America

  • Own the end‑to‑end lifecycle of core platform components, including cloud infrastructure primitives and Kubernetes clusters.
  • Design platform components to be resilient by default, applying SRE principles like fault isolation and capacity planning.
  • Drive Infrastructure‑as‑Code and GitOps‑first practices to ensure platform components are reproducible and auditable.

Pismo, founded in 2016, provides a comprehensive processing platform for banking, card issuing, and financial market infrastructure, helping customers innovate in banking and payments. With over 500 employees across 10+ countries, Pismo joined Visa in 2024, leveraging Visa’s solutions to advance financial technology.

$230,000–$250,000/yr
US Unlimited PTO 12w paternity

  • Define and evolve reliability standards for the SmarterDx platform.
  • Enhance observability systems (metrics, logs, traces, alerting) to provide actionable insights and reduce mean time to detect (MTTD) and resolve (MTTR).
  • Reduce operational toil through automation, self-healing systems, and improved deployment and rollback mechanisms.

SmarterDx, a Smarter Technologies company, builds clinical AI that is transforming how hospitals translate care into payment. Founded by physicians in 2020, their platform connects clinical context with revenue intelligence, helping health systems recover millions in missed revenue, improve quality scores, and appeal every denial.

Europe

  • Implement SLI/SLO frameworks with error budgets to drive reliability decisions
  • Design release strategies including blue/green deployments and version tracking
  • Lead incident response and develop automated runbooks to reduce MTTR

Jobgether is a company that helps connect individuals with jobs through an AI-powered matching process. They ensure applications are reviewed quickly, objectively, and fairly against roles' core requirements.

  • Maximize the velocity of our product engineering team.
  • Ensure platform scalability, reliability, and security.
  • Champion best practices and shape the engineering culture.

They are building a robust, scalable trading platform to serve high-traffic, latency-sensitive applications. They leverage state-of-the-art technologies to support real-time trading while providing unparalleled reliability and performance.

Europe

  • Work closely with developers and operations teams to scale and optimize their infrastructure for sustained growth.
  • Design, deploy, and operate their core backend infrastructure using an automated, Infrastructure-as-Code approach.
  • Prioritize and own delivery in a small, highly efficient team — you set the bar, not just maintain it.

Relai is Europe's fastest growing Bitcoin-only app. They are looking for an experienced, results-oriented and impact-driven Senior DevOps Engineer who can help them scale their infrastructure and pursue their mission of bringing the best store of value to more people.

$106,500–$202,500/yr
US

  • Architect new and existing systems to enhance performance, reliability, and scalability.
  • Build, implement, and iterate on CI/CD pipelines.
  • Assist with the management, development, design, and deployment of microservice and containerized applications.

AbbVie's mission is to discover and deliver innovative medicines and solutions that solve serious health issues today and address the medical challenges of tomorrow. They strive to have a remarkable impact on people's lives across several key therapeutic areas.

US

  • Responsible for availability, latency, performance, efficiency, monitoring/observability, emergency response, and capacity planning.
  • Analyze, troubleshoot, and resolve operational challenges, contributing to defined SLOs.
  • Manage site stability, performance, reliability, and maintain uptime for production environments.

CentralReach provides autism and IDD care software for Applied Behavior Analysis (ABA), multidisciplinary therapy, and special education. They are trusted by more than 200,000 users and are backed by Roper Technologies, Inc. (Nasdaq: ROP). Their culture is centered around impact, inclusion, and flexibility.

$146,200–$212,000/yr
US Unlimited PTO

  • Collaborate with service engineering teams to design, implement, and maintain scalable and resilient infrastructure solutions.
  • Implement SRE principles to improve system reliability and reduce downtime.
  • Improve developer workflows by creating self-service tools, optimizing CI/CD pipelines, and enhancing deployment processes.

Flex is a growth-stage FinTech company creating the best rent payment experience. They empower renters with flexibility over their most significant recurring expense and are growing quickly with a focus on building an inclusive culture.