Jobs Similar to Site Reliability Engineer (SRE)

Senior Site Reliability Engineer

Circle 10 days ago

Americas 7w PTO

Act as a first responder for system incidents and outages, ensuring high availability and performance.
Own and evolve monitoring, alerting, and log management systems while optimizing database infrastructure.
Collaborate with engineering teams to build scalable, resilient systems and contribute to SRE tooling and automation.

Circle is building the world's leading all-in-one platform for online communities. We're a fully remote company of around 200 team members from 30+ countries, with a culture that values autonomy, async collaboration, and high expectations.


Site Reliability Engineer (SRE)

Synthesia 21 hours ago

US

Take ownership of incident management and operational excellence across cloud infrastructure.
Automate high-risk manual processes and drive reliability gains through engineering.
Own a platform domain such as Temporal, observability, or Kubernetes operations.

Synthesia is the world’s leading AI video platform for business, used by over 90% of the Fortune 100. Founded in 2017, the company is headquartered in London with offices across Europe and the US, and has over $530 million in funding from premier investors like Accel and Nvidia's VC arm.


Site Reliability Engineer (E3)

Vynca 4 days ago

US

Design, provision, and manage AWS infrastructure using Terraform and Kubernetes.
Build, operate, and improve observability, monitoring, and incident response processes.
Collaborate with engineering teams on capacity planning, performance optimization, and resilient system design.

Vynca provides comprehensive care for individuals with complex needs, focusing on quality days at home. The company is a close-knit community guided by core values of Excellence, Compassion, Curiosity, and Integrity.


Staff Site Reliability Engineer - Site Experience

Reddit 25 days ago

Europe

Lead reliability engineering for user experience.
Architect for scale, partnering with product and infrastructure teams to design highly available systems.
Drive automation to eliminate repetitive operational work through tooling and systems.

Reddit is a community-based platform where users submit, vote, and comment on various topics. It hosts over 100,000 active communities and attracts millions of daily active users, making it one of the largest and most influential internet platforms.


Senior SRE, Ads

Reddit 24 hours ago

UK Netherlands Ireland Unlimited PTO

Partner with Ads Engineering teams to improve reliability, scalability, and operational excellence of ad-serving and related systems.
Design, build, and maintain infrastructure, tooling, and automation to improve service reliability and engineering productivity.
Participate in on-call rotations, lead incident response, and drive root cause analysis and corrective actions.

Reddit is a community of communities built on shared interests, passion, and trust. With 100,000+ active communities and approximately 126 million daily active unique visitors, it is one of the internet's largest sources of information.


Manager Site Reliability Operations

Mercury Insurance 8 days ago

US

Lead the Site Reliability Operations team, overseeing observability, monitoring, incident response, and operational excellence for key enterprise services.
Partner with product, engineering, and infrastructure teams to embed CI/CD and release best practices, automating build/test/deploy and release monitoring.
Own problem management, driving root cause analysis and corrective actions to improve system resilience and reduce incident impact.

Mercury Insurance helps people reduce risk and overcome unexpected events, and has served customers for over 60 years. The company is recognized as one of America's Best Midsize Employers for 2026, with a collaborative culture focused on growth and inclusion.


Staff SRE, Ads

Reddit 19 hours ago

Europe

Lead reliability initiatives across multiple Ads domains including ad serving, auctions, targeting, reporting, measurement, and billing.
Partner with engineering leadership to improve reliability, scalability, operational excellence, and engineering efficiency across the Ads organization.
Design and build platforms, tooling, and automation that improve reliability and developer productivity at scale.

Reddit is a community of communities, built on shared interests, passion, and trust, home to the most open and authentic conversations on the internet. With 100,000+ active communities and approximately 126 million daily active unique visitors, it is one of the internet's largest sources of information.


Sr. Site Reliability Engineer

Versant 1 day ago

US

Lead design and operation of internal developer platforms and self-service infrastructure.
Build and optimize CI/CD pipelines, deployment workflows, and automation across GitHub Actions, Jenkins, ArgoCD.
Apply SRE principles to improve developer-facing systems and software delivery performance.

Versant is a media company owning iconic brands in news, sports, and entertainment, including USA Network, Fandango, and Rotten Tomatoes. It is an independent, publicly traded company with a collaborative, inclusive culture and a remote-first work environment.


Senior Site Reliability Engineer (Remote Build)

Remote 5 days ago

Global Unlimited PTO 16w maternity 16w paternity

Own the operational excellence and infrastructure strategy for Remote Build's platform, ensuring reliability, performance, and security.
Lead incident response, build observability systems, and drive continuous improvement in system reliability.
Embed security into infrastructure, optimize costs, and automate operational toil to scale efficiently.

Remote solves modern organizations' biggest challenge of navigating global employment compliantly. With a fully distributed team across 6 continents, the company fosters a future-focused culture with core values of innovation and async work.


Sr Site Reliability Engineer

Jobgether 9 hours ago

US

Ensure reliability, availability, and observability for a large-scale cloud-based SaaS platform serving millions in education.
Design and maintain infrastructure-as-code and CI/CD pipelines while leading incident response and resolution.
Mentor peers and integrate AI-driven tools to improve SRE workflows and system performance.

Jobgether is an AI-powered job matching platform that connects candidates with hiring companies. The company manages the application process and uses AI to shortlist top-fitting candidates based on core requirements.


Staff Engineer, Site Reliability

Babylist 9 days ago

US Canada

Own and evolve AWS infrastructure using Terraform, managing EKS clusters, databases, and core services.
Maintain CI/CD reliability and developer tooling across the full engineering org.
Lead incident response, drive post-incident reviews, and improve monitoring and alerting standards.

Babylist is the leading platform for expecting and new families, helping parents feel confident, connected, and cared for at every step. As a modern, AI-forward tech company with over 10 million yearly shoppers, Babylist has expanded into a full ecosystem and generated $750M in revenue in 2025, reshaping the $235B kids and baby market.


Senior/Staff Platform Engineer

VRChat 30 days ago

Global Unlimited PTO

Improve the reliability, performance, and scalability of our production platform.
Operate reliable infrastructure, improve observability, and drive incident response.
Use data-driven reliability practices such as SLIs, SLOs, SLAs, and DORA metrics.

VRChat is a game-changing platform that provides an endless collection of social VR experiences. They empower their community to bring their imaginations to life and help shape the metaverse. Their team includes people from Netflix, Twitter, Meta, and Microsoft.


Site Reliability Engineer II

Openly 25 days ago

$115,200–$172,800/yr

US 8w paternity

Build internal tooling to help other engineers and the rest of the company understand and operate our system.
Design and implement security best practices for our team and infrastructure.
Reduce toil through automation, including building and maintaining CI/CD infrastructure.

Openly is rebuilding insurance from the ground up by re-envisioning and enhancing every aspect of the customer experience. They are a rapidly growing team of exceptional, curious, empathetic people with a wide range of skill sets, spanning many departments.


Site Reliability Engineer (SRE)

Altera Digital Health 10 days ago

US

Ensure reliability, scalability, and performance of hosted healthcare platforms.
Lead incident response, root cause analysis, and implement proactive monitoring.
Automate operational tasks using scripting and Infrastructure-as-Code.

Altera Digital Health empowers healthcare providers to deliver superior care through innovative technology. The company is part of Constellation Software Inc., Canada's largest software company, offering a supportive and award-winning culture with opportunities for growth.


Senior Site Reliability Engineer II - Infrastructure (AI Native)

Life360 5 days ago

Canada

Build and maintain infrastructure platforms for over 200 backend services running on Kubernetes clusters with 40,000+ cores.
Lead and mentor other engineers, own complex infrastructure failures, and participate in a shared on-call rotation.
Drive cloud cost efficiency, estimate schedules, and use AI tools as first-class collaborators in daily workflows.

Life360's mission is to keep people close to the ones they love through location sharing, safe driver reports, and crash detection. The company serves approximately 97.8 million monthly active users across more than 180 countries and has more than 500 remote-first employees.


Engineering Manager, CloudOps Infrastructure

Jamloop 4 days ago

US

Manage a scrum team of 4-6 engineers building and operating high-volume bidder systems.
Oversee AWS-based cloud infrastructure processing over 1 billion HTTP requests per hour.
Drive improvements in reliability, performance, and cost efficiency across production systems.

Jamloop builds high-scale advertising technology for real-time bidding systems. We are a remote-first company focused on reliability and operational excellence.


Sr. Production Engineer

Zscaler 4 days ago

US

Implement highly available, scalable infrastructure across AWS, GCP, and bare-metal environments.
Drive an "automation-first" culture by writing code in Python/Go to build self-healing systems.
Act as lead Incident Commander, develop response playbooks, and conduct post-incident analyses.

Zscaler accelerates digital transformation to secure customers with a cloud-native Zero Trust Exchange platform. The company processes over 200 billion transactions daily and fosters a culture of execution, collaboration, and accountability.


Senior Site Reliability Engineer

Loadsmart 30 days ago

Brazil Unlimited PTO

Collaborate with a tight-knit development team.
Design, deploy, and operate critical systems balancing reliability, cost, and agility.
Perform troubleshooting and root-cause analysis of system operation issues.

Loadsmart is a logistics technology company valued at over $1 billion. We are a collection of industry veterans and user-centered engineers using innovative technology to fearlessly reinvent the future of freight.


Senior Database Reliability Engineer

Scribe 11 days ago

Unlimited PTO

Own database reliability across Aurora, OpenSearch, Redis, and the CDC pipeline, including schema design reviews, migration safety, and incident response.
Make the Django ORM a strength at scale by catching N+1 patterns, extending QuerySet conventions, and building CI checks that encode standards.
Build self-service tooling and dashboards giving teams visibility into their query footprint, and contribute to onboarding and knowledge-sharing as the engineering org grows.

Scribe provides a Workflow AI platform that automatically captures and optimizes how work gets done, used by 94% of the Fortune 500. The company has grown to over 5 million daily active users across 600,000 businesses, reached $100M ARR in May 2026, and is a Series C company valued at $1.3 billion. It fosters a builder culture with a high bar and fast pace.


Senior DevOps Engineer

Jobgether 12 days ago

Canada

Own and operate production cloud environments, ensuring high availability, reliability, and performance across distributed systems.
Design, build, and maintain scalable infrastructure using automation-first principles and Infrastructure as Code practices.
Drive automation initiatives and continuous improvement across infrastructure, deployment, and operational workflows.

Jobgether is an AI-powered job matching platform that connects candidates with hiring companies. They have an inclusive, employee-driven culture with a strong focus on collaboration and innovation.

