Jobs Similar to Manager Site Reliability Operations

Technology Operations Manager

Business Wire 21 days ago

$200,000–$225,000/yr

Lead the evaluation, adoption, and execution of technology initiatives.
Recruit, mentor, and motivate a high-performance operations staff.
Drive operational excellence through structured incident, problem, and change management practices.

Business Wire is a press release distribution company. The company's total rewards include remote work, health benefits, fitness allotment, and a 401(k) plan.

Senior Site Reliability Engineer

EarnIn 30 days ago

Mexico

Design systems with resilience, graceful degradation, and capacity in mind.
Define and measure SLOs and SLIs that actually reflect what our customers feel.
Use Datadog (logging, metrics, APM) together with CloudWatch to build signal-heavy, noise-light observability.

EarnIn is building products that deliver real-time financial flexibility for those with the unique needs of living paycheck to paycheck. They are growing fast and are excited to continue bringing world-class talent onboard to help shape the next chapter of their growth journey.

Senior Site Reliability Engineer

MZLA Technologies Corporation 4 days ago

US 5w PTO

Design and develop CI/CD systems for websites, services, and release workflows, and operate an EKS-based Kubernetes platform.
Diagnose and debug production incidents, drive root-cause analysis, and implement improvements to enhance system reliability.
Write and maintain infrastructure as code using Pulumi or Terraform/OpenTofu across multiple AWS accounts with security-conscious practices.

Thunderbird is one of the world’s most trusted open-source email applications, empowering more than 20 million people globally. Our small but growing distributed team includes 65+ people across seven countries, and we build privacy-respecting communication tools with a collaborative, inclusive, and user-first spirit.

Staff Site Reliability Engineer I EMEA

Remote 25 days ago

$188,550–$212,150/yr

Global Unlimited PTO

Own the technical direction of Remote's SRE/Platform domain.
Define and drive the reliability strategy across the platform.
Identify and lead AI enablement initiatives across the engineering organisation.

Remote is solving modern organizations’ biggest challenge – navigating global employment compliantly with ease. With our core values at heart and a future-focused work culture, our team works tirelessly on ambitious problems, asynchronously, around the world.

Site Reliability Engineer

SupplyHouse.com 27 days ago

$29,000–$36,000/yr

India

Design, build, and maintain scalable, reliable systems on GCP.
Develop automation for infrastructure provisioning using Terraform, Ansible, or Deployment Manager.
Manage incident response, conduct postmortems, and implement improvements to reduce recurrence.

SupplyHouse.com is an industry-leading e-commerce company specializing in HVAC, plumbing, heating, and electrical supplies since 2004. They value every individual team member and cultivate a community where people come first with Generosity, Respect, Innovation, Teamwork, and GRIT.

Site Reliability Engineer (SRE)

Altera Digital Health 2 days ago

US

Ensure reliability, scalability, and performance of hosted healthcare platforms.
Lead incident response, root cause analysis, and implement proactive monitoring.
Automate operational tasks using scripting and Infrastructure-as-Code.

Altera Digital Health empowers healthcare providers to deliver superior care through innovative technology. The company is part of Constellation Software Inc., Canada's largest software company, offering a supportive and award-winning culture with opportunities for growth.

Senior Site Reliability Engineer

Redcare Pharmacy 14 days ago

Germany

Build and maintain end-to-end observability with ELK, Prometheus, and Grafana.
Own and improve CI/CD pipelines (CircleCI, GitLab CI, GitHub Actions, ArgoCD).
Lead incident response and postmortems in a blameless culture.

Redcare Pharmacy is Europe’s No.1 e-pharmacy, powered by passionate teams and cutting-edge innovation. They strive to create a healthy, collaborative work environment where every employee feels valued and inspired to contribute to their vision “Until every human has their health”.

Senior AIOps Engineer, Incident Response

Quanata 15 days ago

$215,000–$280,000/yr

US 4w PTO 12w maternity 12w paternity

Own production health, reliability, and operational support processes across critical systems and services.
Lead incident response efforts, stakeholder communication, root cause analysis, and post-incident reviews.
Design and implement AI-driven agents and workflows that automate support and operational tasks.

Quanata is on a mission to help ensure a better world through context-based insurance solutions. They are an exceptional, customer centered team with a passion for creating innovative technologies, digital products, and brands. Quanata, LLC is wholly owned and funded by State Farm.

Engineering Manager, Core Platform

Vanta 15 days ago

Canada US 4w PTO

Lead and grow high-performing platform engineering teams that deliver reliable, scalable infrastructure and operational excellence for Vanta’s products and customers.
Set technical direction and drive multi-quarter platform initiatives spanning infrastructure reliability, security, scalability, and developer experience across shared systems and services.
Partner closely with product engineering, security, and engineering leadership to identify organizational needs and deliver scalable platform solutions.

Vanta helps businesses earn and prove trust by empowering companies to practice better security and prove it with ease. They have a kind and talented team, and while some have prior security experience, many have been successful without it.

Staff Engineer, Site Reliability

Babylist 1 day ago

US Canada

Own and evolve AWS infrastructure using Terraform, managing EKS clusters, databases, and core services.
Maintain CI/CD reliability and developer tooling across the full engineering org.
Lead incident response, drive post-incident reviews, and improve monitoring and alerting standards.

Babylist is the leading platform for expecting and new families, helping parents feel confident, connected, and cared for at every step. As a modern, AI-forward tech company with over 10 million yearly shoppers, Babylist has expanded into a full ecosystem and generated $750M in revenue in 2025, reshaping the $235B kids and baby market.

Senior Site Reliability Engineer

Circle 2 days ago

Americas 7w PTO

Act as a first responder for system incidents and outages, ensuring high availability and performance.
Own and evolve monitoring, alerting, and log management systems while optimizing database infrastructure.
Collaborate with engineering teams to build scalable, resilient systems and contribute to SRE tooling and automation.

Circle is building the world's leading all-in-one platform for online communities. We're a fully remote company of around 200 team members from 30+ countries, with a culture that values autonomy, async collaboration, and high expectations.

Staff Site Reliability Engineer - Site Experience

Reddit 18 days ago

Europe

Lead Reliability Engineering for User Experience.
Architect for Scale, partnering with product and infrastructure teams to design highly available systems.
Drive Automation to eliminate repetitive operational work through tooling and systems.

Reddit is a community-based platform where users submit, vote, and comment on various topics. It hosts over 100,000 active communities and attracts millions of daily active users, making it one of the largest and most influential internet platforms.

Lead SRE/DevOps Engineer

Launch Potato 28 days ago

$160,000–$190,000/yr

US

Own and evolve Launch Potato's cloud infrastructure, CI/CD platform, and compliance posture.
Build the SRE function from the ground up so product teams can ship faster without compromising reliability, security, or cost control.
Stand up the SRE practice from scratch: on-call rotation, PagerDuty configuration, SLA/SLO definitions for core infrastructure services, runbook library, and observability dashboards that tie site performance to business metrics.

Launch Potato is a digital media company that connects consumers with leading brands through data-driven content and technology. They are headquartered in South Florida with a remote-first team spanning over 15 countries, with a high-growth, high-performance culture.

Senior DevOps Engineer

Jobgether 4 days ago

Canada

Own and operate production cloud environments, ensuring high availability, reliability, and performance across distributed systems.
Design, build, and maintain scalable infrastructure using automation-first principles and Infrastructure as Code practices.
Drive automation initiatives and continuous improvement across infrastructure, deployment, and operational workflows.

Jobgether is an AI-powered job matching platform that connects candidates with hiring companies. They have an inclusive, employee-driven culture with a strong focus on collaboration and innovation.

Senior Site Reliability Engineer, Infrastructure Foundations

Wikimedia Foundation 30 days ago

$113,082–$175,725/yr

US Global

Performing day-to-day operational/DevOps tasks on Wikimedia’s public facing infrastructure.
Implementing and utilizing configuration management and deployment tools.
Leading continuous improvement by automating the installation, configuration, and maintenance of services on our platform.

The Wikimedia Foundation operates Wikipedia and other Wikimedia free knowledge projects with the vision of a world where every single human can freely share in the sum of all knowledge. As a charitable, not-for-profit organization, it relies on donations and has staff members based in 40+ countries.

Senior Developer Experience Engineer

Huntress 4 days ago

US 12w maternity 12w paternity

Design and build tools and frameworks to automate operational tasks and deployments for Portal and Endpoint Agents.
Evolve AI tooling and workflows to enhance developer productivity and integrate AI into daily development.
Build and maintain CI/CD pipelines, support product teams, and optimize software architecture for scalability and reliability.

Huntress is a cybersecurity company founded in 2015 by former NSA cyber operators, focused on protecting small to midsize businesses from cyber attacks through its award-winning security platform and expert human threat hunters. The company is fully remote and fosters a culture of inclusivity, innovation, and collaboration.

Staff Software Engineer - Grafana Cloud k6

Grafana Labs 29 days ago

Germany 6w PTO

Build and scale a strong culture of operational excellence by defining standards and coaching teams to own reliability and availability.
Drive mature DevOps/SRE practices, including incident response and PIRs, on-call readiness, runbooks, alerting, observability, and release/change management.
Guide teams in the design, development, evolution, and operation of large-scale, distributed cloud systems.

Grafana Labs is a remote-first, open-source powerhouse with more than 20M users of Grafana around the globe. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, and their team thrives in an innovation-driven environment.

Senior Site Reliability Engineer

Loadsmart 22 days ago

Brazil Unlimited PTO

Collaborate with a tight-knit development team.
Design, deploy, and operate critical systems balancing reliability, cost, and agility.
Perform troubleshooting and root-cause analysis of system operation issues.

Loadsmart is a logistics technology company valued at over $1 billion. We are a collection of industry veterans and user-centered engineers using innovative technology to fearlessly reinvent the future of freight.

Security Operations Manager

Unit4 12 hours ago

Global

Lead the Security Operations Team to protect global IT infrastructure, ensuring system confidentiality, integrity, and availability.
Oversee incident response, vulnerability management, and continuous security posture improvements across the organization.
Collaborate with IT, Engineering, and Compliance teams to embed security into every layer of the business.

Unit4 is a cloud ERP company redefining enterprise resource planning for mid-market people-centric organizations. With over 40 years of heritage, it fosters a people-first culture with a high-performance team and a focus on employee empowerment.

Infrastructure Operations Specialist

Mercer Advisors 30 days ago

$32–$38/hr

US

Continuously monitor infrastructure, cloud platforms, identity systems, networking, and security tooling using centralized monitoring and alerting solutions.

Mercer Advisors helps families amplify and simplify their financial lives by integrating financial planning, investment management, business management, tax, estate, insurance, and more, managed by a single team. They serve over 31,300 families across 90+ cities in the U.S. and are ranked the #1 RIA Firm in the nation by Barron’s for two consecutive years.
