Source Job

$175,000–$195,000/yr
Americas Unlimited PTO 16w maternity

  • Lead effective squad rituals and ensure production readiness.
  • Partner with engineers to ensure solutions are scalable, architecturally sound, flexible, and secure.
  • Provide timely, specific coaching and development opportunities for your direct reports.

SRE SaaS Infrastructure Engineering Management

20 jobs similar to Site Reliability Engineering Manager

Jobs ranked by similarity.

US Canada Europe

  • Lead a global team of Site Reliability Engineers.
  • Recruit, hire, onboard and develop engineers.
  • Guide project planning by defining milestones and identifying dependencies.

AuthZed creates and maintains SpiceDB and the authorization infrastructure. They are a Series A company with a fully remote, hardworking, close-knit team across the US, Canada, and Europe, and a software-driven culture that values integrity, collaboration, and open-mindedness.

$150,000–$167,000/yr
US

  • Lead reliability-focused design and readiness reviews.
  • Build, operate, and continuously improve our observability stack.
  • Own and evolve incident management practices.

Transcend is building the privacy platform that easily embeds privacy into your entire tech stack. They are growing quickly, backed by top-tier investors and are proud to serve some of the world's most iconic brands.

$109,800–$252,500/yr
US Unlimited PTO 16w maternity 8w paternity

  • Design, implement, and maintain scalable and reliable infrastructure solutions.
  • Automate deployments and maintain a resilient, secure SaaS application platform.
  • Develop comprehensive monitoring and alerting solutions, and respond to incidents.

Veeam is the #1 global market leader in data resilience. They believe businesses should control all their data whenever and wherever they need it, and they provide data resilience through data backup, data recovery, data portability, data security, and data intelligence. Based in Seattle, Veeam protects over 550,000 customers worldwide who trust Veeam to keep their businesses running.

$230,316–$349,538/yr
US

  • Ensuring the productivity, career development, and satisfaction of your teams
  • Enhancing our infrastructure offering by applying industry best practices and processes
  • Brainstorming solutions to, and refining the scope of, your team’s infrastructure work

Calendly's platform makes it easy for millions to schedule meetings. This exciting company offers employees the opportunity to learn, grow, and do their best work with talented colleagues.

APAC

  • Lead and develop a high-performing GitLab SaaS Production Engineering team.
  • Drive the unification of platforms, tooling, and processes.
  • Collaborate with teams to define, prioritize, and manage the team roadmap.

GitLab is an open-core software company that develops the most comprehensive AI-powered DevSecOps Platform, used by more than 100,000 organizations. Their high-performance culture is driven by their values and continuous knowledge exchange, enabling their team members to reach their full potential.

US

  • Ensure near-zero downtime with monitoring and alerting, self-healing automation, and continuous improvement
  • Create highly automated, available and scalable systems by applying software and infrastructure principles
  • Employ and advise clients on DevOps and SRE principles and practices, covering deployment pipelines, HA, service reliability, technical debt, and operational toil for live services running at scale

66degrees is an AI transformation partner. They guide enterprises from business challenges to quantifiable outcomes, helping businesses reach their inflection point where chaotic data becomes a strategic asset, complexity becomes clarity, and AI becomes an engine for growth. They believe in thriving through challenges and winning together.

US Canada Europe Asia

  • Automate the provisioning of all of Juniper Square’s infrastructure in code.
  • Partner with our Platform Engineering team on building developer tooling / improving developer experiences via joint initiatives and enhancements.
  • Partner with our Data Engineering team on improving our data posture and driving operational excellence.

Juniper Square's mission is to unlock the full potential of private markets by digitizing them to bring efficiency, transparency, and access. They are a values-driven organization with a hybrid workplace strategy, allowing employees to collaborate effectively across multiple countries and offering physical offices in several major cities.

US Canada

  • Maintain tooling, libraries, and infrastructure leveraged by core service teams
  • Develop and maintain infrastructure services that enable engineers to manage, deploy, and scale systems
  • Act as a technical leader, guiding core service teams to design robust and reliable software

StackAdapt is a technology company that empowers marketers to reach, engage, and convert audiences with precision. They are an AI-powered platform connecting brand and performance marketing, recognized for their diverse workplace and high-performing campaigns.

US Canada Unlimited PTO

  • Partner with Product, Design, Customer Success, and Finance to align priorities and deliver measurable customer and business outcomes.
  • Build an inclusive, high-performing team culture grounded in trust, ownership, and continuous improvement.
  • Ensure systems are designed and operated for reliability, scalability, and resilience; lead incident readiness and post-incident learning.

Traackr is a global SaaS technology company providing a data-driven influencer marketing platform. They help marketers optimize investments, streamline campaigns, and scale programs; their customers range from large companies to indie brands. They are a remote-first company with offices in San Francisco, New York, Boston, Paris, and London, and they operate on a culture of mutual respect.

Global

  • Partner with engineers to build dev tools that empower developer workflows and deployment infrastructure.
  • Ensure reliability of multi-cloud Kubernetes clusters and pipelines.
  • Provide metrics, logging, analytics, and alerting for performance and security across all endpoints and applications.

Cresta is on a mission to turn every customer conversation into a competitive advantage by unlocking the true potential of the contact center. Their platform combines the best of AI and human intelligence to help contact centers discover customer insights and behavioral best practices.

Europe

  • Own the reliability, scalability, and performance of Peec AI’s core systems and infrastructure
  • Design, build, and maintain the tooling, automation, and monitoring that keep our services fast, secure, and highly available
  • Partner closely with product and engineering teams to ensure new features are reliable, observable, and easy to operate from day one

Peec AI is one of Europe’s fastest-growing Series A startups. They provide exciting and challenging work in the AI space.

$113,082–$175,725/yr
Canada

  • Operate and maintain large-scale data systems, ensuring stability and performance.
  • Design, implement, and optimize deployment processes using virtualization.
  • Monitor system health, analyze failures, and identify instability sources.

Jobgether is a platform that uses AI-powered matching to connect candidates with companies. They ensure applications are reviewed quickly, objectively, and fairly, then share a shortlist of top candidates directly with the hiring company.

US Unlimited PTO

  • Contribute to high impact AWS cloud infrastructure initiatives.
  • Participate in operability and production readiness reviews.
  • Advocate and implement Site Reliability Engineering practices.

Patreon is a media and community platform where creators give fans access to exclusive work. They have generated over $10 billion for creators and have 25 million+ paid memberships, with a hybrid work model and offices in New York and San Francisco.

US

  • Ensure the smooth operation and high availability of Clarifai's core services
  • Monitor system performance, identify bottlenecks, and implement optimizations to enhance reliability and efficiency
  • Design and implement scalable, secure, and cost-effective infrastructure solutions

Clarifai is a leading AI platform specializing in computer vision and generative AI, empowering organizations to transform unstructured data into actionable insights. Founded in 2013, they have a diverse, globally distributed team with $100M in funding and are committed to building a diverse and inclusive team.

US

  • Build and lead a high-performing engineering team.
  • Design, build, and scale new user experiences end-to-end.
  • Collaborate with Product and Design partners to deliver scalable systems.

Rula is dedicated to treating the whole person and creating a world where mental health is embraced. They aim to empower individuals to take charge of their mental health and achieve their full potential in the field of mental healthcare.

Europe Middle East Africa

  • Design, deploy and maintain a cloud infrastructure to support a Dataiku SaaS offering, mainly on AWS, Azure, and GCP
  • Continuously improve the infrastructure, deployment and configuration to deliver more reliable, resilient, scalable and secure services
  • Automate all technical operations as much as possible

Dataiku is The Universal AI Platform™, giving organizations control over their AI talent, processes, and technologies to unleash the creation of analytics, models, and agents. They connect many data science technologies and integrate the best of data and AI tech.

Global

  • Design and implement reliable and scalable AWS architecture.
  • Support the CI/CD process with ArgoCD and GitOps, automating deployments with Terraform.
  • Optimize system performance and troubleshoot issues, collaborating with development teams.

Cloudbeds is transforming hospitality with its intelligently designed platform that powers properties across 150 countries. They are a completely remote team of 650+ employees across 40+ countries, focused on building AI-powered solutions for hotels.

US

  • Work directly with customers to ensure successful Teleport deployments.
  • Meet regularly with customers, understand pain points blocking deployments and remove roadblocks.
  • Work with customers to articulate the problem they are trying to solve, gather requirements, and make the business case to the product and engineering teams to invest in resolving the issue.

Teleport is the Infrastructure Identity Company, modernizing identity, access, and policy for infrastructure, improving engineering velocity and resiliency of critical infrastructure against human factors and/or compromise. They are a fast-growing, well-funded Y-Combinator company that values craft, strongly supports work/life balance, and embraces a culture of humility, honesty, and transparency.

Global

  • Own and operate core platform systems across AWS, GCP, Vercel, GitHub, and Cloudflare.
  • Improve reliability, scalability, and security of production and non-production environments.
  • Improve local development environments and onboarding experience for engineers.

Moxie empowers ambitious aesthetic entrepreneurs to build profitable, independent practices. A global, remote-first team of more than 140 people supports hundreds of practices nationwide as they unlock sustainable success for aesthetic entrepreneurs.

$170,000–$220,000/yr
US 3w PTO

  • Own infrastructure as a product, serving Atticus's product engineering teams as your customers.
  • Shape our infrastructure roadmap: develop a clear vision for where our infra needs to go and drive progress toward it.
  • Empower product teams: build the platforms and tools that let them own their systems end-to-end.

Atticus makes it easy for any sick or injured person in crisis to get the life-changing aid they deserve. In 2025, their team grew to 210, and they will grow again in 2026; they have ambitions to create a category-defining business assisting needy Americans.