Jobs Similar to Senior Site Reliability Engineer

Senior Site Reliability Engineer

Calendly 5 days ago

$198,025–$287,952/yr

Building tools and applications to extend Calendly’s infrastructure platform
Evaluating and deploying cloud native open source tools
Exercising expertise in cloud infrastructure concepts and patterns

Calendly's product powers connections for millions through impactful innovation. They are in the midst of exciting growth and desire people that want to learn, grow, and do their best work.

Site Reliability Engineer

Newton 19 days ago

Canada

Implementing improvements to the reliability, fault tolerance, scalability, and performance of our infrastructure
Managing incidents using your technical know-how to involve the appropriate teams and automate away manual practices
Improving observability across our systems (metrics, logs, tracing) to reduce time to detection and resolution

Newton is changing how Canadians trade crypto, with the goal of making financial freedom achievable for everyone by giving its customers the tools and knowledge needed to navigate the crypto world. They are a remote team spread across Canada that values pushing boundaries and getting things done.

Senior Site Reliability Engineer

Akuity 27 days ago

US Canada

Own SLI/SLO/SLA definitions for the Akuity SaaS platform and drive continuous improvement.
Participate in an on-call rotation and act as incident commander for high-severity production events.
Partner with engineering teams to build reliability into new features before they ship to production

Akuity helps enterprises ship software faster and more reliably with modern GitOps best practices. The Akuity Platform enables teams to manage the development and deployment across hundreds – if not thousands – of Kubernetes clusters from a single control plane.

Site Reliability Engineer

Mistral AI 4 hours ago

Europe

Design, build, and maintain scalable, highly available and fault-tolerant infrastructures.
Implement and improve monitoring, alerting, and incident response systems to ensure optimal system performance and minimize downtime.
Drive continuous improvement in infrastructure automation, deployment, and orchestration.

Mistral AI is dedicated to democratizing AI through high-performance, optimized, open-source models, products, and solutions designed to integrate seamlessly into daily working life. They are a dynamic, collaborative team dedicated to innovation and passionate about AI and its potential to transform society.

Senior DevOps / Infrastructure Engineer

Bloomreach 10 days ago

Europe 5w PTO

Work with other Engineering teams to design sustainable infrastructure and microservice solutions.
Automate tools and infrastructure to reduce manual work.
Monitor applications and participate in an on-call rotation as required.

Bloomreach is building the world’s premier agentic platform for personalization, revolutionizing how businesses connect with their customers by building and deploying AI agents to personalize the entire customer journey. They power personalization for more than 1,400 global brands.

Software Engineer / Site Reliability Engineer

LiveKit 26 days ago

Global

Build and own the foundational infrastructure that our products run upon.
Work directly on our products' Golang codebase to implement SRE-related objectives.
Take a data-driven approach to quantifying system performance and reliability.

LiveKit provides the network infrastructure for multimodal AI interfaces, enabling seamless audio and visual interactions. Founded in 2021, LiveKit supports over 3 billion calls annually and serves 100,000+ developers and industry giants like OpenAI, Spotify, and Meta.

DevOps Engineer (GCP)

InspiredXpert 19 days ago

South Africa

Ensure reliability, uptime, and performance across GCP environments.
Implement SRE and DevOps best practices with a strong focus on automation and scalability.
Build and optimize CI/CD pipelines using GCP-native tools.

InspiredXpert is a specialist IT Talent Solutions company providing high-quality contract or permanent talent across software development, cloud, AI, cybersecurity, and data-driven roles. They connect skilled professionals with innovative companies, offering exciting opportunities to work on impactful projects across the globe.

Site Reliability Engineer II

Backblaze 4 days ago

LATAM

Support the availability and durability of critical services across production environments.
Develop automation for common operational tasks, reducing manual intervention and toil.
Partner with engineering, product, and operations teams to support resilient system design and operations.

Backblaze is the object storage leader in the open cloud movement, fueling customer success with cloud storage built purposefully to unlock budgets and unleash innovators. Founded in 2007, they scaled the business with less than $3 million in outside funding until 2021, and generate over $100 million in revenue managing over three billion gigabytes of data storage for 500K+ customers in 175+ countries.

Senior Site Reliability Engineer

SSV Labs 4 days ago

Global

Design and implement infrastructure and tools that empower our product teams to rapidly and securely iterate, emphasizing reliability and automation.
Influence the strategic direction of our infrastructure and operational practices, ensuring that we are well-positioned to scale and support our growing organization.
Take a proactive role in the resolution of production issues, ensuring that we are well-prepared to handle incidents and that we learn from them in a blameless manner.

SSV Labs is the core team behind the SSV Network, which is pioneering decentralized infrastructure for Ethereum staking. They are building tools, protocols, and standards to make staking more secure, scalable, and trustless.

Site Reliability Engineer

Planet 26 days ago

US Canada 16w maternity

Build and deploy computing services and infrastructure in customer environments.
Clarify and surface requirements from ambiguous use cases defined by cross-functional stakeholders.
Improve reliability and scalability by resolving edge cases, studying failure modes, and writing tests.

Planet designs, builds, and operates the largest constellation of imaging satellites in history. They deliver an unprecedented dataset of empirical information via a revolutionary cloud-based platform to authoritative figures in commercial, environmental, and humanitarian sectors. Planet takes a people-centric approach to culture and community, and strives to iterate in a way that puts its team members first and prepares the company for growth.

Staff Software Engineer

Rula 28 days ago

US

Collaborate with application engineering teams on platform infrastructure.
Enhance observability and spearhead the adoption of SRE best practices.
Build and maintain reliable CI/CD pipelines, tooling, and infrastructure.

Rula strives to provide quality, evidence-based, compassionate mental healthcare and aims to create a world where mental health is no longer stigmatized. They are a remote-first company operating in most U.S. states, and are dedicated to having a culture of inclusion that supports their employees.

Site Reliability Engineer, Production Reliability

Yelp 8 days ago

$135,000–$185,000/yr

Canada

Working with engineers across Yelp to support new features and services.
Integrating tools to monitor platform stability and performance.
Helping scale our Kubernetes clusters and AWS-based infrastructure while maintaining our platform's SLOs.

Yelp's engineering culture values individual authenticity and encourages creative solutions. They focus on helping users, growing as engineers, and having fun in a collaborative environment.

Senior Software Engineer (Golang, Kubernetes) - Cloud Compute Team

Canva 29 days ago

Designing, building, and operating Kubernetes infrastructure across multiple cloud providers.
Building and maintaining automation for cluster lifecycle management, node provisioning, and provider onboarding.
Developing platform tooling and abstractions that enable other Canva engineers to deploy and scale workloads.

Canva is a design platform redefining how the world experiences design. They have campuses in Sydney and Melbourne, along with co-working spaces in Brisbane, Perth and Adelaide, offering a flexible and inclusive work environment.

Site Reliability Engineer (f/m/n)

InPost Group 26 days ago

Europe

Write code, automate everything, design for reliability, and deeply understand the systems.
Build or extend Terraform modules and contribute to Platform Engineering around Observability.
Collaborate with developers to shape feature design so that reliability is built in, not added later.

InPost Group is an innovative European out-of-home delivery company, revolutionizing the way parcels are delivered to customers. With over 10,000 employees worldwide, InPost Group is one of the largest out-of-home delivery providers in Europe, committed to providing sustainable and efficient delivery solutions.

Senior DevOps Engineer

Homebot 3 days ago

$145,000–$170,000/yr

US Unlimited PTO 12w maternity 12w paternity

Learn platform infrastructure, developer tooling, and deployment patterns.
Own and drive at least one architecture decision that improves platform reliability.
Ship infrastructure improvements that measurably improve developer experience or platform stability.

Homebot is a homeownership platform for lenders and real estate, title & insurance agents that drives client retention and partner referrals. They maintain a clear focus on culture, engagement, and creating an environment where people are valued and can thrive.

Senior Site Reliability Engineer

Kraken 8 days ago

Americas

Manage and support infrastructure for Growth teams, including Nomad, Hashistack, databases, and any other underlying systems
Maintain and troubleshoot GitLab CI pipelines, ensuring reliable and fast build, test, and deployment cycles
Provide operational support across Onboarding, Acquire, and Engage teams, helping debug issues in staging and production environments

Kraken is a mission-focused company rooted in crypto values, aiming to accelerate the global adoption of crypto, so that everyone can achieve financial freedom and inclusion. As a fully remote company, they have Krakenites in 70+ countries who speak over 50 languages.

Site Reliability Engineer

Weedmaps 12 days ago

$133,110–$148,042/yr

US

Collaborate with stakeholders to drive best practices for monitoring and CI/CD pipelines
Troubleshoot deployment issues in our CI pipeline
Identify areas for automation and embrace the codification of all things

Weedmaps is a global leader in the cannabis industry. They are dedicated to transparency, education, and community, serving cannabis to consumers and businesses in the U.S. and worldwide.

Head of Infrastructure & Reliability

Finom 27 days ago

Europe

Lead the Infrastructure Engineering team, taking full ownership of cloud infrastructure, Kubernetes platforms, DevOps tooling, and CI/CD pipelines.
Drive reliability, scalability, and security across the production environment while maintaining a sharp focus on developer velocity and business impact.
Mentor and guide engineers across SRE, DevOps, and Database Reliability functions, fostering a culture of operational excellence and pragmatic problem-solving.

Finom is a European tech startup headquartered in Amsterdam, revolutionizing financial services for entrepreneurs with an all-in-one B2B platform. They have raised $346 million, are expanding across key EU markets, and foster innovation, prioritizing research and solutions that benefit users, employees, partners, and the business.

Senior Platform Engineer

Propel 26 days ago

$170,000–$240,000/yr

US 4w PTO

Own our fundamental cloud services and tooling.
Own our application platform.
Own our developer experience.

Propel builds technology that strengthens the social safety net. They are a passionate team of ~100 Propellers who envision a future where every American has the tools and resources they need to thrive, offering a remote-first working environment with headquarters in Brooklyn.

Site Reliability Engineer

Ditto 4 days ago

Unlimited PTO

Develop and maintain observability solutions using platforms like Datadog, Prometheus and Grafana
Take a leading role in incident management, including coordinating response efforts, troubleshooting issues, and identifying follow-up actions
Partner with product engineering teams to architect reliable systems, recover from incidents, and learn from mistakes

Ditto is redefining how data moves at the edge, aiming to make resilient, real-time applications seamless for developers, regardless of network conditions. It's a globally distributed and fast-growing startup with over $145 million in funding that is committed to building a diverse and inclusive team.
