Jobs Similar to Senior Software Engineer- Site Reliability Engineering (SRE) | TangerineFeed

Senior Software Engineer- Site Reliability Engineering (SRE)

Noctua Technology, LLC 18 hours ago

US

Drive the definition and adoption of SLIs and SLOs across services, reducing toil through automation and incident response.
Design and architect Infrastructure as Code solutions for large-scale environments using Docker, Kubernetes, and cloud-native services.
Serve as primary SRE liaison for development teams, influencing architecture and conducting training for clients.

Python, Bash, Go, Terraform, Kubernetes

20 jobs similar to Senior Software Engineer- Site Reliability Engineering (SRE)

Jobs ranked by similarity.

Senior Site Reliability Engineer

CertifyOS 11 days ago

US Unlimited PTO

Design and build cloud-native infrastructure for reliability, observability, and automation across GCP, GKE, and Cloud Run.
Own incident response, root cause analysis, escalation workflows, and runbooks to prevent hard problems from recurring.
Develop Infrastructure as Code, CI/CD pipelines, and operational tooling to improve developer velocity and platform efficiency.

CertifyOS is building the data infrastructure that powers modern healthcare, automating provider licensing, enrollment, credentialing, and network monitoring through an API-first platform. The company is backed by leading investors with a team of deep experience in provider data systems, valuing authenticity, accountability, collaboration, results, and openness to feedback.

Senior Site Reliability Engineer (Remote Build)

Remote 13 days ago

Global Unlimited PTO 16w maternity 16w paternity

Own the operational excellence and infrastructure strategy for Remote Build's platform, ensuring reliability, performance, and security.
Lead incident response, build observability systems, and drive continuous improvement in system reliability.
Embed security into infrastructure, optimize costs, and automate operational toil to scale efficiently.

Remote solves modern organizations' biggest challenge of navigating global employment compliantly. With a fully distributed team across 6 continents, the company fosters a future-focused culture with core values of innovation and async work.

Senior Site Reliability Engineer II - Infrastructure (AI Native)

Life360 13 days ago

Canada

Build and maintain infrastructure platforms for over 200 backend services running on Kubernetes clusters with 40,000+ cores.
Lead and mentor other engineers, own complex infrastructure failures, and participate in a shared on-call rotation.
Drive cloud cost efficiency, estimate schedules, and use AI tools as a first-class collaborator in daily workflows.

Life360's mission is to keep people close to the ones they love through location sharing, safe driver reports, and crash detection. The company serves approximately 97.8 million monthly active users across more than 180 countries and has more than 500 remote-first employees.

Site Reliability Engineer (E3)

Vynca 12 days ago

US

Design, provision, and manage AWS infrastructure using Terraform and Kubernetes.
Build, operate, and improve observability, monitoring, and incident response processes.
Collaborate with engineering teams on capacity planning, performance optimization, and resilient system design.

Vynca provides comprehensive care for individuals with complex needs, focusing on quality days at home. The company is a close-knit community guided by core values of Excellence, Compassion, Curiosity, and Integrity.

Site Reliability Engineer (SRE)

Synthesia 9 days ago

US

Take ownership of incident management and operational excellence across cloud infrastructure.
Automate high-risk manual processes and drive reliability gains through engineering.
Own a platform domain such as Temporal, observability, or Kubernetes operations.

Synthesia is the world’s leading AI video platform for business, used by over 90% of the Fortune 100. Founded in 2017, the company is headquartered in London with offices across Europe and the US, and has over $530 million in funding from premier investors like Accel and Nvidia's VC arm.

Sr. Production Engineer

Zscaler 12 days ago

US

Implement highly available, scalable infrastructure across AWS, GCP, and bare-metal environments.
Drive an "automation-first" culture by writing code in Python/Go to build self-healing systems.
Act as lead Incident Commander, develop response playbooks, and conduct post-incident analyses.

Zscaler accelerates digital transformation to secure customers with a cloud-native Zero Trust Exchange platform. The company processes over 200 billion transactions daily and fosters a culture of execution, collaboration, and accountability.

Sr. Site Reliability Engineer

Versant 10 days ago

US

Lead design and operation of internal developer platforms and self-service infrastructure.
Build and optimize CI/CD pipelines, deployment workflows, and automation across GitHub Actions, Jenkins, ArgoCD.
Apply SRE principles to improve developer-facing systems and software delivery performance.

Versant is a media company owning iconic brands in news, sports, and entertainment, including USA Network, Fandango, and Rotten Tomatoes. It is an independent, publicly traded company with a collaborative, inclusive culture and a remote-first work environment.

Senior Infrastructure Software Engineer

Mechanical Orchard 2 days ago

US

Work as part of a small, cross-functional XP team installing Imogen into client cloud environments, partnering with client infosec, infrastructure, and IT teams.
Pair program with other engineers and collaborate closely with product managers and designers.
Lead technical discovery efforts for existing customer systems and adapt Imogen to their public cloud estate.

Mechanical Orchard specializes in safely rewriting critical business applications using a unique method that eliminates modernization risks. The company is known for its expertise in Agile practices and has a small, cross-functional team culture focused on collective ownership and continuous improvement.

Senior Site Reliability Engineer (SRE)

Oowlish 5 days ago

Latin America

Design, implement, and improve Site Reliability Engineering practices across production environments with a focus on SLOs, SLIs, and error budgets.
Lead incident response processes and build observability strategies including monitoring, logging, alerting, and distributed tracing.
Partner with engineering teams to enhance system reliability, availability, scalability, and operational efficiency.

Oowlish is a rapidly expanding software development company in Latin America that collaborates with premier clients from the United States and Europe to create pioneering digital solutions. Certified as a Great Place to Work, it offers a nurturing environment with opportunities for professional growth and international impact.

Senior Site Reliability Engineer

MZLA Technologies Corporation 20 days ago

US 5w PTO

Design and develop CI/CD systems for websites, services, and release workflows, and operate an EKS-based Kubernetes platform.
Diagnose and debug production incidents, drive root-cause analysis, and implement improvements to enhance system reliability.
Write and maintain infrastructure as code using Pulumi or Terraform/OpenTofu across multiple AWS accounts with security-conscious practices.

Thunderbird is one of the world’s most trusted open-source email applications, empowering more than 20 million people globally. Our small but growing distributed team includes 65+ people across seven countries, and we build privacy-respecting communication tools with a collaborative, inclusive, and user-first spirit.

Sr. Site Reliability Engineer

Filevine 4 days ago

United States

Own and evolve observability strategy including monitoring, alerting, dashboards, logging, and distributed tracing.
Define and manage SLIs, SLOs, and reliability metrics, improving MTTD and MTTR through automation.
Build and maintain reliable cloud infrastructure on AWS and Kubernetes while mentoring engineers on SRE best practices.

Filevine is a Legal AI company delivering Legal Operating Intelligence for legal work. Fueled by a team of exceptional collaborators and innovators, Filevine’s rapid growth has earned AI awards and recognition from Deloitte and Inc. as one of the most innovative and fastest-growing technology companies in the country.

Senior Site Reliability Engineer (Remote Build)

Jobgether 6 days ago

Germany Unlimited PTO

Design and maintain scalable infrastructure-as-code solutions using Terraform and Kubernetes.
Build and operate observability systems while leading incident response and reliability improvements.
Embed security and compliance practices into infrastructure and optimize system performance and cloud costs.

This partner company builds a next-generation platform enabling AI-driven services across global employment infrastructure. It is a highly distributed, async-first organization where engineers thrive in ownership and autonomy.

Senior Infrastructure DevOps Engineer

Innocraft 11 days ago

Germany 6w PTO

Architect and scale the cloud platform behind a mission-critical SaaS product used globally.
Lead Infrastructure as Code maturity and drive automation, reliability, and cost optimisation.
Own uptime, SLAs, and incident management practices while mentoring engineers.

Innocraft (trading as Matomo) provides an open-source analytics platform trusted by enterprises and governments for full data ownership. The company values diversity and inclusion, and operates with a stable, mature product and strong engineering team.

Site Reliability Engineer (SRE)

Supabase 9 days ago

Global

Collaborate with service teams to define SLIs and SLOs based on customer experience and build error budget policies that influence engineering decisions.
Own the Operational Readiness Review process, conducting reviews for new services and major changes across observability, alerting, runbooks, capacity, and graceful degradation.
Act as a reliability expert for architecture reviews, failure mode analysis, dependency mapping, and resilience design.

Supabase provides the Postgres development platform with a complete backend solution including Database, Auth, Storage, Edge Functions, Realtime, and Vector Search. With 280+ team members across 55+ countries, they are an open-source-first company that values async work and has raised $500M.

Staff Engineer, Site Reliability

Babylist 17 days ago

US Canada

Own and evolve AWS infrastructure using Terraform, managing EKS clusters, databases, and core services.
Maintain CI/CD reliability and developer tooling across the full engineering org.
Lead incident response, drive post-incident reviews, and improve monitoring and alerting standards.

Babylist is the leading platform for expecting and new families, helping parents feel confident, connected, and cared for at every step. As a modern, AI-forward tech company with over 10 million yearly shoppers, Babylist has expanded into a full ecosystem and generated $750M in revenue in 2025, reshaping the $235B kids and baby market.

Lead DevOps Engineer

NICE 21 days ago

UK

Design, build, and maintain CI/CD pipelines and Infrastructure as Code using tools like CloudFormation, Ansible, and Terraform.
Monitor and respond to infrastructure and application health, troubleshoot operational issues, and provide on-call support.
Maintain operational documentation, communicate proactively with teams, and ensure service delivery meets client expectations.

NICE Ltd. provides software used by 25,000+ global businesses, including 85 of the Fortune 100, to deliver customer experiences, fight financial crime, and ensure public safety. With over 8,500 employees across 30+ countries, NICE is recognized as a market leader in AI, cloud, and digital innovation.

Sr. Cloud Platform Engineer

Applied 11 days ago

North America

Design, build, and maintain cloud infrastructure across Azure, GCP, and AWS, including landing zones, Kubernetes, and CI/CD pipelines.
Implement monitoring, security, and hybrid connectivity for enterprise-scale cloud environments.
Collaborate cross-functionally, mentor engineers, and leverage AI tools to accelerate infrastructure development.

Applied is an Insurtech company that builds technology solutions for insurance professionals. With over 40 years of experience, they foster a culture of trust, inclusion, and growth.

Sr. Software Engineer/SRE - Remote UK

ItD 5 days ago

UK

Lead the design, development and operation of large-scale, secure observability systems to keep services online and performant.
Deploy and scale Prometheus, ElasticSearch clusters, and high-throughput Kafka data pipelines for millions of customer devices.
Collaborate with the Observability team to build alerting systems, APIs, and self-service monitoring tools using Terraform and multiple languages.

ItD is a new generation consulting and software development company that blends diversity, innovation, and integrity with real business results. It is a woman- and minority-led firm with a global community, empowering employees and offering benefits like medical, dental, vision, 401(k), and career development.

Infrastructure Engineer

Gauntlet 2 days ago

US Canada

Build and maintain cloud infrastructure across GCP, Kubernetes, and Terraform.
Own CI/CD pipelines and deploy fully automated, locked-down systems.
Strengthen security, access control, and observability for a growing platform.

Gauntlet builds the financial systems of the future, operating across the entire stack to offer best-in-class vault products. The team serves over $1.5B in client TVL and brings together traditional finance and crypto-native expertise.

Head of Site Reliability Engineering

Titan 7 days ago

US

Build the SRE practice from scratch: define SLO frameworks, on-call rotation, and incident command for live bank customers.
Define severity tiers, SLA commitments, and escalation paths for production support, acting as the technical owner during incidents.
Set engineering operations across sprint discipline, release rituals, code review standards, and compliance artifacts for bank examiners.

Titan builds AI software for banks, specializing in purpose-built small language models and AI bankers that financial institutions trust. The company is a backed fintech startup scaling from a handful to hundreds of customers, with a hands-on, build-first culture under strict compliance standards.
