Source Job

US Unlimited PTO

  • Own production reliability for customer-facing platforms and weather data services across Azure, colocation, and edge Kubernetes environments.
  • Drive multi-replica and multi-cluster high availability for .NET services, including hands-on C# refactoring for safe horizontal scaling.
  • Participate in a 24/7 on-call rotation, leading incident response, postmortems, and operational excellence initiatives.

C# .NET Kubernetes Azure Terraform

20 jobs similar to Senior Site Reliability Engineer (C#, .NET)

Jobs ranked by similarity.

  • Co-own the architecture of cloud infrastructure on Azure and Kubernetes clusters for high throughput and availability.
  • Drive resilience strategy for global scaling, zero-downtime deployments, and disaster recovery.
  • Evolve observability stack with LGTM (Loki, Grafana, Tempo, Mimir) and lead incident response.

Flip is an AI-powered employee experience platform for frontline workers in retail, manufacturing, and logistics. The company is a young, rapidly growing tech company with a remote-first culture and offices in Berlin and Stuttgart.

US

  • Ensure reliability, scalability, and performance of hosted healthcare platforms.
  • Lead incident response, root cause analysis, and implement proactive monitoring.
  • Automate operational tasks using scripting and Infrastructure-as-Code.

Altera Digital Health empowers healthcare providers to deliver superior care through innovative technology. The company is part of Constellation Software Inc., Canada's largest software company, offering a supportive and award-winning culture with opportunities for growth.

United States

  • Build, architect, and scale cloud agnostic platform components.
  • Design and develop scalable .NET applications deployed on AWS using Docker and Kubernetes.
  • Implement CI/CD pipelines and ensure zero-downtime releases.

Conga lines up commercial operations so companies run as connected, smarter businesses. More than 10,000 customers worldwide, including over 50% of the Fortune 100, trust Conga.

Canada

  • Build and maintain infrastructure platforms for over 200 backend services running on Kubernetes clusters with 40,000+ cores.
  • Lead and mentor other engineers, own complex infrastructure failures, and participate in a shared on-call rotation.
  • Drive cloud cost efficiency, estimate schedules, and use AI tools as first-class collaborators in daily workflows.

Life360's mission is to keep people close to the ones they love through location sharing, safe driver reports, and crash detection. The company serves approximately 97.8 million monthly active users across more than 180 countries and has more than 500 remote-first employees.

US

  • Own cloud infrastructure on Azure, data pipeline orchestration, CI/CD, and observability to ensure production-grade reliability.
  • Build and maintain foundational infrastructure that enables fast engineering velocity without breaking things.
  • Apply SRE principles such as SLOs, capacity planning, incident response, and eliminating toil through automation.

Terzo's platform processes enterprise-scale document corpora, powers real-time AI agents, and serves the Financial Intelligence Graph to Fortune 500 customers. As a small, senior team with strong ownership and minimal bureaucracy, we foster a culture of collaboration, mentorship, and continuous improvement.

Europe

  • Design and operate our Kubernetes ecosystem with a focus on high availability and zero-downtime operations.
  • Own and evolve our PaaS strategy, using GitOps and CI/CD to empower domain teams to deploy independently.
  • Define and implement our observability strategy across metrics, logs, and tracing.

Finom is a European tech startup headquartered in Amsterdam, revolutionizing financial services for entrepreneurs. They offer an all-in-one financial B2B solution integrating banking, accounting, financial management, and invoicing into a mobile-first platform, with about 346 million in funding.

Latin America

  • Monitor critical production systems using advanced dashboards and proactive alerting.
  • Act as the primary technical responder for live production incidents and Slack escalations.
  • Collaborate deeply with core DevOps and software engineering teams to elevate platform reliability.

Inallmedia.com is a global technology and design firm focused on building impactful digital solutions through remote, distributed teams across LATAM. They partner with international clients across industries, providing long-term technical expertise, product innovation, and team augmentation.

Global

  • Work with a modern tech stack including .NET, React, and cloud-native environments.
  • Take ownership of end-to-end feature delivery from analysis through to production support.
  • Collaborate directly with customer engineering teams to propose effective technical solutions.

Sigma Software has delivered high-quality software solutions for over 20 years. They collaborate with a well-known German automotive manufacturer to develop impactful solutions for preventing production faults.

United States

  • Own and evolve observability strategy including monitoring, alerting, dashboards, logging, and distributed tracing.
  • Define and manage SLIs, SLOs, and reliability metrics, improving MTTD and MTTR through automation.
  • Build and maintain reliable cloud infrastructure on AWS and Kubernetes while mentoring engineers on SRE best practices.

Filevine is a Legal AI company delivering Legal Operating Intelligence for legal work. Fueled by a team of exceptional collaborators and innovators, Filevine’s rapid growth has earned AI awards and recognition from Deloitte and Inc. as one of the most innovative and fastest-growing technology companies in the country.

US

  • Take ownership of incident management and operational excellence across cloud infrastructure.
  • Automate high-risk manual processes and drive reliability gains through engineering.
  • Own a platform domain such as Temporal, observability, or Kubernetes operations.

Synthesia is the world’s leading AI video platform for business, used by over 90% of the Fortune 100. Founded in 2017, the company is headquartered in London with offices across Europe and the US, and has over $530 million in funding from premier investors like Accel and Nvidia's VC arm.

Americas 7w PTO

  • Act as a first responder for system incidents and outages, ensuring high availability and performance.
  • Own and evolve monitoring, alerting, and log management systems while optimizing database infrastructure.
  • Collaborate with engineering teams to build scalable, resilient systems and contribute to SRE tooling and automation.

Circle is building the world's leading all-in-one platform for online communities. We're a fully remote company of around 200 team members from 30+ countries, with a culture that values autonomy, async collaboration, and high expectations.

US Canada

  • Own and evolve AWS infrastructure using Terraform, managing EKS clusters, databases, and core services.
  • Maintain CI/CD reliability and developer tooling across the full engineering org.
  • Lead incident response, drive post-incident reviews, and improve monitoring and alerting standards.

Babylist is the leading platform for expecting and new families, helping parents feel confident, connected, and cared for at every step. As a modern, AI-forward tech company with over 10 million yearly shoppers, Babylist has expanded into a full ecosystem and generated $750M in revenue in 2025, reshaping the $235B kids and baby market.

US Unlimited PTO

  • Design and build cloud-native infrastructure for reliability, observability, and automation across GCP, GKE, and Cloud Run.
  • Own incident response, root cause analysis, escalation workflows, and runbooks to prevent hard problems from recurring.
  • Develop Infrastructure as Code, CI/CD pipelines, and operational tooling to improve developer velocity and platform efficiency.

CertifyOS is building the data infrastructure that powers modern healthcare, automating provider licensing, enrollment, credentialing, and network monitoring through an API-first platform. The company is backed by leading investors with a team of deep experience in provider data systems, valuing authenticity, accountability, collaboration, results, and openness to feedback.

North America

  • Design, build, and maintain cloud infrastructure across Azure, GCP, and AWS, including landing zones, Kubernetes, and CI/CD pipelines.
  • Implement monitoring, security, and hybrid connectivity for enterprise-scale cloud environments.
  • Collaborate cross-functionally, mentor engineers, and leverage AI tools to accelerate infrastructure development.

Applied is an Insurtech company that builds technology solutions for insurance professionals. With over 40 years of experience, they foster a culture of trust, inclusion, and growth.

Global Unlimited PTO 16w maternity 16w paternity

  • Own the operational excellence and infrastructure strategy for Remote Build's platform, ensuring reliability, performance, and security.
  • Lead incident response, build observability systems, and drive continuous improvement in system reliability.
  • Embed security into infrastructure, optimize costs, and automate operational toil to scale efficiently.

Remote solves modern organizations' biggest challenge of navigating global employment compliantly. With a fully distributed team across 6 continents, the company fosters a future-focused culture with core values of innovation and async work.

Unlimited PTO 16w maternity 16w paternity

  • Own and operate customer-facing managed infrastructure across multiple AWS accounts and regions.
  • Serve as the senior technical escalation point for production incidents and complex configurations.
  • Contribute to OpenTelemetry distributions and maintain open source projects like Refinery.

Honeycomb provides observability for developer tools, helping companies like HelloFresh and Slack understand their software. They have over 200 employees and were named to Forbes' Best Startups in 2022 and 2023, with a culture that values inclusion and autonomy.

Global

  • Design and develop backend services and APIs using C# and .NET.
  • Implement and maintain microservices-based solutions ensuring performance and scalability.
  • Conduct code reviews, mentor engineers, and collaborate on CI/CD pipelines and deployments.

Nortal is a global digital transformation company that delivers complex solutions for enterprises and public sector organizations. With over 160 new hires per year, they foster a collaborative culture that values autonomy, open communication, and respect for diversity.

US

  • Implement highly available, scalable infrastructure across AWS, GCP, and bare-metal environments.
  • Drive an "automation-first" culture by writing code in Python/Go to build self-healing systems.
  • Act as lead Incident Commander, develop response playbooks, and conduct post-incident analyses.

Zscaler accelerates digital transformation to secure customers with a cloud-native Zero Trust Exchange platform. The company processes over 200 billion transactions daily and fosters a culture of execution, collaboration, and accountability.

Brazil

  • Design, develop, and maintain backend APIs and services using .NET Core / .NET 6+ with a focus on scalability.
  • Collaborate on system architecture, ensure code quality through testing and best practices, and implement observability solutions.
  • Work with cloud infrastructure in GCP, including Docker and Kubernetes, and participate in agile ceremonies and code reviews.

Jobgether uses AI-powered matching to connect candidates with hiring companies. It operates as a platform that processes applications and shares shortlists with employers.