Staff Site Reliability Engineer

SmarterDx

Remote regions

US

Salary range

$230,000–$250,000/yr

Benefits

Unlimited PTO, 12 weeks of paternity leave

Responsibilities:

  • Define and evolve reliability standards for the SmarterDx platform, including SLIs, SLOs, and error budgets that align engineering work with customer impact.
  • Implement a reliability platform using Terraform and infrastructure-as-code best practices.
  • Enhance observability systems (metrics, logs, traces, alerting) to provide actionable insights and reduce mean time to detect (MTTD) and mean time to resolve (MTTR).

Qualifications:

  • 10+ years of software and software reliability engineering experience, with significant time spent operating and scaling distributed systems in production environments.
  • 3+ years of hands-on experience running cloud-native infrastructure in AWS, including deep familiarity with containers, Kubernetes, monitoring, and alerting in live production systems.
  • Strong expertise with Terraform and infrastructure-as-code practices for managing production infrastructure safely and reproducibly.

Tech Stack:

  • AWS
  • Terraform
  • Kubernetes

SmarterDx

SmarterDx, a Smarter Technologies company, builds clinical AI that is transforming how hospitals translate care into payment. Founded by physicians in 2020, their platform connects clinical context with revenue intelligence, helping health systems recover millions in missed revenue, improve quality scores, and appeal every denial.
