Job Description
We're looking for a Software Engineer II to join our SiteOps team, which is focused on scaling DevOps, site reliability, security engineering, and FinOps best practices across the engineering team. Reliability is the most important feature of Hudl.com. If our site isn't available and performant, then nothing else matters. In this role, you will be part of the Platform Engineering team and will work closely with the wider product team to build adoption of new observability technologies, processes, and system architecture.
In this role, you will:
- Define, document, and drive adoption for the processes and tools used to improve production alerting and incident response sitewide at Hudl.
- Understand, evangelize, and help implement the best strategies and tools for faster discovery and resolution of production incidents.
- Help Hudl define and measure reliability metrics, such as MTTD, MTTR and availability. You'll help teams become more accountable for individual microservice metrics.
- Collaborate and embed with teams to eliminate architectural weaknesses or anti-patterns across our systems.
- Take on-call shifts a few times a year.
About Hudl
Hudl builds great teams and hires the best of the best to ensure you're working with people you can constantly learn from.