We're looking for a Site Reliability Engineer to join Masabi and be at the forefront of ensuring our platform's reliability, performance, and security. In this role, you'll be pivotal to scaling and modernising the platform while maintaining uptime. You'll work across legacy and modern infrastructure, drive key improvements, and collaborate closely with architecture and product teams to enable reliable delivery across the business.
You will drive automation, refine processes, ensure security, implement failover strategies, and respond to incidents. You will also partner with developers, coach teams, and maintain detailed documentation. The platform is JVM-based and cloud-native, hosted on AWS, and uses tools such as GitLab, Terraform, CloudFormation, Puppet, Kibana, Grafana and Confluent Cloud.