As a Senior Site Reliability Engineer, you will partner with development teams to manage infrastructure, improve CI/CD pipelines, and support operational excellence across Growth, helping ensure the reliability, scalability, and performance of the systems that power Kraken’s growth initiatives. You will bring your expertise in infrastructure, monitoring, and automation to keep Kraken’s services performant, resilient, and continuously improving.
Join the Development Tooling team to help maximize the efficiency and productivity of GitLab engineers. You'll play a key role in building and maintaining the foundational internal tools, frameworks, and infrastructure abstractions that empower all product developers to write, test, and ship reliable code faster. This role is ideal for a hands-on engineer who thrives on solving other engineers' pain points, enjoys working across different technology stacks, and has a passion for improving the software development lifecycle (SDLC).
As a Site Reliability Engineer at GitLab, you’ll keep user-facing services and production systems running smoothly by blending software engineering with infrastructure expertise. The ideal candidate is equally comfortable debugging Go applications and designing scalable Terraform automation across hundreds of environments. In the Environment Automation specialization, your focus is on operating and automating hundreds of GitLab environments.
Maintain, monitor, and improve AWS Kubernetes (k8s) and Nomad clusters supporting staking infrastructure. Maintain and improve the Infrastructure as Code (IaC) codebase using Terraform, Terragrunt, Ansible, and other tools. Maintain and improve CI/CD pipelines using GitHub Actions and ArgoCD for infrastructure and service deployments and configuration management.