As a Staff Engineer, you will be a technical leader and individual contributor within our production operations function. You will be responsible for designing, building, and maintaining highly reliable, scalable, and performant cloud infrastructure and systems. You will play a critical role in driving technical excellence, mentoring junior engineers, and solving our most complex scalability and reliability challenges.

Your day-to-day includes:

- Lead the design, implementation, and evolution of Greenlight's core cloud infrastructure and SRE practices.
- Act as a technical authority for complex SRE and cloud engineering challenges, providing expert guidance and solutions.
- Drive significant architectural improvements to enhance system reliability, resilience, and operational efficiency.
- Develop, maintain, and optimize our cloud infrastructure using Infrastructure as Code (primarily Terraform) and automation tools.
- Collaborate closely with development and security teams to embed SRE principles into the software development lifecycle, promoting secure and reliable coding practices.
- Design and implement robust monitoring, logging, and alerting solutions to provide comprehensive visibility into system health.
- Participate in and lead incident response, perform deep-dive root cause analysis, and drive actionable, blameless postmortems to prevent recurrence.
- Mentor and provide technical guidance to other SRE and Cloud Engineers, contributing to their growth and the team's overall technical capabilities.
- Research, evaluate, and advocate for new technologies and tools that can improve our operational posture and efficiency.
- Contribute to the strategic planning and roadmap development for the SRE and Cloud Engineering functions.
- Enhance existing services and applications to increase availability, reliability, and scalability in a microservices environment.
- Build and improve engineering tooling, processes, and standards to enable faster, more consistent, more reliable, and highly repeatable application delivery.