Contribute to the design, implementation, and maintenance of Greenlight's core cloud infrastructure and Site Reliability Engineering (SRE) practices to ensure high availability, scalability, and performance.
Develop, maintain, and optimize our cloud infrastructure using Infrastructure as Code (primarily Terraform) and other automation tools.
Collaborate closely with development and security teams to embed SRE principles into the software development lifecycle, promoting secure and reliable coding practices.
Design and implement robust monitoring, logging, and alerting solutions to provide comprehensive visibility into system health.
Actively participate in and support incident response, perform deep-dive root cause analysis, and contribute to actionable, blameless postmortems to prevent recurrence.
Identify and implement architectural improvements to enhance system reliability, resilience, and operational efficiency.
Automate operational tasks and processes to reduce toil and improve efficiency.
Research, evaluate, and advocate for new technologies and tools that can improve our operational posture and efficiency.
Enhance existing services and applications to increase availability, reliability, and scalability in a microservices environment.
Build and improve engineering tooling, processes, and standards to enable faster, more consistent, more reliable, and highly repeatable application delivery.