As a Senior SRE, you will play a pivotal role in ensuring the reliability, scalability, and performance of our services. You will lead efforts in building and maintaining a robust infrastructure, automating processes, and guiding the team to implement best practices in site reliability.
On a daily basis, your work includes on-call production support, customer and internal engineering tickets, and SRE backlog items. You will monitor and maintain systems to ensure high availability, automate processes to streamline operations and reduce manual intervention, and participate in designing and implementing system improvements that enhance reliability, scalability, and performance.
Technical skills include expertise in Linux/Unix systems and cloud platforms (AWS, Azure, or Google Cloud), strong proficiency in scripting and programming languages, and familiarity with AI/ML operations.