We're looking for an experienced Senior DevOps Engineer to improve our observability and performance monitoring systems. You'll define the patterns and practices that shape the next evolution of our observability stack and lay the groundwork for future reliability initiatives. You'll help monitor critical services, establish standards, and ensure a stable, secure, and scalable observability platform for our engineering teams. This is a unique opportunity to apply your expertise across reliability engineering, cloud infrastructure, software development, and operations to make a meaningful impact across the organization.
You'll architect and manage our cloud observability platform, which aggregates logs, traces, and time-series metrics from infrastructure and applications. You'll develop and implement custom dashboards that track important metrics across our entire stack, including mobile applications, APIs, databases, and cloud platforms. You'll also establish and promote SRE best practices such as well-architected reviews, change management, and incident response.