Job Description
The Astronomer Customer Reliability Engineering (CRE) team is responsible for the success of our customers' usage of our managed Airflow service. CREs operate, monitor, and maintain the platform so that production environments are available, predictable, and reliable for our customers. As an infrastructure specialist, you will build expertise in the reliability of Kubernetes and the underlying cloud infrastructure on all three public clouds (AWS, Azure, and GCP). You will create strong relationships with customers and help them achieve their reliability goals.
Responsibilities include learning and building expertise across several software engineering disciplines (Kubernetes, cloud engineering, and cloud networking); gaining broad exposure to product, engineering, and customer relationship management; and spending time on side projects that contribute to Astronomer’s overall success, such as contributing to the open-source Airflow repository or developing Astronomer’s internal monitoring and alerting systems built on Airflow.
You will work directly with our customers’ data engineers, system admins, DevOps teams, and management, and the feedback you provide from that experience can shape the direction of Astronomer’s products. You will own the customer experience by working directly with customers to prioritize and solve issues and meet SLAs. You will participate remotely within a fully distributed team, help maintain 24x7 coverage, and take part in a paid on-call rotation for weekend coverage.
About Astronomer
Astronomer empowers data teams to bring mission-critical software, analytics, and AI to life and is the company behind Astro, the industry-leading unified DataOps platform.