Job Description
Pythian is building a next-generation Site Reliability Engineering team and is seeking talented, motivated engineers who thrive in fast-paced, problem-solving environments. As an SRE, you'll design, deploy, and operate large-scale distributed systems across compute, storage, networking, and AI/ML environments. You'll lead projects from architecture to automation to intelligent monitoring, collaborating with both clients and teammates to build resilient, high-performing infrastructure. You'll operate and optimize Kubernetes clusters, Istio service mesh, and Linux-based systems, automate workflows using Go, Python, and Shell scripting, and build monitoring and observability solutions with Prometheus, Grafana, and Loki.
About Pythian
Pythian is an expert in strategic database and analytics services, driving digital transformation and operational excellence.