Sr. Staff Production Engineer at Zscaler

Design and implement highly available, scalable infrastructure across AWS, Azure, GCP, and bare-metal environments
Drive an "automation-first" culture by writing code (Python/Go) to eliminate manual toil and build self-healing systems
Implement and maintain sophisticated observability (Prometheus, Grafana, OpenTelemetry), define SLIs/SLOs, and establish error budgets

Minimum Qualifications:

8+ years of experience managing reliability, scalability, and availability for large-scale production services
Deep expertise in programming (e.g., Python, Go, or C/C++)
Strong background in networking protocols, Linux/FreeBSD systems, and distributed architecture

Preferred Qualifications:

Extensive experience with public cloud (AWS, Azure, GCP) and Infrastructure-as-Code (Ansible, Terraform)
Experience with chaos engineering and disaster recovery planning at scale
Expertise in global routing (BGP) and traffic tunneling (GRE, IPsec), with a deep understanding of L7 proxy architectures (HAProxy), DNS at scale, and OS networking stack internals

Zscaler

Zscaler accelerates digital transformation so that its customers can be more agile, efficient, resilient, and secure. It is an AI-forward enterprise that constantly pushes the envelope, leveraging the world’s largest security data lake.
