Key Responsibilities:

  • Architect and maintain self-healing systems with 99.9%+ availability targets.
  • Use AI/ML to automate infrastructure governance and detect configuration or IaC anti-patterns.
  • Implement adaptive SLIs/SLOs that evolve automatically from real-time data.

Key Requirements:

  • 10+ years in software/systems engineering, including 5+ years in SRE or platform reliability.
  • Strong experience with GCP (preferred) or AWS, plus Kubernetes and Terraform.
  • Proficiency in Python or Go for automation and tooling.

What Success Looks Like:

  • 99.9%+ uptime sustained through predictive rather than reactive responses.
  • Faster MTTR via automated detection and auto-remediation.
  • Reliability insights used in leadership decisions.

Groupon

Groupon is a marketplace where customers discover new experiences and services every day and local businesses thrive. Even with thousands of employees spread across multiple continents, the company maintains a culture that inspires innovation, rewards risk-taking, and celebrates success.
