About the Role:
- We are looking for a Site Reliability Engineer Contractor to join our Foundation / SREIQ team and help keep IPSY fast, available, and resilient.
- You will own observability and alerting in Datadog, participate in on-call, and automate toil out of operations.
What You'll Be Doing:
- Build and maintain observability across our platform in Datadog: dashboards, monitors, APM, log pipelines, and meaningful alerting.
- Define and track SLIs, SLOs, and error budgets for specific services and drive reliability conversations.
- Participate in the on-call rotation and serve as an SRE Partner during incidents, driving incident response per our framework.
What We're Looking For:
- Hands-on experience with observability and monitoring tooling, ideally Datadog.
- Experience participating in on-call and incident response, including triage and post-incident reviews.
- Scripting and automation skills (e.g., Python, Bash) and familiarity with CI/CD and infrastructure-as-code (e.g., Terraform).
About IPSY:
IPSY is a beauty subscription platform that connects brands and consumers through curated beauty products. It is a remote-first company with a focus on community and engagement.