Senior DevOps / Platform Reliability Engineer

Zingtree

Remote regions: Global

Benefits: Unlimited PTO

Role Responsibilities:

  • Build and maintain CI/CD pipelines, infrastructure automation, and observability tools to support multi-agent systems.
  • Collaborate with development, operations, and infrastructure teams to streamline processes and troubleshoot issues.
  • Strengthen the security and compliance posture required for SOC 2 and HIPAA through cloud and DevOps practices.

Agentic AI Integration:

  • Design and operate auto-remediation agents for production toil with human-in-the-loop controls.
  • Use LLMs for incident triage and root cause analysis, and integrate AI agents through the Model Context Protocol (MCP).
  • Establish operational guardrails and best practices for AI coding assistants in infrastructure repositories.

About You:

  • 5+ years of experience in DevOps, SRE, or Platform Engineering operating production systems on AWS.
  • Strong experience with CI/CD pipelines, production EKS environments, AWS networking, and Terraform.
  • Deep experience with Aurora/RDS MySQL, Redis, and S3, and with Prometheus, Grafana, and OpenTelemetry for observability.

About Zingtree:

Zingtree is a next-generation intelligent process automation platform reimagining customer experience operations for enterprise support leaders. The team is small, with high ownership and an emphasis on automation, collaboration, and transparency.
