Staff Software Engineer, ML Platform

Cake

Remote regions

US

Benefits

Unlimited PTO

Build Enterprise-Scale Infrastructure:

  • Leverage infrastructure-as-code to manage complex cloud environments supporting critical ML and AI initiatives.
  • Design Kubernetes-native systems, including controllers/operators where appropriate.
  • Improve platform networking, security, and observability.

Sustain Platform Health and Performance:

  • Own critical systems in production, including reliability, scalability, security, and cost efficiency.
  • Identify and proactively address technical debt, operational risk, and platform bottlenecks.
  • Learn by doing: ramp up quickly on a complex tech stack (Terraform, Kubernetes, Istio, Crossplane, Go, TypeScript).

Enable Teams and Customers to Move Faster:

  • Create abstractions and tooling that make it easier for teams and customers to deploy, run, and scale AI/ML workloads.
  • Collaborate directly with customers to understand their ML infrastructure challenges and translate them into platform improvements.
  • Balance speed and rigor—shipping quickly while maintaining a high bar for quality and safety.

Cake

Cake is on a mission to make cutting-edge AI accessible to enterprise teams. Backed by top investors, Cake is seeing strong adoption and is positioned for rapid growth in the next 12 months. The company emphasizes ownership, clear communication, and collaboration.
