Job Description

Design and deploy large-scale GPU clusters for customers training frontier AI models. You'll be the technical expert who partners with clients to architect, implement, and optimize GPU infrastructure ranging from hundreds to thousands of GPUs. This role requires deep expertise in SLURM/Kubernetes orchestration, InfiniBand networking, and NVIDIA GPU systems. You'll work hands-on with customers to ensure their infrastructure runs at peak performance, from initial design through production deployment. Ideal candidates have 3+ years of HPC/GPU experience and a track record of successful large-scale deployments.

Remote regions

Benefits

Job Description