Job Description

The Infrastructure Engineer will join the Infrastructure Engineering team. The engineers design and build automation, tooling, and systems that bridge the gap between physical infrastructure and the platforms that power large-scale AI/ML and HPC workloads. This role combines the breadth of a core infrastructure engineer with a specialty in high-performance networking and GPU communication. You’ll help ensure our InfiniBand fabric and NCCL stack are tuned, reliable, and efficient at scale, supporting some of the world’s largest GPU clusters.

Responsibilities include designing, building, and maintaining automation, APIs, and frameworks to manage physical infrastructure at scale, developing and extending systems for server lifecycle management, and implementing and tuning InfiniBand networking and NCCL configurations for multi-GPU communication. You will also collaborate with Network, Platform, and Infrastructure Operations teams to support new infrastructure rollouts, diagnose and improve performance across GPU, NVSwitch, PCIe, and InfiniBand layers, and write clear design documents and technical documentation to capture best practices.

About Voltage Park

Voltage Park is an equal opportunity employer and makes employment decisions on the basis of merit.
