Job Description

- Building and operating Kubernetes compute superclusters across multiple clouds.
- Partnering with cloud providers to optimize infrastructure costs, performance, and reliability for AI workloads.
- Working closely with research teams to understand their infrastructure needs and identify ways to improve the stability, performance, and efficiency of novel model training techniques.
- Designing and building resilient, scalable systems for training AI models, with a focus on intuitive user interfaces that let researchers troubleshoot and resolve problems on their own.
- Encouraging software best practices across the company and participating in team processes such as knowledge sharing, reviews, and on-call.

About Cohere

Our mission is to scale intelligence to serve humanity. We train and deploy frontier models for developers and enterprises building AI systems.
