Remote Software engineering Jobs • Ray

3 results

Job listings

Design, build, and maintain large-scale distributed training infrastructure for Ads ML models. Develop tools and frameworks on top of the Ray platform. Build tools to debug, profile, and tune distributed training jobs for performance and reliability. Integrate with object storage systems and improve data access patterns. Collaborate with ML engineers to improve model training time, efficiency, and GPU training costs. Drive improvements in scheduling, state management, and fault tolerance.

$185,800–$260,100
USD/year
US Unlimited PTO

Design, build, and maintain large-scale distributed training infrastructure for Ads ML models. Develop tools and frameworks on top of the Ray platform. Build tools to debug, profile, and tune distributed training jobs for performance and reliability. Integrate with object storage systems and improve data access patterns.