Senior Machine Learning Engineer, Ads Training Platform

Reddit

Remote regions

Europe

Job Description

The Ads Training Platform pod builds and maintains the distributed training and data processing infrastructure that powers Reddit's Ads machine learning models. We focus on enabling fast, reliable, and scalable model training across large datasets, supporting the Ads ML teams in improving ad targeting, conversion prediction, and advertiser value.

Design, build, and maintain large-scale distributed training infrastructure for Ads ML models. Develop tools and frameworks on top of the Ray platform. Build tools to debug, profile, and tune distributed training jobs for performance and reliability. Integrate with object storage systems and improve data access patterns. Collaborate with ML engineers to improve model training time, efficiency, and GPU training costs. Drive improvements in scheduling, state management, and fault tolerance within the training platform to enhance overall performance.

About Reddit

Reddit is a community built on shared interests, passion, and trust, and is home to open and authentic conversations.
