AI Team:
- Build Reddit-native foundational Large Language Models (LLMs).
- Sit at the intersection of applied research and massive-scale infrastructure.
- Train models that understand Reddit culture.
Responsibilities:
- Architect and implement high-throughput, deterministic data sampling systems.
- Formulate and validate statistical hypotheses regarding data mixtures.
- Design the 'Safety-First' ingestion layer for PII redaction and toxicity signals.
Qualifications:
- 8+ years of software engineering experience with a focus on machine learning infrastructure.
- Expert proficiency in Python and distributed data processing frameworks.
- Strong mathematical foundation in probability, statistics, and importance sampling theory.
Reddit is a community-driven platform where users submit, vote on, and comment on content that interests them. With over 100,000 active communities and 116 million daily active users, it fosters open conversations around shared interests.