Job Description
Responsibilities include:
- Design and build infrastructure for researchers to rapidly iterate on reward signals.
- Develop automated systems to assess the quality of rewards and detect reward hacks.
- Build pipelines and workflows to reduce toil in reward development.
Required skills include:
- Strong Python skills and experience with ML workflows.
- Comfort working across the stack, from data pipelines to user-facing tooling.
- Ability to balance building robust systems with moving quickly in research.
The job also includes:
- Optimizing systems for performance, reliability, and ease of use.
- Contributing to best practices and documentation for reward development.
- Collaborating with researchers to translate science requirements into platform capabilities.
About Anthropic
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and society.