Fellows Program — AI Safety

Anthropic 1 week ago

US UK Canada $3,850–$3,850/yr

What to Expect:

4 months of full-time research
Direct mentorship from Anthropic researchers
Access to a shared workspace (in either Berkeley, California or London, UK)

Mentors & Research Areas:

Mentors will lead projects in select AI safety research areas
Scalable Oversight, Adversarial Robustness and AI Control, Model Organisms
Model Internals / Mechanistic Interpretability, AI Welfare

Unique Candidate Criteria:

Motivated by reducing catastrophic risks from advanced AI systems
Experience with empirical ML research projects
Experience working with large language models

Anthropic

Anthropic's mission is to create reliable, interpretable, and steerable AI systems. Their team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

Apply for This Position