What to Expect:
- 4 months of full-time research
- Direct mentorship from Anthropic researchers
- Access to a shared workspace in either Berkeley, California, or London, UK
Mentors & Research Areas:
- Mentors will lead projects in select AI safety research areas:
- Scalable Oversight
- Adversarial Robustness and AI Control
- Model Organisms
- Model Internals / Mechanistic Interpretability
- AI Welfare
Unique Candidate Criteria:
- Motivated by reducing catastrophic risks from advanced AI systems
- Experience with empirical ML research projects
- Experience working with large language models
About Anthropic:
Anthropic's mission is to create reliable, interpretable, and steerable AI systems. Its team is a fast-growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.