What to Expect:

  • 4 months of full-time research
  • Direct mentorship from Anthropic researchers
  • Access to a shared workspace (in either Berkeley, California or London, UK)

Mentors & Research Areas:

  • Mentors will lead projects in select AI safety research areas
  • Research areas: Scalable Oversight, Adversarial Robustness and AI Control, Model Organisms, Model Internals / Mechanistic Interpretability, and AI Welfare

Candidate Criteria:

  • Motivated by reducing catastrophic risks from advanced AI systems
  • Experience with empirical ML research projects
  • Experience working with large language models

Anthropic

Anthropic's mission is to create reliable, interpretable, and steerable AI systems. Their team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
