Research Lead, Training Insights

Anthropic

Salary range

$850,000/yr

Responsibilities:

  • Build novel, long-horizon evaluations
  • Develop novel measurement approaches for understanding how model capabilities emerge and evolve during RL training
  • Lead the strategy for evaluation coverage across the company

You may be a good fit if you:

  • Have significant experience designing and running evaluations for large language models or similar complex ML systems
  • Have led technical projects or teams, either formally or through sustained ownership of critical research directions
  • Are equally comfortable designing experiments and writing code—you can move between research and implementation fluidly

Representative projects:

  • Designing and implementing a suite of long-horizon evaluations that test model capabilities on tasks requiring sustained reasoning, planning, and tool use over extended interactions
  • Building systems to track capability development across RL training checkpoints, surfacing insights about when and how specific capabilities emerge
  • Conducting a cross-org audit of evaluation coverage, identifying blind spots, and prioritizing new evaluations to fill critical gaps across Pretraining, RL, Inference, and Product

Anthropic

Anthropic's mission is to create reliable, interpretable, and steerable AI systems, ensuring AI is safe and beneficial for users and society. They are a growing group of researchers, engineers, policy experts, and business leaders committed to building beneficial AI systems.
