Research Engineer, Search and Knowledge Post-Training

Anthropic

Remote regions

US

Salary range

$500,000–$850,000/yr

Role Focus:

  • Advance the science and engineering of making Claude a trustworthy searcher by defining hypotheses and designing experiments.
  • Turn search post-training from a craft into a measurable science with cleanly isolated variables and reproducible signal.
  • This work sits at the intersection of reinforcement learning, retrieval, and evaluation, shaping Claude's behavior in evidence-based settings.

Key Responsibilities:

  • Own research direction end-to-end, from hypothesis formation to experiment design and training runs.
  • Build controlled experiment infrastructure to study environmental factors and design evaluations that distinguish genuine reasoning.
  • Drive optimization rigor through efficient experiment design and ablations, and set the team's experimental standards.
  • Collaborate with researchers across post-training, RL infrastructure, and product to translate model behavior into training signals.

Qualifications:

  • Must have an unusually rigorous, quantitative mindset and be an outstanding software engineer in Python.
  • Must have repeatedly shipped real ML research, have a taste for worthwhile experiments, and operate well with high autonomy.
  • Preferred experience includes hands-on RL with LLMs, a background in search/retrieval/RAG, and time spent in research-heavy environments.
  • Prior published research on LLMs, RL, retrieval, or calibration is a plus, as is experience with distributed training systems.

Anthropic

Anthropic creates reliable, interpretable, and steerable AI systems, with the mission of ensuring that AI is safe and beneficial. The company is a quickly growing group of researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
