Role Overview:

  • This role sits at the core of frontier AI data operations, owning how high-quality evaluation datasets and benchmarks are designed, validated, and delivered.
  • You will translate ambiguous evaluation needs into structured, high-signal data proposals and production-ready sample packages.

Key Accountabilities:

  • Own the design, development, and delivery of AI evaluation data initiatives.
  • Develop data proposals and sample packages based on lab requests and benchmarks.
  • Define and enforce rigorous quality control frameworks.

Requirements:

  • 5+ years in technical program management, data operations, quality engineering, or ML evaluation.
  • Proven experience with AI labs or enterprise ML teams.
  • Strong understanding of LLM evaluation concepts.

Jobgether

Jobgether uses AI-powered matching to connect candidates with roles quickly and fairly. It is a remote-first company that shares the best-matched candidates with hiring partners.
