In This Role, You Will:
- Lead the AI Evaluation team, owning staffing, coaching, performance management, and delivery of evaluation and testing frameworks.
- Manage the AI evaluation lifecycle — including pre-launch testing, simulation, and post-deployment health monitoring — ensuring alignment with governance standards and expectations.
- Create domain-specific evaluation tracks (e.g., Compliance & Risk, Bot Experience, Agent Experience) to assess AI quality from multiple perspectives.
To Thrive in This Role, You Have:
- 7+ years in AI/ML operations, quality, or evaluation, with 2+ years of people leadership experience.
- Deep understanding of LLM behavior, prompt testing, and evaluation methodologies.
- Familiarity with human-in-the-loop frameworks and prompt testing tools.
Why This Role Matters:
- This role creates the execution layer between AI experimentation and operational reality — ensuring governance standards are consistently applied and AI systems are safe, fair, and high-performing in production.
- You’ll lead the teams that deliver the evaluation signals Operations relies on to trust every AI model deployed.
Chime
Chime is a financial technology company that believes everyone can achieve financial progress. They are a team of problem solvers, dreamers, and builders with one shared obsession: their members.