Similar Jobs
Sr Lead Machine Learning Engineer
Upwork
US
Python
SQL
AI
Data Engineer (LLM Data & Prompt Engineering)
Welo Data
US
SQL
Python
LLM
Staff AI Data Engineer
Rockerbox
Europe
AI
ML
Data Engineering
Senior Product Manager, AI Agents
Apollo.io
US
AI
ML
Python
Data Scientist, AI Engineering
Candidly
US
Python
LangGraph
PyTorch
Evaluation Strategy & Planning: Define and establish comprehensive evaluation strategies for new AI Agents, with a focus on test dataset integrity.
LLM Output Integrity Assessment: Manually and programmatically evaluate the quality of LLM-generated content against metrics.
Creating High-Fidelity Datasets: Design and curate high-quality test datasets, including challenging prompts.
The role also includes automating evaluation pipelines and performing root cause analysis.
Veeva Systems
Veeva Systems is a mission-driven organization and pioneer in industry cloud, helping life sciences companies bring therapies to patients faster.