Source Job

$195,000–$308,000/yr
US Unlimited PTO 12w maternity

  • Design and implement comprehensive evaluation frameworks that reflect real-world task success for agentic systems, with a focus on human+AI collaboration outcomes
  • Build benchmarking pipelines that capture nuanced success indicators including trust calibration, intervention frequency, and agent handoff quality
  • Collaborate with researchers, engineers, and product teams to align evaluation methodologies with business and user goals

Python SQL AI Machine Learning

20 jobs similar to Sr Lead Machine Learning Engineer

Jobs ranked by similarity.

US

  • Design, develop, and maintain a robust platform to enable users to create and manage AI agents and their interactions.
  • Integrate and work with multiple LLMs, ensuring seamless orchestration and scalability for both individual and coordinated agent operations.
  • Leverage orchestration frameworks such as LangGraph to build complex workflows and pipelines that support diverse agent functionalities, including frameworks for multi-agent coordination.

ClickUp is building the future of work by creating a converged AI workspace that unifies tasks, docs, chat, calendar, and enterprise search. Their AI-powered platform helps teams break free from silos and unlock new levels of productivity.

$85,000–$225,000/yr
US Canada

This role validates Veeva AI Agents through evaluation. You will define strategies for new AI Agents. The role involves analysis of model behaviors to identify defects.

Veeva Systems is a mission-driven organization and pioneer in industry cloud, helping life sciences companies bring therapies to patients faster.

$230,000–$300,000/yr
US

  • Design, develop, and deploy agentic AI solutions for clients.
  • Build multi-agent systems and integrate models with enterprise systems.
  • Collaborate with clients and engineers to create scalable solutions.

AHEAD builds platforms for digital business, weaving together advances in cloud infrastructure, automation, analytics, and software delivery to help enterprises deliver on digital transformation. They prioritize creating a culture of belonging where all perspectives are valued and heard.

US North America

  • Design complex LLM prompts that accurately represent real customer journeys and service interactions.
  • Partner with Field Engineers to transform raw data into structured, high-quality tasks for model training.
  • Annotate and review tasks to ensure strict quality standards and alignment with expected customer outcomes.

Welo Data works with technology companies to provide datasets that are high-quality, ethically sourced, relevant, diverse, and scalable to supercharge their AI models.

$187,000–$250,000/yr
US

  • Own and execute a strategic roadmap for AI research, messaging, and context capabilities.
  • Enhance Apollo's AI research agents to surface actionable insights from the web.
  • Define how AI understands each user's business, transforming generic AI outputs into relevant recommendations.

Apollo.io is the leading go-to-market solution for revenue teams, trusted by over 500,000 companies and millions of users globally.

Build resilient AI Agents using LangGraph and microservices. Develop complex automation workflows in n8n. Collaborate with Internal Business Analysts to focus on coding, not guessing requirements.

Gcore is a global provider of infrastructure and software solutions for AI, cloud, network, and security. At Gcore, you’ll help design and deliver the foundation for an AI-driven world.

US

  • Designing, developing, and deploying generative AI models.
  • Architecting and building agentic systems with autonomous decision-making capabilities.
  • Integrating generative AI and agentic solutions into existing products and services.

Jobgether is a partner company that focuses on connecting talent with the right job opportunities. Their AI-powered matching process ensures applications are reviewed quickly, objectively, and fairly against the role's core requirements.

North America Canada

  • Lead domain-specific model optimization using PEFT (LoRA/QLoRA) and knowledge distillation to balance cost, latency, and reasoning capability.
  • Build next-gen Retrieval-Augmented Generation pipelines using hybrid search, cross-encoders, and self-correcting retrieval loops.
  • Design and deploy multi-agent systems using frameworks like LangGraph or CrewAI, enabling autonomous task planning and tool use (function calling).

ServiceNow is a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500®. Their intelligent cloud-based platform seamlessly connects people, systems, and processes to empower organizations to find smarter, faster, and better ways to work.

  • Evaluate AI model outputs related to artists, performers, and athletes.
  • Develop prompts for AI models reflecting your field of expertise.
  • Deliver feedback to strengthen the model’s understanding of workplace tasks and language.

Handshake is recruiting Agents and Business Managers of Artists, Performers, and Athlete Professionals to contribute to an hourly, temporary AI research project.

$150,000–$220,000/yr
US Unlimited PTO

  • Incorporating the best research work on agents and code generation into the OpenHands framework
  • Performing novel improvements in areas of interest to improve agent performance and efficiency
  • Running and implementing evaluations to ensure agent quality

OpenHands is building an open-source AI platform that empowers engineering teams to accelerate development, automate workflows, and integrate intelligent coding assistance into real-world software delivery. The company fosters a culture built on kindness, candor, autonomy, and learning.

US UK

As a Principal Decision Scientist, you will define high-level business objectives directly with clients, then develop and execute the project plan to meet those objectives. You will provide technical leadership to guide development work across teams while also owning and delivering specific technical components yourself. You will design and develop feature engineering pipelines, build ML & AI infrastructure, deploy models, and orchestrate advanced analytical insights.

Aimpoint Digital is a premier analytics consulting firm with a mission to drive business value for clients through expertise in data strategy, data analytics, decision sciences

$125,600–$157,000/yr
US

  • Design, build, and scale enterprise-grade AI/ML systems that power internal workflows and external-facing AI/ML platforms.
  • Develop a production-ready Generative AI and MLOps platform with reusable components used to deploy multiple AI solutions across Natera’s business units.
  • Implement cloud-native infrastructure for large-scale model training and serving using Kubernetes, MLflow, Terraform, and AWS-native services.

Natera is a global leader in cell-free DNA (cfDNA) testing. They are dedicated to oncology, women’s health, and organ health, aiming to make personalized genetic testing and diagnostics part of the standard of care. The Natera team consists of highly dedicated statisticians, geneticists, doctors, laboratory scientists, business professionals, software engineers and many other professionals from world-class institutions.

US Unlimited PTO

  • Design, develop, and test AI agents to support business objectives and improve operational outcomes.
  • Integrate agents with enterprise data sources, APIs, and workflows to ensure seamless functionality.
  • Translate evolving AI capabilities into actionable business and sales use cases.

Highstreet is developing next-generation agentic AI solutions that empower public sector and education (SLED) clients to achieve real-world business outcomes. The company seems to have a modern, flexible workplace culture built for collaboration and growth.

  • Leverage professional experience to evaluate AI models' output in your field.
  • Assess content and deliver feedback to strengthen the model’s understanding.
  • Work independently from anywhere, with flexible hours and no minimum commitment.

Handshake is a recruiting platform. They connect students and recent graduates with employers.

  • Design, build, and optimize high-performance systems in Python supporting AI data pipelines and evaluation workflows.
  • Develop full-stack tooling and backend services for large-scale data annotation, validation, and quality control.
  • Improve reliability, performance, and safety across existing Python codebases.

Alignerr connects top technical experts with leading AI labs to build, evaluate, and improve next-generation models. They work on real production systems and high-impact research workflows across data, tooling, and infrastructure.

Latin America

  • Design, develop, and deploy automation workflows that reduce manual effort and improve accuracy across back-office functions.
  • Apply artificial intelligence, machine learning, and modern automation tools to solve complex operational challenges and unlock new efficiencies.
  • Collaborate with Client Operations, Finance, People, and other G&A teams to identify automation opportunities, gather requirements, and deliver solutions.

Engine is transforming business travel into something personalized, rewarding, and simple. They are building a platform that brings together corporate travel, a powerful charge card, and modern spend management in one place. More than 20,000 companies already rely on Engine.

Global

  • Evaluate AI model outputs in your field.
  • Assess content related to your field of work.
  • Deliver feedback to strengthen AI understanding.

Handshake is connecting students, new grads, and young professionals with job opportunities. They aim to close the opportunity gap and ensure everyone has equal access to meaningful employment.

US

  • Implement AI-enabled backend services within a secure, cloud-native microservices environment.
  • Design and scale Intelligent Document Processing (IDP) pipelines to extract and validate data from claims, authorizations, and medical documentation.
  • Integrate LLMs and intelligent agents into clinical and claims workflows to streamline patient and provider interactions.

EZ Labs is committed to transforming healthcare delivery through technology, innovation, and compassion. They partner with care teams, payers, and providers to improve how patients experience care. The company integrates advanced analytics and secure platforms to enable smarter decisions.