Jobs Similar to Senior AI Quality/Evaluation Engineer | TangerineFeed

Senior AI Quality/Evaluation Engineer

IDC 22 hours ago

$73,000–$104,390/yr

North America 3w PTO 1w paternity

Design and build the evaluation infrastructure that ensures the platform's AI systems produce accurate, well-sourced, high-quality responses
Build automated test suites that validate answer quality across agent pipeline changes
Develop regression detection systems that catch quality degradation before it reaches users

Python LLM Machine Learning

20 jobs similar to Senior AI Quality/Evaluation Engineer

Jobs ranked by similarity.

Senior AI Engineer

League 24 days ago

$111,888–$128,633/yr

Canada US

Design and build production-grade AI systems, including RAG pipelines, multi-step agents, and LLM-powered features.
Build comprehensive evaluation and observability frameworks to measure model accuracy, grounding, and quality drift.
Create production-quality Python services to wrap AI logic into secure microservices.

League, founded in 2014, is the leading healthcare consumer experience (CX) platform powered by AI, reaching over 63 million people globally. Payers, providers, and consumer health partners use League’s platform to deliver high-engagement healthcare solutions and improve health outcomes.

Senior AI Engineer

Osano 20 days ago

US Unlimited PTO

Architect and deploy autonomous AI agents and multi-agent workflows.
Design strict-source-following Retrieval-Augmented Generation (RAG) systems.
Build scalable backend services using FastAPI.

Osano is an innovative B-Corporation focused on giving modern enterprises the ability to innovate quickly and earn customer trust by respecting data privacy and complying with consent guidelines. We are scaling fast with a multi-year runway and ambitious growth plans.

Manager, AI Operations & Evaluation

Chime 9 days ago

$150,000–$208,000/yr

US

Lead the AI Evaluation team, owning staffing, coaching, performance management, and delivery of evaluation and testing frameworks.
Manage the AI evaluation lifecycle — including pre-launch testing, simulation, and post-deployment health monitoring — ensuring alignment with governance standards and expectations.
Create domain-specific evaluation tracks (e.g., Compliance & Risk, Bot Experience, Agent Experience) to assess AI quality from multiple perspectives.

Chime is a financial technology company that believes everyone can achieve financial progress. They are a team of problem solvers, dreamers, and builders with one shared obsession: their members.

Applied AI Evaluation Scientist

Jump 24 days ago

US

Design and curate evaluation datasets for retrieval quality.
Measure retrieval quality using metrics like Recall@k, Precision@k, MRR, and NDCG@k.
Conduct systematic error analysis on AI/ML system outputs; build structured failure taxonomies.

Jump empowers financial advisors, firms, and clients to thrive in the age of AI by automating tasks like meeting prep and compliance. As a Series A company, Jump has raised $30M and grown to 100+ employees including leaders from top companies and schools, fostering a culture of velocity, world-class standards, direct communication, and kindness.

Senior Machine Learning Engineer - Integrations (AU remote)

Canva 2 days ago

Australia

Design and optimise AI-ready tools and APIs that enable LLM platforms to reliably interact with Canva's design capabilities.
Build and maintain evaluation frameworks to systematically measure tool-use accuracy across platforms.
Experiment with LLM orchestration and agent architectures, developing Canva agents that any third-party provider can call to design quickly, efficiently, and at scale.

Canva is a platform redefining how the world experiences design. They have a flagship campus in Sydney, with a second campus in Melbourne and co-working spaces in Brisbane, Perth, Adelaide, and Auckland, NZ.

Senior AI Engineer

Senior AI Engineer 25 days ago

$43–$50/hr

Global

Co-create evaluation frameworks, proctoring solutions, and critical security mechanisms.
Evaluate and implement state-of-the-art AI/ML techniques.
Design, build, and deploy scalable AI services and pipelines.

The company develops and scales a global Certification-as-a-Service platform that automates the entire lifecycle of professional exams. The solution enables companies and organizations to quickly and inexpensively create and administer official online exams/certifications.

AI Product Engineer

RevenueCat 5 days ago

5w PTO

Build and ship agentic features across the RevenueCat universe
Design and implement tool integrations that expand what agents can see and do
Own the reliability and quality of agent responses

RevenueCat is a monetization platform for mobile apps, helping developers understand and grow their revenue by removing the headaches of building and scaling in-app subscriptions. They are a remote-first company of 120+ employees across 25 countries, valuing customer obsession and continuous improvement.

Staff AI Systems Engineer — Agentic Platforms

Kindo 25 days ago

$210,000–$260,000/yr

US

You will define, build, and evolve foundational systems that enable autonomous agents to operate reliably in production.
You’ll explore new approaches, prototype quickly, and turn what works into durable platform foundations.
You’ll identify high-leverage architectural improvements, abstractions, and guardrails that expand what the platform can do while keeping it reliable, secure, observable, and maintainable under real-world conditions.

Kindo is an agent automation platform for DevOps and SecOps teams, helping organizations automate high-friction operational work using autonomous agents. They are a small, highly technical team with strong customer traction and real enterprise revenue, where engineers have direct ownership over critical systems.

Staff LLM Interaction Engineer

n8n 22 days ago

Europe US 3w PTO

Architect and implement AI-assisted workflow building capabilities.
Design systems that provide LLMs with the right context: workflow state, user intent, constraints, and history.
Build tool-driven agent behaviour: calling internal tools, validating outputs, correcting mistakes, and recovering gracefully

n8n is the open workflow orchestration platform built for the new era of AI. They give technical teams the freedom of code with the speed of no-code, so they can automate faster, smarter, and without limits. Since their founding in 2019, they’ve grown into a diverse team of over 220 working across Europe and the US, connected by a shared builder spirit and with their centre of gravity in Berlin.

Senior Machine Learning Engineer - Canva AI (AU remote)

Canva 17 days ago

Australia New Zealand

Building a truly flexible and scalable conversational AI platform.
Fine-tuning and evaluating LLM-based models to improve performance.
Contributing to platform engineering across both ML and backend systems.

Canva is a design platform that allows users to create social media graphics, presentations, posters, documents and other visual content. They have a campus in Sydney, and a second campus in Melbourne and co-working spaces in Brisbane, Perth, Adelaide, and Auckland, NZ.

Applied AI Engineer

Social Discovery Group 3 days ago

Global 6w PTO

Build and ship AI-powered product features using LLMs and generative models
Develop and maintain services and APIs around ML models
Integrate AI models into production systems and user-facing applications

Social Discovery Group (SDG) is the 3rd largest social discovery company in the world, uniting 60+ brands with 500 million users. They transform virtual intimacy into the new normal by solving the problems of loneliness, isolation, and disconnection. Their international team of 1200 professionals and digital nomads works all over the world.

AI Language Engineer

Cresta 4 days ago

$90,000–$160,000/yr

US Unlimited PTO

Design, develop, and refine large language model workflows to steer and improve model behaviors.
Build language processing components for intent detection, summarization and conversational response quality.
Drive R&D-style exploration on cutting-edge speech and language systems, rapidly prototyping novel approaches.

Cresta's platform combines AI and human intelligence to help contact centers discover customer insights and behavioral best practices, automate conversations, and empower team members. They are led by founders with experience at Google, Waymo, and OpenAI, and are on a mission to revolutionize the workforce with AI.

AI Prompt Engineer

Brainscape 29 days ago

$40–$100/hr

Global

Migrate and test existing bulk flashcard creation prompts.
Run test suites and manually review AI outputs for quality and correctness.
Analyze real user data to identify failure patterns and improve prompts.

Brainscape is the world's leading web & mobile EdTech study platform. They help millions of learners create better flashcards, and the company is looking for an AI Prompt Engineer to join its team.

Senior AI Systems Engineer — Agentic Platforms

Kindo 26 days ago

$170,000–$220,000/yr

US

You will design, build, and operate core systems that enable autonomous agents to function reliably in production.
You’ll build production-grade agentic workflows, retrieval and memory systems, multi-model execution, and tool-calling integrations that interact safely with enterprise systems.
You’ll explore new approaches, prototype quickly, and turn what works into durable production systems.

Kindo is an agent automation platform for DevOps and SecOps teams. They help organizations automate high-friction operational work using autonomous agents. Kindo is a small, highly technical team with strong customer traction and real enterprise revenue.

Machine Learning Engineer

Weave 5 days ago

US

Design and develop machine learning infrastructure, tooling, and models to help teams deliver world-class experiences.
Help product and development teams understand the data lifecycle and the inherent experimental nature of machine learning.
Build internal products and platforms to enable teams to incorporate AI into their features and customer facing products.

Weave provides an all-in-one platform for small businesses to streamline communications and patient experiences. The company has a phenomenal culture, and its teams are cross-functional and agile, each composed of a product owner, backend and frontend devs, and devops.

Lead AI Engineer

Webflow 15 days ago

$204,500–$290,000/yr

US

Serve as the primary AI engineering partner to the CEO and executive leadership team, translating ideas into production-ready AI agents.
Independently take ideas from concept to production, shaping problem statements and operationalizing solutions.
Develop production-grade AI systems using modern LLMs, with strong attention to scalability and clean engineering practices.

Webflow is building the world’s leading AI-native Digital Experience Platform as a remote-first company. Their mission is to bring development superpowers to everyone and empower teams to design, launch, and optimize for the web without barriers.

Applied AI Engineer

Human Agency 19 days ago

US Canada

Ship AI-powered products and tools from zero to production.
Architect systems that scale beyond demos.
Work across the full stack.

Human Agency partners with organizations of all sizes to explore, design, and implement AI strategies that are secure, scalable, and human-centered. They are scaling rapidly and have a growing pipeline of opportunities that demand exceptional talent across disciplines.

Staff Machine Learning Engineer

Canva 2 days ago

Australia

Drive the design and evolution of AI-ready tools and APIs for LLM platforms.
Own and evolve evaluation frameworks that measure tool-use accuracy across platforms.
Shape Canva's agent architecture, making strategic technical decisions about intelligence location.

Canva is a design platform that enables users to create various visual content. They have offices in multiple locations in Australia and New Zealand, and they offer a flexible work environment.

Senior AI Engineer

Finom 3 days ago

Europe

Build and ship AI-powered product and internal solutions using LLMs, RAG, tool calling, workflows, and agentic patterns
Design quality and evaluation frameworks for AI systems, including offline evals, online signals, failure analysis, and continuous improvement loops
Contribute to AI platform and tooling decisions that improve reuse, speed, and consistency across teams

Finom is a European tech startup headquartered in Amsterdam, revolutionizing the financial landscape for entrepreneurs. They develop an all-in-one financial B2B solution integrating banking, accounting, financial management, and invoicing into a mobile-first platform, and nurture innovation in an inspiring work environment.

QA Engineer, Open Source AI

Backblaze 14 days ago

US

Execute structured test plans across open source repositories, sample applications, SDK extensions, and AI workflow integrations.
Perform functional, integration, and regression testing on frameworks, applications, notebooks, scripts, APIs, and reference implementations.
Validate reproducibility of AI workflows in Jupyter and Google Colab environments.

Backblaze is the object storage leader in the open cloud movement, fueling customer success with cloud storage built purposefully to unlock budgets, unburden administrators, and unleash innovators. Founded in 2007, they scaled the business and today generate over $136M ARR managing over three billion gigabytes of data storage for 500K+ customers in 175+ countries.
