Design, build, and optimize high-performance systems in Python supporting AI data pipelines and evaluation workflows.
Develop full-stack tooling and backend services for large-scale data annotation, validation, and quality control.
Improve reliability, performance, and safety across existing Python codebases.
Alignerr connects top technical experts with leading AI labs to build, evaluate, and improve next-generation models. They work on real production systems and high-impact research workflows across data, tooling, and infrastructure.
Design complex LLM prompts that accurately represent real customer journeys and service interactions.
Partner with Field Engineers to transform raw data into structured, high-quality tasks for model training.
Annotate and review tasks to ensure strict quality standards and alignment with expected customer outcomes.
Welo Data works with technology companies to provide datasets that are high-quality, ethically sourced, relevant, diverse, and scalable to supercharge their AI models.
Set program quality goals, roadmap, and operating rhythms.
Manage a team of Analysts and Coordinators; hire, coach, and run performance cycles.
Co-own client governance with Ops; align on scope, priorities, and changes.
Welo Data provides high-quality, ethically sourced, relevant, diverse, and scalable datasets to technology companies to supercharge their AI models. As a Welocalize brand, they bring together a global community of over 500,000 AI training and domain experts.
Set client QA strategies and adapt to scope/volume changes.
Run root-cause analyses; drive CAPA plans with owners, timelines, and effectiveness checks.
Plan training & certification for raters/annotators and coordinators; track completion and impact.
Welo Data provides high-quality, ethically sourced, relevant, diverse, and scalable datasets to technology companies to supercharge their AI models. As a Welocalize brand, Welo Data leverages over 25 years of experience and brings together a curated global community of over 500,000 AI training and domain experts.
Engage with leading LLM labs to advance LLMs across STEM domains
Define and understand the data quality rubric
Ship proactive data packs
Turing, based in San Francisco, is a research accelerator for frontier AI labs and a partner for global enterprises deploying advanced AI systems. The leadership team includes AI technologists from Meta, Google, Microsoft, Apple, Amazon, McKinsey, Bain, Stanford, Caltech, and MIT and is recognized by Forbes, The Information, and Fast Company among the world’s top innovators.
Design scalable, future-proof data platforms optimized for AI research workloads.
Build efficient self-serve data processing pipelines leveraging GCP's advanced services.
Implement guardrails for cost, quality, and performance.
AssemblyAI is at the forefront of Speech AI, creating powerful models for speech-to-text and speech understanding via an API. They're a remote team of startup veterans and AI researchers looking to build one of the next great AI companies.
Design and implement comprehensive evaluation frameworks that reflect real-world task success for agentic systems, with a focus on human+AI collaboration outcomes
Build benchmarking pipelines that capture nuanced success indicators including trust calibration, intervention frequency, and agent handoff quality
Collaborate with researchers, engineers, and product teams to align evaluation methodologies with business and user goals
Upwork is the world’s human and AI-powered work marketplace that connects businesses with highly skilled, AI-enabled independent talent from across the globe. From entrepreneurs to Fortune 100 enterprises, companies rely on Upwork’s trusted platform to find and hire expert talent. They have facilitated more than $25 billion in economic opportunity for talent around the world and their culture is built on trust, risk-taking, customer focus, and excellence.
This role validates Veeva AI Agents through evaluation. You will define strategies for new AI Agents. The role involves analysis of model behaviors to identify defects.
Veeva Systems is a mission-driven organization and pioneer in industry cloud, helping life sciences companies bring therapies to patients faster.
Own the product lifecycle for AI-based decision-support tools, including roadmap planning, feature prioritization, and technical configuration
Serve as lead system prompt engineer, creating and maintaining instructions for LLM-based products
Collaborate with clients, internal teams, and stakeholders to conceptualize, test, and implement new features
Jobgether uses an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. The system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company.
Evaluate traditional, alternative, transactional, and raw datasets for use in underwriting, portfolio management, collections, and fraud.
Lead quantitative due diligence for M&A targets and data partnerships, assessing data quality, depth, coverage, stability, and scalability.
Design and implement validation frameworks to measure predictive lift, segmentation value, and incremental performance versus incumbent data.
Experian is a global data and technology company, powering opportunities for people and businesses around the world. As a FTSE 100 Index company listed on the London Stock Exchange (EXPN), they have a team of 22,500 people across 32 countries, and their corporate headquarters are in Dublin, Ireland.
Monitoring and improving the quality assurance process to ensure agreed-upon standards and procedures are followed.
Evaluating models and identifying where improvements in accuracy are required.
Netomi is the leading agentic AI platform for enterprise customer experience, working with the largest global brands to enable agentic automation at scale.