Source Job

Europe 5w PTO

  • Improve model performance through data quality, curation, labeling, and evaluation.
  • Work on the data layer of Generative AI products involving images, video, or audio.
  • Design, build, and operate workflow orchestration systems and large-scale data processing pipelines.

Python Machine Learning Data Engineering Generative AI Data Processing

20 jobs similar to Data Scientist

Jobs ranked by similarity.

  • Design, build, and optimize high-performance systems in Python supporting AI data pipelines and evaluation workflows.
  • Develop full-stack tooling and backend services for large-scale data annotation, validation, and quality control.
  • Improve reliability, performance, and safety across existing Python codebases.

Alignerr connects top technical experts with leading AI labs to build, evaluate, and improve next-generation models. They work on real production systems and high-impact research workflows across data, tooling, and infrastructure.

Global

  • Design, build, and optimize high-performance systems in Python supporting AI data pipelines and evaluation workflows
  • Develop full-stack tooling and backend services for large-scale data annotation, validation, and quality control
  • Improve reliability, performance, and safety across existing Python codebases

Alignerr connects top technical experts with leading AI labs to build, evaluate, and improve next-generation models. We work on real production systems and high-impact research workflows across data, tooling, and infrastructure.

$141,487–$184,800/yr
Europe

  • Design scalable, future-proof data platforms optimized for AI research workloads.
  • Build efficient self-serve data processing pipelines leveraging GCP's advanced services.
  • Implement guardrails for cost, quality, and performance.

AssemblyAI is at the forefront of Speech AI, creating powerful models for speech-to-text and speech understanding via an API. They're a remote team of startup veterans and AI researchers looking to build one of the next great AI companies.

  • Design, build, and optimize high-performance systems in Python supporting AI data pipelines and evaluation workflows
  • Develop full-stack tooling and backend services for large-scale data annotation, validation, and quality control
  • Improve reliability, performance, and safety across existing Python codebases

Alignerr connects top technical experts with leading AI labs to build, evaluate, and improve next-generation models. They work on real production systems and high-impact research workflows across data, tooling, and infrastructure.

Looking for young talent ready to go all in. Offering significant equity to people who want to build something that matters. Define the future of AI in influencer marketing.

Influur is redefining how advertising works through creators, data, and AI, aiming to make influencer marketing as measurable, predictable, and scalable as paid ads.

Europe

  • Apply bleeding-edge AI theory to the design and implementation of large-scale data systems that feed AI agents and autonomous workflows.
  • Use data science techniques to fine-tune, evaluate, and optimize LLMs for marketing-specific tasks.
  • Build end-to-end automations using LLMs, internal data, and external signals to eliminate repetitive human tasks.

Rockerbox is building the next generation of marketing intelligence. They are looking for someone to help them build the AI systems everyone else just theorizes about.

Europe Unlimited PTO

Design, implement, and maintain scalable ETL/ELT pipelines using Python, SQL, and modern orchestration frameworks. Build and optimize data models and schemas for cloud warehouses and relational databases, supporting AI and analytics workflows. Lead large-scale data initiatives from planning through execution, ensuring performance, cost efficiency, and reliability.

This position is posted by Jobgether on behalf of a partner company.

  • Design, build, and optimize high-performance systems in Python supporting AI data pipelines and evaluation workflows
  • Develop full-stack tooling and backend services for large-scale data annotation, validation, and quality control
  • Improve reliability, performance, and safety across existing Python codebases

Alignerr connects top technical experts with leading AI labs to build, evaluate, and improve next-generation models. They work on real production systems and high-impact research workflows across data, tooling, and infrastructure.

US

  • Engage with leading LLM labs to advance LLMs across STEM domains
  • Define and understand data quality rubrics
  • Ship proactive data packs

Turing, based in San Francisco, is a research accelerator for frontier AI labs and a partner for global enterprises deploying advanced AI systems. The leadership team includes AI technologists from Meta, Google, Microsoft, Apple, Amazon, McKinsey, Bain, Stanford, Caltech, and MIT and is recognized by Forbes, The Information, and Fast Company among the world’s top innovators.

US

  • Design and maintain data models that organize rich content into canonical structures optimized for product features, search, and retrieval.
  • Build high-reliability ETLs and streaming pipelines to process usage events, analytics data, behavioral signals, and application logs.
  • Develop data services that expose unified content to the application, such as metadata access APIs, indexing workflows, and retrieval-ready representations.

Udio's success hinges on hiring great people and creating an environment where we can be happy, feel challenged, and do our best work.

North America Asia Unlimited PTO

  • Design, implement, and maintain distributed ingestion pipelines for structured and unstructured data.
  • Build scalable ETL/ELT workflows to transform, validate, and enrich datasets for AI/ML model training and analytics.
  • Support preprocessing of unstructured assets for training pipelines, including format conversion, normalization, augmentation, and metadata extraction.

Meshy is a leading 3D generative AI company transforming content creation by enabling the creation of 3D models from text and images. They have a global team distributed across North America, Asia, and Oceania and are backed by venture capital firms like Sequoia and GGV, with $52 million in funding.

Europe

  • Design and deliver data access patterns and ingestion workflows.
  • Work with technical roles and domain experts to turn AI use cases into production-ready capabilities.
  • Build lightweight ingestion or sync pipelines to bring priority datasets from various source systems into platforms that enable AI use.

RWS is building the next generation of AI-enabled capabilities across our products, internal production systems, and enterprise platforms.

US Global

  • Design, train, and refine large-scale 3D generative models, covering pre-training, post-training, and emerging paradigms in diffusion.
  • Bridge the gap between cutting-edge research and product by deploying models in real products used by millions of creators, using human feedback and creative evaluation.
  • Create novel model architectures to make 3D generation faster, higher-quality, and more controllable.

Meshy believes 3D creation should be boundless and accessible. They built a full pipeline for 3D content ranging from text / image to 3D, texturing, texture editing, animation rigging, etc. They are the market leader in 3D generative AI with a vibrant community.

US North America

  • Design complex LLM prompts that accurately represent real customer journeys and service interactions.
  • Partner with Field Engineers to transform raw data into structured, high-quality tasks for model training.
  • Annotate and review tasks to ensure strict quality standards and alignment with expected customer outcomes.

Welo Data works with technology companies to provide datasets that are high-quality, ethically sourced, relevant, diverse, and scalable to supercharge their AI models.

$85,000–$225,000/yr
US Canada

This role validates Veeva AI Agents through evaluation. You will define strategies for new AI Agents. The role involves analysis of model behaviors to identify defects.

Veeva Systems is a mission-driven organization and pioneer in industry cloud, helping life sciences companies bring therapies to patients faster.

Europe

  • Prototype, iterate, and ship algorithms to production in close collaboration with Product, Data Engineering, and Software teams.

Mirakl provides eCommerce software solutions that enable enterprises to drive growth and efficiency in their online business. With over 350 employees in France and offices in 7 countries, Mirakl is considered a Great Place to Work company that is pioneering the platform economy.

$104,000–$156,000/hr

  • Design, build, and optimize high-performance systems in Python supporting AI data pipelines and evaluation workflows
  • Develop full-stack tooling and backend services for large-scale data annotation, validation, and quality control
  • Improve reliability, performance, and safety across existing Python codebases

Alignerr connects top technical experts with leading AI labs to build, evaluate, and improve next-generation models. They work on real production systems and high-impact research workflows across data, tooling, and infrastructure.

Australia New Zealand

  • Act as a solution expert across ML domains including evaluations, training, inference, data pipelines, quality, and optimisation.
  • Work directly alongside product teams as a trusted partner, helping them navigate technical challenges and arrive at effective solutions.
  • Develop blueprints, patterns, and paved roads that allow other teams to follow proven approaches and accelerate their own implementations.

Canva is a design platform that enables users to create professional designs. They have a flagship campus in Sydney, a second campus in Melbourne, and co-working spaces in other locations, with a flexible work environment.

Europe

  • Train, test, and ship models that power Peec AI’s recommendations.
  • Develop algorithms that extract actionable insights from AI search behavior.
  • Own the full model lifecycle from experimentation to production deployment.

Peec AI helps customers boost their visibility in AI search. They are a fast-growing Series A startup in Europe.

Global

  • Define the vision and feature set for our internal Robotics Data Platform.
  • Act as the Product Owner for a dedicated team of software engineers.
  • Participate in discussions with leading robotics labs and foundation model builders.

Turing is the world’s leading research accelerator for frontier AI labs and a trusted partner for global enterprises looking to deploy advanced AI systems. Recognized by Forbes, The Information, and Fast Company among the world’s top innovators, Turing’s leadership team includes AI technologists from Meta, Google, Microsoft, Apple, Amazon, McKinsey, Bain, Stanford, Caltech, and MIT.