Source Job

US

  • Lead the data engineering work for our research portal migration.
  • Design and build production-grade RAG pipelines over AGCO’s research archive.
  • Partner with the CTO, product team, and application developers to translate business requirements into sound data.

ETL ELT AWS

20 jobs similar to Data Engineer — AI & Data Infrastructure

Jobs ranked by similarity.

Europe

  • Build pipelines to load data from various systems into Dataiku via S3 or Snowflake.
  • Increase the robustness of existing production pipelines, identify bottlenecks, and set up robust monitoring, testing processes, and documentation templates.
  • Build custom applications and integrations to automate manual tasks related to customer operations, helping Product Operations / Support / SRE in their day-to-day activities.

Dataiku is the Platform for AI Success, the enterprise orchestration layer for building, deploying, and governing AI. The world’s leading companies rely on Dataiku to operationalize AI and run it as a true business performance engine delivering measurable value.

$120,000–$150,000/yr
US Unlimited PTO

  • Help build scalable data solutions and streamline data ingestion.
  • Maintain high-quality databases that support our scientific and operational teams.
  • Optimize our data infrastructure to ensure efficient data access.

Funga is a public benefit corporation addressing the climate crisis by harnessing forest fungal networks. They are a team of passionate scientists and builders working to draw down at least three gigatons of carbon dioxide from the atmosphere by 2050.

US Europe

  • Become a trusted data and AI advisor to clients, helping them translate business questions into AI-ready data architectures.
  • Design and implement AI-optimized data platforms, including cloud data warehouses, ETL/ELT pipelines, and analytic layers.
  • Engineer modern ELT/ETL pipelines that handle structured, semi-structured, and unstructured data to support AI and analytics use cases.

Aimpoint Digital is a dynamic and fully remote data and analytics consultancy. They work alongside the most innovative software providers in the data engineering space to solve their clients' toughest business problems and believe in blending modern tools and techniques with tried-and-true principles to deliver optimal data engineering solutions.

Europe

  • Organize and structure data systems at both macro and micro levels, designing and implementing data architectures that support business goals; optimize data pipelines for performance, reliability, and scalability
  • Design, build, and maintain scalable ETL/ELT pipelines with Airflow to process large-scale, complex datasets
  • Demonstrate the ability to deliver data products useful for machine learning and AI research and development (data models, metadata, and semantics)

Owkin is an AI company on a mission to solve the complexity of biology. It is building the first Biology Super Intelligence (BASI) by combining powerful biological large language models, multimodal patient data, and agentic software.

$135,500–$200,000/yr
US

  • Architect, design, implement, and operate end-to-end data engineering solutions using Agile methodology.
  • Develop and manage robust data integrations with external vendors and organizations (including complex API integrations).
  • Collaborate closely with Data Analysts, Data Scientists, DBAs, and cross-functional teams to understand requirements and deliver high-impact data solutions.

SmartAsset is an online destination for consumer-focused financial information and advice, whose mission is helping people make smart financial decisions, reaching over an estimated 59 million people each month. A successful $110 million Series D funding round in 2021 valued the company at over $1 billion.

$200,000–$240,000/yr
US Unlimited PTO

  • Lead and grow a team of data engineers responsible for SentiLink’s data platform and infrastructure.
  • Define and drive the technical vision for data ingestion, processing, storage, and serving systems.
  • Design and evolve scalable data pipelines (batch and real-time) to support product and data science use cases.

SentiLink provides identity and risk solutions. They empower institutions and individuals to transact with confidence. They have grown quickly and are backed by world-class investors.

Europe

  • Lead Agent Development: Drive the development of Owkin’s Data Transformation Agent (DTA).
  • Orchestrate Data Workflows: Design, implement, and maintain complex data transformation workflows.
  • Ensure Code Excellence: Define and enforce robust engineering practices.

Owkin is an AI company on a mission to solve the complexity of biology. They are building the first Biology Super Intelligence (BASI) by combining powerful biological large language models, multimodal patient data, and agentic software.

Europe

  • Own the architecture and delivery of production-grade LLM systems and classical ML solutions.
  • Design, evaluate, and optimize RAG pipelines (retrieval strategy, chunking, indexing, monitoring).
  • Build scalable, production-grade LLM services and agentic workflows, alongside traditional ML systems where appropriate.

Hiflylabs is a team of 250+ data and tech enthusiasts based in Budapest. They focus on data engineering, data science, artificial intelligence and application development, working on a wide range of projects around the world. Hiflylabs values its people and is committed to nurturing their personal and professional development through a mentoring system.

$147,900–$203,000/yr
US 4w PTO

  • Design data transformation pipelines that convert raw health signals, user inputs, and third-party data into structured, queryable context
  • Architect event-driven ingestion using tools like Kinesis, EventBridge, and SQS - handling duplicate events, traffic spikes, and partial failures gracefully
  • Define flexible data models and schemas for a hierarchical health context ontology that evolves as new context types emerge

Oura is dedicated to empowering individuals to take ownership of their inner potential. They provide award-winning products that enable their global community to gain deeper insights into their readiness, activity, and sleep quality through the Oura Ring and its connected app, enhancing the health and lives of millions.

Europe Asia

  • Create innovative solutions for handling petabytes of data with billions of rows and joins.
  • Create real-time and offline feature generation pipelines, and manage our data infrastructure so that it is reliable and fast.
  • Develop and productionize data pipelines for our ML models in both bare-metal and cloud environments.

Kayzen is a mobile demand-side platform (DSP) dedicated to democratizing programmatic advertising. They enable leading apps, agencies, media buyers, and brands to run programmatic customer acquisition, retargeting, and brand performance campaigns through their self-serve and managed service options.

US

  • Design and implement core backend systems and integrations that power the product.
  • Design and build our data platform (orchestration, pipelines, developer workflows).
  • Create scalable systems for data ingestion, transformation, and access.

Avantos is building the AI-native operating system for financial services, transforming fragmented data into a single, intelligent system. They power workflows, automation, and decision-making as a product-led, fast-moving team in AI, fintech, and infrastructure.

Europe

  • Develop and deploy LLM-based solutions and RAG architectures.
  • Contribute to the end-to-end lifecycle of AI features.
  • Integrate AI solutions into the company's cloud infrastructure.

Remote People is building the infrastructure to power borderless teams. Their technology handles global payroll, benefits, taxes, and compliance, enabling businesses to hire anyone anywhere compliantly. They are committed to building a global, diverse team representing different backgrounds, perspectives, and experiences.

$172,000–$254,000/yr
US Canada

  • Collaborate with product managers, data analysts, and machine learning engineers to develop pipelines and ETL tasks.
  • Establish data architecture processes and practices that can be scheduled, automated, replicated and serve as standards.
  • Manage individual Data Engineers to foster learning, growth and success at Doximity.

Doximity is transforming the healthcare industry with a mission to help every physician be more productive and provide better care for their patients. As medicine's largest network in the United States, they are committed to building diverse teams with an inclusive culture.

$35–$50/hr
Global

  • Design and implement LLM-powered application workflows
  • Architect retrieval-augmented generation pipelines
  • Collaborate with backend architects to integrate AI services into APIs

They are seeking a hands-on AI Engineer with deep expertise in Large Language Model integration and production AI systems. The company's culture sounds innovative and collaborative, focusing on building scalable and secure AI applications.

$195,000–$215,000/yr
US 12w maternity

  • Lead engineering teams to ship high-quality features and maintain the Grantmaker platform, owning timelines and quality.
  • Build and inspire a high-performing distributed engineering team, setting clear expectations and fostering a motivating environment.
  • Drive architectural consistency, cross-team collaboration, and operational excellence, shaping architecture and strategy.

Fluxx is a mission-driven business that operates a cloud platform, enabling the end-to-end grantmaking process for funders and doers. They are committed to building a team of outstanding individuals with diverse backgrounds and perspectives and do not specify employee numbers.

$180,000–$220,000/yr
US Unlimited PTO 14w maternity

  • Design, build, and maintain databases that power Hologram's operations.
  • Build and maintain ETL pipelines that move and transform data reliably.
  • Audit existing pipelines and data models, identify complexity, and refactor bad decisions.

Hologram is building the future of IoT connectivity, delivering internet access to millions of connected devices worldwide. They process over 5 billion transactions per month across their global infrastructure and value a fun, upbeat, remote-first team united by their mission.

Global

  • Lead a team of engineers building Postscript’s Customer Data Platform from the ground up
  • Collaborate with product managers, data science, business stakeholders, designers, and other engineering teams to plan and execute work
  • Consistently provide ongoing technical mentorship and general professional development for your direct reports and other engineers at Postscript

Postscript gives e-commerce brands the tools they need to run a world-class SMS marketing program. They are a 100% remote organization with more than 250 employees, backed by Greylock, Y Combinator, and other top investors.

US

  • Manage a team of engineers and managers responsible for Doxel’s customer-facing application.
  • Be a technical leader, setting the vision for the full-stack architecture and partnering closely with product managers and designers.
  • Organize and grow your team to meet the needs of the business and help team members progress on their career paths.

Doxel brings computer vision and AI to construction, giving teams real-time visibility into progress, risk, and execution. They are backed by Insight Partners and Andreessen Horowitz and have a rapidly growing team of engineers, scientists, construction veterans, and enterprise go-to-market teams.

$120,000–$160,000/yr
US

  • You will join a team of talented engineers working closely with Data Scientists to build and scale our next-generation Ad EnGage data pipeline.
  • You will work with large-scale datasets (hundreds of TBs to petabyte-scale systems) using a modern data stack centered on AWS, Airflow, dbt, and Snowflake.
  • You’ll contribute to building reliable, high-quality data pipelines and improving the performance, scalability, and observability of our data platform.

EDO is the TV outcomes company. Their leading measurement platform connects convergent TV airings to the ad-driven consumer behaviors most predictive of future sales. They are headquartered in New York City and Los Angeles with an office space in San Francisco and recognize the benefits of hybrid working.

$155,000–$165,000/yr
US

  • Design end-to-end Retrieval-Augmented Generation (RAG) architecture.
  • Define chunking strategies based on content type, semantic coherence, and use case requirements.
  • Build metadata schemas, tagging frameworks, and document structures to optimize retrieval precision.

Great Day Improvements is a direct-to-consumer provider of premium home improvement products. They have over 4,800 employees across 130 metropolitan markets throughout the U.S. and they continue to rank among the top home improvement companies nationwide.