Source Job

US · LATAM

  • Assist in building and maintaining data pipelines.
  • Support data cleaning, validation, and quality assurance.
  • Contribute to data modeling and preparation for analytics and AI.

Python · SQL · Data Structures · Databases · ETL

20 jobs similar to Data Engineer Intern

Jobs ranked by similarity.

Latin America

  • Design, build, and maintain scalable data pipelines using Python and Airflow.
  • Develop and optimize ETL/ELT processes for structured and unstructured data.
  • Collaborate with data science teams to support machine learning workflows.

Oowlish is a rapidly expanding software development company in Latin America. They foster a nurturing work environment, are certified as a Great Place to Work, and provide opportunities for professional development and international impact.

$106,000–$120,000/yr
US

  • Lead the technical onboarding of partner institutions onto UDTS.
  • Design, build, and maintain scalable data pipelines and architectures.
  • Collaborate with team members to set engineering standards and guide data infrastructure strategy.

DataKind is a non-profit organization that uses data science and AI to address global challenges. They work with various sectors like health, humanitarian action, climate, economic opportunity, and education to create data-driven tools.

Europe

  • Develop engineering expertise within the Dataiku Platform to help maintain and develop system integrations, platform automations, and platform configurations.
  • Build and maintain Python and SQL data replication and data pipelines on large and often complex data sets.
  • Identify opportunities for improvement and optimization for greater scalability and delivery velocity.

Dataiku is the Platform for AI Success, the enterprise orchestration layer for building, deploying, and governing AI. The world’s leading companies rely on Dataiku to operationalize AI and run it as a true business performance engine delivering measurable value.

Europe

  • Build pipelines to load data from various systems into Dataiku via S3 or Snowflake.
  • Increase the robustness of existing production pipelines, identify bottlenecks, and set up robust monitoring, testing processes, and documentation templates.
  • Build custom applications and integrations that automate manual customer-operations tasks, helping Product Operations, Support, and SRE in their day-to-day activities.

Dataiku is the Platform for AI Success, the enterprise orchestration layer for building, deploying, and governing AI. The world’s leading companies rely on Dataiku to operationalize AI and run it as a true business performance engine delivering measurable value.

$100,000–$140,000/yr
US

  • Design, build, and maintain scalable data pipelines for clients across industries.
  • Architect and optimize cloud data warehouse solutions, adapting to each client's stack.
  • Collaborate with analysts and data scientists to ensure data is clean, reliable, and well-modeled.

NuView Analytics helps companies accelerate the time to insights from their data through data analytics, diligence, and fractional data science. They work with growth-stage companies looking to drive additional value from the data they are sitting on, and they value humility, intellectual rigor, and stewardship.

Latin America

  • Create and maintain optimal data pipeline architecture.
  • Assemble large, complex data sets that meet functional and non-functional business requirements.
  • Identify, design, and implement internal process improvements, automating manual processes, optimizing data delivery, and re-designing infrastructure for greater scalability.

Coderoad is a software development company providing end-to-end services, including staff augmentation, dedicated IT teams, and general software engineering. It offers the opportunity to work on exciting, real-world projects in a supportive environment.

$172,000–$254,000/yr
US · Canada

  • Collaborate with product managers, data analysts, and machine learning engineers to develop pipelines and ETL tasks.
  • Establish data architecture processes and practices that can be scheduled, automated, and replicated, and that serve as standards.
  • Manage individual Data Engineers to foster learning, growth and success at Doximity.

Doximity is transforming the healthcare industry with a mission to help every physician be more productive and provide better care for their patients. As medicine's largest network in the United States, they are committed to building diverse teams with an inclusive culture.

Global

  • Design, build, and maintain scalable data infrastructure to support analytics and reporting across the organization.
  • Develop and operate ETL pipelines to ingest, transform, and deliver large-scale datasets.
  • Partner closely with Data Analysts and cross-functional stakeholders to provide reliable datasets and guide them in using data effectively.

Truelogic is a leading provider of nearshore staff augmentation services headquartered in New York. With over two decades of experience, they deliver top-tier technology solutions to companies of all sizes. Their team of 600+ highly skilled tech professionals, based in Latin America, drives digital disruption by partnering with U.S. companies on their projects.

$104,000–$164,000/yr
US

  • Build and manage business data pipelines and transform Firefox telemetry data into structured datasets.
  • Partner with data scientists, product, and marketing teams to turn datasets into models and metrics.
  • Ensure data accuracy and performance using observability tools and resolve data issues.

Mozilla Corporation is a technology company backed by a non-profit that has shaped the internet, creating brands like Firefox. With millions of users globally, they are expanding into areas including AI and social media while remaining focused on making the internet better for people.

$180,000–$200,000/yr
US

  • Lead the architecture and evolution of scalable, distributed data pipelines, ensuring high availability and performance at scale.
  • Build and maintain distributed web scraping systems using tools such as Playwright, Selenium, and BeautifulSoup.
  • Integrate AI and LLMs into engineering workflows for code generation, automation, and optimization.

MercatorAI is building scalable data infrastructure to power high-quality, data-driven decision making at scale. As an early-stage company, the team is focused on creating robust, future-ready systems that can handle complex data ingestion, transformation, and delivery across a growing national footprint.

$179,469–$242,811/yr
US

  • Lead and grow a team of data engineers, providing mentorship and technical guidance.
  • Own execution of customer integrations across multiple product lines, ensuring on-time delivery.
  • Improve data quality and pipeline reliability by investing in better alerting and resilience.

Afresh is the leading AI company in fresh food, partnering with grocers to order billions of dollars of fresh food. They are on a mission to eliminate food waste and make fresh food accessible to all, and they have prevented 200M lbs of food waste in 2025 alone.

US · Europe

  • Become a trusted data and AI advisor to clients, helping them translate business questions into AI-ready data architectures.
  • Design and implement AI-optimized data platforms, including cloud data warehouses, ETL/ELT pipelines, and analytic layers.
  • Engineer modern ELT/ETL pipelines that handle structured, semi-structured, and unstructured data to support AI and analytics use cases.

Aimpoint Digital is a dynamic, fully remote data and analytics consultancy. They work alongside the most innovative software providers in the data engineering space to solve their clients' toughest business problems, and they believe in blending modern tools and techniques with tried-and-true principles to deliver optimal data engineering solutions.

US

  • Prepare and manage pre-stage files for backbook conversion activities.
  • Support and execute data ingestion tasks in alignment with scheduled project events, including key mock events.
  • Monitor and ensure data ingestion completion within defined SLA windows.

Kunai builds full-stack technology solutions for banks, credit and payment networks, infrastructure providers, and their customers. They help their clients modernize, capitalize on emerging trends, and evolve their business for the coming decades by remaining tech-agnostic and human-centered.

Global

  • Design, build, and maintain efficient data pipelines (ETL processes) to integrate data from various source systems into the data warehouse.
  • Develop and optimize data warehouse schemas and tables to support analytics and reporting needs.
  • Write and refine complex SQL queries and use scripting (e.g., Python) to transform and aggregate large datasets.

Deel is an all-in-one payroll and HR platform tailored for global teams. As one of the largest globally distributed companies, Deel's 7,000 team members span over 100 countries, fostering a dynamic culture of continuous learning and innovation.

$180,000–$220,000/yr
US · Unlimited PTO · 14w maternity

  • Design, build, and maintain databases that power Hologram's operations.
  • Build and maintain ETL pipelines that move and transform data reliably.
  • Audit existing pipelines and data models, identify complexity, and refactor bad decisions.

Hologram is building the future of IoT connectivity, delivering internet access to millions of connected devices worldwide. They process over 5 billion transactions per month across their global infrastructure and value a fun, upbeat, remote-first team united by their mission.

Global

  • Assist in maintaining and improving our data warehousing systems (AWS DMS, Redshift).
  • Support the team in monitoring data pipelines, identifying issues, and troubleshooting basic problems.
  • Write and optimize SQL queries for analytics and data validation.

Sezzle aims to financially empower the next generation by blending tech with interest-free installment plans. They foster an innovative and dynamic team passionate about creating a unique shopping journey.

$120,000–$160,000/yr
US

  • You will join a team of talented engineers working closely with Data Scientists to build and scale our next-generation Ad EnGage data pipeline.
  • You will work with large-scale datasets (hundreds of TBs to petabyte-scale systems) using a modern data stack centered on AWS, Airflow, dbt, and Snowflake.
  • You’ll contribute to building reliable, high-quality data pipelines and improving the performance, scalability, and observability of our data platform.

EDO is the TV outcomes company. Their leading measurement platform connects convergent TV airings to the ad-driven consumer behaviors most predictive of future sales. They are headquartered in New York City and Los Angeles, with an office in San Francisco, and recognize the benefits of hybrid working.

Brazil

  • Design and implement data ingestion and transformation pipelines using PySpark/SparkSQL on Databricks.
  • Own data pipelines end-to-end in production: freshness, correctness, availability, and SLA adherence.
  • Build and maintain Delta Lake tables following medallion architecture patterns.

Pismo, founded in 2016, provides a comprehensive processing platform for banking, card issuing, and financial market infrastructure. With over 500 employees across more than 10 countries and now part of Visa, they empower firms to build and launch financial products rapidly with high security and availability standards.

$127,000–$175,000/yr
US

  • Partner closely with business stakeholders to understand their challenges and design end-to-end architecture.
  • Design, develop, and own robust, efficient, and scalable data models in Snowflake and Iceberg using dbt and advanced SQL.
  • Build and manage reliable data pipelines and CI/CD workflows using tools like Airflow, Python, and Terraform.

Motive empowers people who run physical operations with tools to make their work safer, more productive, and more profitable. Motive serves nearly 100,000 customers and provides complete visibility and control across a wide range of industries.

US · North America

  • Enable self-service analytics for all team members by designing clean, intuitive data models and metrics through dbt, empowering employees to make informed, data-driven decisions.
  • Develop and refine custom data pipelines that ingest data from operational systems to our analytics platform, handling both streaming and batch data using third-party tooling and home-grown solutions.
  • Maintain and optimize the data platform infrastructure, focusing on data quality, ELT efficiency, and platform hygiene.

Auto Integrate makes leased vehicle maintenance frictionless for millions of customers in the USA and Canada. The business is managed by a small, global team within Fleetio, combining the resources of a scaled SaaS company with the agility of a niche market leader.