Source Job

Canada

  • Architect Spark-driven workflows at scale and design data platforms as products for internal teams.
  • Develop and maintain end-to-end data pipelines and backend ingestion workflows across multiple sources.
  • Champion Samsara's cultural principles and mentor junior team members to drive data-driven decisions.

Spark, Python, SQL, AWS, Data Pipelines

20 jobs similar to Senior Data Engineer

Jobs ranked by similarity.

Canada

  • Design, build, and operate high-scale data ingestion and replication systems from production data stores into the data lakehouse.
  • Build and maintain reliable, scalable data platform infrastructure capable of handling petabytes of data across analytics, AI, and operational use cases.
  • Develop internal libraries, APIs, frameworks, and tooling in languages such as Go and Python to help teams move and access data safely.

Samsara is the pioneer of the Connected Operations Cloud, enabling organizations that depend on physical operations to harness IoT data for actionable insights. As a publicly traded company, Samsara fosters a growth-oriented culture and serves industries that represent over 40% of global GDP.

US Unlimited PTO

  • Lead and manage a global data engineering team building large-scale data pipelines and production datasets for the Public Investor business.
  • Collaborate with product, research, and operations teams to translate roadmap priorities into scalable technical plans and customer-facing data feeds.
  • Drive operational excellence through data quality frameworks, observability, and AI-assisted development practices.

YipitData is the leading market research and analytics firm for the disruptive economy, providing actionable insights from alternative data. With over $475M raised and offices globally, it has a people-centric culture recognized as a Best Workplace for three consecutive years.

Europe

  • Design and deliver end-to-end data platforms for analytics, BI, machine learning and AI-ready data products.
  • Build and optimise scalable ETL/ELT pipelines with Databricks, Spark/PySpark, SQL and Python.
  • Apply data quality, governance and security standards across the platform and mentor engineers.

Tieto Tech Consulting provides design-led, data-centric, and AI-powered digital engineering & consulting services to enterprises worldwide. They focus on diversity, equity, and inclusion, fostering an inspiring workplace with a global team.

Europe

  • Design, build, and maintain scalable data lake solutions and processing pipelines handling large volumes of data.
  • Develop distributed data processing applications using Apache Spark on Databricks and build real-time streaming pipelines with Apache Kafka.
  • Apply software engineering best practices to data pipelines including CI/CD, automated testing, and peer code review.

InPost is an e-commerce parcel delivery company that operates a network of Automated Parcel Machines (APMs) and pick-up points across nine European countries. Founded in 1999, the company employs thousands and fosters a diverse, international, and cross-functional culture with opportunities for growth and training.

Global Unlimited PTO

  • Architect and maintain cloud-native data platforms (AWS, Snowflake, Databricks) supporting batch and streaming use cases.
  • Design and automate ETL/ELT workflows, optimize data models, and enable self-serve analytics and AI.
  • Manage end-to-end data lifecycles including ingestion, storage, processing, and delivery of structured and unstructured data.

Trustonic makes smartphones affordable for the many, enabling global access to devices and digital finance through secure smartphone locking technology. They partner with mobile carriers, retailers, and financiers across 30+ countries, and pride themselves on a diverse, inclusive culture that values doing the right thing for each other, the community, and the planet.

$123,696–$254,667/yr
US

  • Design and implement robust data infrastructure in AWS, using Spark with Scala.
  • Evolve our core data pipelines to efficiently scale for our massive growth.
  • Store data in optimal engines and formats, matching your designs to our performance needs and cost factors.

tvScientific is the first and only CTV advertising platform purpose-built for performance marketers. Our solution combines media buying, optimization, measurement, and attribution in one efficient platform. Our platform is built by industry leaders with a long history in programmatic advertising, digital media, and ad verification.

Europe

  • Build and scale data infrastructure powering targeting, identity, and measurement capabilities.
  • Optimize core ETL/ELT pipelines and ensure operational reliability with documented SLAs.
  • Implement privacy-compliant data methodologies meeting GDPR/CCPA standards.

Kargo creates powerful moments of connection between brands and consumers to build businesses. With 600+ employees and offices across the US, UK, Australia, and Ireland, they take a creative science approach to deliver unique ad experiences across premium platforms.

Canada

  • Work with large data sets and implement sophisticated data pipelines with both structured and semi-structured data.
  • Collaborate with stakeholders to design scalable solutions and manage internal data pipelines.
  • Define data governance policies and leverage AI tools to streamline data pipeline development.

For over four decades, PAR Technology Corporation has been a leader in restaurant technology, empowering brands worldwide to create lasting connections with their guests. With over 100,000 restaurants in more than 110 countries, we embrace a 'Better Together' ethos and offer comprehensive software and hardware solutions.

US Canada

  • Build, maintain, and scale data pipelines integrating internal and external data into the warehouse.
  • Partner with internal stakeholders and engineering teams to understand analysis needs and improve data logging.
  • Participate in architectural decisions and evangelize data engineering best practices.

OXIO is the world’s first telecom-as-a-service platform, democratizing telecom for brands and enterprises to own proprietary mobile networks. The company is a rapidly growing startup with a diverse and inclusive team.

  • Design, build, and maintain scalable data pipelines using AWS Glue (PySpark) or equivalent orchestration and transformation tools.
  • Engineer and optimise the ClickHouse warehouse for sub-second query performance across all back-offices.
  • Implement data contracts between back-office and the platform.

Block Labs is a premier technology studio operating at the bleeding edge of Web3, Artificial Intelligence, and iGaming. We are a collective of senior engineers, product strategists, and builders who refuse to compromise on architecture.

US

  • Lead the design and evolution of the data platform architecture, establishing patterns and standards the team builds on.
  • Build and operate production-grade data pipelines that ingest and transform high-variance, real-world clinical data reliably and at scale.
  • Contribute to quarterly data product releases, working closely with product, clinical, and customer success teams to meet commitments.

Verantos is the market leader in high-accuracy real-world evidence (RWE) generation. The Verantos RWE platform integrates heterogeneous real-world data sources and generates evidence with the accuracy necessary for regulatory and reimbursement use, serving some of the largest biopharma companies globally.

US

  • Design and build scalable cloud data pipelines for high-volume manufacturing and IoT data using Spark, Kafka, Airflow, and Delta Lake.
  • Implement medallion/lakehouse architectures on Databricks, Snowflake, AWS, or Azure with strong SQL and Python proficiency.
  • Apply manufacturing domain expertise in MES, SCADA, ERP, and industrial protocols to bridge OT/IT systems for real-time data extraction.

We are a Digital Product Engineering company that builds products, services, and experiences that inspire, excite, and delight. We have 17,000+ experts across 39 countries and our culture is dynamic and non-hierarchical.

Global

  • Lead data architecture, pipeline development, and data integrations on a generative AI platform to automate enterprise workflows.
  • Design and implement multi-zone enterprise data lakes on AWS S3 with batch and streaming ingestion pipelines.
  • Develop and deploy ML models on AWS SageMaker for use cases like lead scoring and predictive maintenance.

Capnexus is a comprehensive services provider specializing in designing, building, and supporting retail software. The company follows a build-as-a-service model with a culture built on outcomes and delivery, employing outstanding professionals across various platforms and verticals.

Canada

  • Architect and lead the implementation of an enterprise lakehouse on Databricks across major clouds.
  • Design scalable batch and streaming data pipelines using PySpark, Spark SQL, and Delta Live Tables.
  • Define and enforce platform standards for data modeling, CI/CD, governance, and cost optimization.

Bounteous is a premier end-to-end digital transformation consultancy partnering with ambitious brands to create digital solutions. With over 4,000 expert team members across the Americas, APAC, and EMEA, we deliver innovative solutions in Strategy, Analytics, Digital Engineering, Cloud, Data & AI, Experience Design, and Marketing.

US

  • Design, build, and optimize large-scale data and analytics platforms on the Databricks Lakehouse.
  • Architect and maintain scalable ETL/ELT pipelines using PySpark, Spark SQL, and Delta Lake.
  • Implement medallion data architectures, enforce data quality, and manage Unity Catalog for governance.

Bounteous is a premier end-to-end digital transformation consultancy that partners with ambitious brands to create digital solutions. With over 4,000 expert team members across the Americas, APAC, and EMEA, they deliver innovative strategies and technical expertise.

US Unlimited PTO

  • Build and operate production-grade ingestion pipelines from clinical, operational, and third-party systems into a Databricks lakehouse.
  • Develop and maintain dbt models to transform raw data into clean, documented, analytics-ready datasets.
  • Establish data quality, testing, and monitoring practices to ensure pipeline reliability and performance.

Zócalo Health is a tech-enabled, community-oriented primary care organization serving underserved populations with culturally competent care. Founded in 2021, the company is backed by leading healthcare investors and is scaling rapidly with a focus on value-based care.

Global

  • Design and implement modern data platforms and scalable data pipelines to enable better data-driven decisions.
  • Develop and maintain ETL/ELT pipelines using SQL, Spark/PySpark, and Microsoft Fabric or Databricks.
  • Work closely with data architects, BI developers, and customer stakeholders in an Agile environment.

Tieto, through MentorMate, creates durable technical solutions that deliver digital transformation at scale by blending strategic insights and thoughtful design with brilliant engineering. The company provides its people with the opportunity to work on impactful, global projects for recognizable brands.

Global

  • Architect and implement scalable ETL and data pipelines for real-time risk management and advanced analytics.
  • Design, develop, and optimize distributed data storage solutions for high performance and reliability at scale.
  • Drive schema evolution, data modeling, and pipeline orchestration with ownership of end-to-end data flow.

Oscilar builds the most advanced AI Risk Decisioning™ Platform for banks, fintechs, and digitally native organizations to manage fraud, credit, and compliance risk. The company is mission-driven with a remote-first culture and team members from Meta, Uber, Citi, and Confluent.

India 5w PTO 26w maternity 2w paternity

  • Design, build, and launch sophisticated data models and visualizations supporting multiple products.
  • Optimize pipelines, frameworks, and systems for easier development of data artifacts.
  • Collaborate with cross-functional teams and embody core values such as ownership and customer focus.

Outreach provides the only complete agentic AI platform for revenue teams. The company is used by world-leading enterprises like Databricks, SAP, Siemens, and Verizon and promotes a culture of diversity and inclusion.

US East Coast 4w PTO

  • Own day-to-day administration, configuration, and health of Oura's global Databricks environment.
  • Contribute to data pipeline development and Spark workload optimization across cross-functional growth areas.
  • Manage workspace governance including access controls, cluster policies, cost monitoring, and security configurations.

Oura empowers people to own their inner potential through award-winning products that help gain deeper knowledge of readiness, activity, and sleep quality. They are a quickly growing company focused on helping people live healthier and happier lives, ensuring team members have what they need to do their best work.