Remote Data Jobs · Spark

Job listings

$150,000–$210,000/yr
Unlimited PTO

As Vanna continues to grow, we’re looking for a Senior Data Engineer to build and mature the data platform capabilities that are foundational to Vanna’s success across several key domains. You’ll work closely with our engineering, product, operations, and clinical teams to develop the technology that enables our community-based teams to provide compassionate care to our members living with severe mental illness.

Mactores is seeking a highly skilled and innovative Spark Engineer to design, develop, optimize, and operationalize high-performance data pipelines and applications using Apache Spark. This role requires hands-on expertise in distributed data processing, ETL engineering, performance tuning, and cluster management.

We are looking for a Senior Data Engineer with a passion for using data to discover and solve real-world problems. You will enjoy working with rich data sets and modern business intelligence technology, and seeing your insights drive features for our customers. This role will contribute to the development of policies, processes, and tools that address product quality challenges, in collaboration with teams.

$190,000–$210,000/yr
Unlimited PTO

Build infrastructure for ingesting, transforming, and loading an exponentially increasing volume of data. The role includes building an organic entity resolution framework and developing CI/CD pipelines and anomaly detection systems. You will be dreaming up solutions to largely undefined data engineering and data science problems.

$160,000–$200,000/yr
Unlimited PTO

Play a crucial part in accelerating efforts to build standalone data products that enable data teams and independent developers to create innovative solutions at massive scale. Continuously improve existing datasets and pursue new ones. Use and develop web crawling technologies to capture and catalog data on the internet. Support and improve our web crawling infrastructure.

Join the pricing and underwriting domain to bridge the gap between machine learning/data science and engineering. You will help build, publish, and maintain our complex data products and pipelines, key elements that have a significant impact on the company’s growth. Shape the architecture of data products designed for data analytics and data science.

This role focuses on leading the design, development, optimization, and governance of enterprise-scale data platforms and pipelines on the Microsoft Azure cloud, and it requires a deep blend of technical expertise, architectural insight, and leadership skills. You will architect and implement scalable, secure, and cost-efficient cloud-native data solutions, primarily utilizing Azure Synapse Analytics.

Design, build, and maintain ETL/ELT pipelines in dbt and Scio that monitor and measure user exposure, engagement, and consumption across Spotify’s ecosystem. Collaborate with data experts from the Telemetry Squad, alongside data scientists and analytics specialists from the Epic Squad, to streamline data logging, user behavior instrumentation (UBI), and consumption reporting. Improve Spotify’s data architecture by crafting user journey timelines and standardized schemas.

Develop and maintain scalable data pipelines using Scala and Apache Spark, focusing on performance and reuse. Structure and evolve the medallion architecture (bronze, silver, gold), ensuring governance, traceability, and data quality. Implement solutions for data ingestion, transformation, and delivery in cloud and lakehouse environments. Define and apply data engineering best practices, including versioning, automated testing, and CI/CD.