Remote Data Jobs · Spark

Job listings

$123,696–$254,667/yr

  • Design and implement robust data infrastructure in AWS, using Spark with Scala.
  • Evolve our core data pipelines to efficiently scale for our massive growth.
  • Store data in optimal engines and formats, matching your designs to our performance needs and cost factors.

tvScientific is the first and only CTV advertising platform purpose-built for performance marketers. Our solution combines media buying, optimization, measurement, and attribution in one efficient platform. Our platform is built by industry leaders with a long history in programmatic advertising, digital media, and ad verification.

$90,000–$120,000/yr
US 4w PTO

  • Design, build, and maintain scalable data pipelines using Python, Spark, and Airflow.
  • Collaborate cross-functionally with AI/ML and Product teams to implement new features.
  • Proactively identify and resolve bottlenecks in our complex ETL processes.

Sayari provides judgment infrastructure for trustworthy AI in economic security and commercial risk. They resolve primary-source records forming the ground truth of global commerce, and are headquartered in Washington, D.C., with offices in London, Singapore, Tokyo, and Tel Aviv.

  • Be the Analytics Engineering lead within the Sales and Marketing organization.
  • Be the data steward for Sales and Marketing: architect and improve the collection of underlying data.
  • Develop and maintain robust data pipelines and workflows for data ingestion, processing, and transformation.

Reddit is a community of communities, built on shared interests, passion, and trust, and is home to the most open and authentic conversations on the internet. With 100,000+ active communities and millions of daily active unique visitors, Reddit is one of the internet’s largest sources of information.

US Unlimited PTO

  • Maintain, improve, and extend an AI platform already running in production.
  • Handle a mix of backend development, data pipelines, DevOps, and infrastructure work.
  • Translate business and product requirements into technical decisions independently.

Provectus is an AI consultancy and solutions provider. We help businesses adopt AI technologies, offering development and integration services. The posting does not state company size; the culture appears flexible, autonomous, and tech-forward.

  • Owns organization-wide data architecture, defining standards, patterns, and designs that our teams will implement.
  • Reviews data-related designs and implementations across teams for architectural consistency, performance, and scalability.
  • Designs and develops data pipelines, integrations, and platform features with performance and scalability in mind.

Tenna provides a platform that revolutionizes construction equipment fleet operations. They provide innovative solutions to customers looking for competitive ways to better manage and track their assets, such as heavy and light equipment, large fleets, tools, and materials. They value quality-obsessed, gritty, continuous learners, and collaborative problem solvers.

$190,000–$280,500/yr

  • Architect and evolve scalable data ingestion and egress frameworks and pipelines that are well tested and offer strong data quality monitoring.
  • Architect and evolve our CI/CD processes, enhancing the testing environment and observability.
  • Enhance our Claude Code / LLM development support capabilities by creating tools, skills, and agents that give our LLMs more context and help us continually improve their ability to debug, create code, and maintain systems.

Life360’s mission is to keep people close to the ones they love. They offer a mobile app, tracking devices, and a pet GPS tracker that deliver peace of mind and enhance everyday family life. Life360 has more than 500 (and growing!) remote-first employees.

$75,000–$110,000/yr
US 5w PTO

  • Support the architecture, design, and development of scalable analytics and reporting solutions across enterprise data platforms.
  • Partner with business stakeholders to define analytical strategies, frame problems, and deliver insights that drive decision-making.
  • Design and implement end-to-end data pipelines and workflows using modern big data and cloud technologies.

Cotiviti provides payment accuracy and analytics-driven solutions, focusing on the healthcare and retail sectors. They are committed to fostering a diverse and inclusive environment where team members can grow and thrive.

Global Unlimited PTO

  • Champion a data-first approach across internal teams and client engagements, promoting clarity and impact.
  • Build and deploy machine learning models to prevent fraud across diverse fintech use cases, from proof-of-concept through to production.
  • Develop and track metrics to measure and monitor the performance of our risk products and the effectiveness of risk management strategies.

Sardine is a leader in fraud prevention and AML compliance. Their platform uses device intelligence, behavior biometrics, machine learning, and AI to stop fraud before it happens. They have raised $145M from world-class investors and maintain a remote-first work culture.

$70,560–$81,120/yr

  • Enable efficient data access by creating and maintaining data pipelines.
  • Collaborate with ML engineers to design and maintain automation for machine learning training, quality assessment, and the model release process.
  • Build data infrastructure on top of large volumes of data to support analytics, hypothesis testing, and company metrics.

Eneba is building an open, safe, and sustainable marketplace for gamers. Their marketplace supports about 20 million active users, with a focus on trust and safety.

  • Design, develop, and maintain data pipelines using Azure Databricks.
  • Build and optimize data transformations using PySpark and SQL in Databricks.
  • Implement and maintain Lakehouse architectures using Delta Lake.

Miratech helps visionaries change the world with enterprise and start-up innovation, supporting digital transformation for some of the world's largest enterprises. They are a values-driven organization with nearly 1,000 full-time professionals and an annual growth rate exceeding 25%.