Design, build, and maintain distributed data pipelines that power Spotify Wrapped data stories and personalized experiences for more than 300M users globally.
Partner with Data Scientists to evaluate and operationalize new Wrapped story concepts, balancing personalization, scalability, and eligibility requirements.
Build scalable systems that process large-scale listening data and generate insights that celebrate users’ unique listening journeys.
Design, build, and ship agentic systems that ground personalized listening experiences in cultural context and world knowledge.
Develop and maintain pipelines for extracting, structuring, and serving cultural signals at scale, leveraging LLMs and agentic workflows.
Partner closely with teams across Personalization to integrate foundational cultural data and tech into new agentic listening experiences.
Spotify's Personalization team enhances the listening experience by creating features like Blend and Discover Weekly, using Generative AI to personalize music and podcast recommendations. The AI Foundation team, with around 100 AI/ML Engineers, Applied Research Scientists, Product Managers, and domain experts, builds the foundational data and tech for these personalized experiences.
Design, build, and maintain scalable data pipelines using AWS Glue (PySpark) or equivalent orchestration and transformation tools.
Engineer and optimise the ClickHouse warehouse for sub-second query performance across all back-offices.
Implement data contracts between the back-offices and the platform.
Block Labs is a premier technology studio operating at the bleeding edge of Web3, Artificial Intelligence, and iGaming. We are a collective of senior engineers, product strategists, and builders who refuse to compromise on architecture.
Design and implement robust data infrastructure in AWS, using Spark with Scala.
Evolve our core data pipelines to efficiently scale for our massive growth.
Store data in optimal engines and formats, matching your designs to our performance needs and cost factors.
tvScientific is the first and only CTV advertising platform purpose-built for performance marketers. Our solution combines media buying, optimization, measurement, and attribution in one efficient platform, built by industry leaders with a long history in programmatic advertising, digital media, and ad verification.
Design and build an integrated data platform, unifying existing tools and pipelines into a cohesive, scalable architecture.
Own data pipelines and SLAs end to end, ensuring reliable data movement between systems with clear expectations.
Shape the data strategy and platform roadmap, researching new technologies and introducing tools as the platform evolves.
Wrapbook is a vertical fintech platform for the entertainment industry that enables companies to seamlessly onboard, pay, and insure their workforces. They are at an exciting stage of growth, having raised over 30M from investors like Andreessen Horowitz.
Design, build, and maintain scalable data pipelines
Develop and optimize ETL processes to support data products
Work with structured and unstructured data across SQL and NoSQL systems
They are seeking a Data Engineer to support the development of data products that power critical business functions. They offer a collaborative, cross-functional Agile environment where you'll partner closely with technical and business teams to deliver high-quality data solutions.
Craft fault-tolerant data pipelines and distributed systems to support millions of students.
Communicate effectively and collaborate with distributed engineering, product, and design teams.
Ensure that timely, accurate data and metrics are delivered consistently.
Renaissance is a global leader in pre-K–12 education technology. Their solutions help educators analyze, customize, and plan personalized learning paths for students. They are used in over one-third of US schools and in more than 100 countries worldwide.
Enable efficient data access by creating and maintaining data pipelines.
Collaborate with ML engineers to design and maintain automation for machine learning training, quality assessment, and model release processes.
Build data infrastructure that turns vast amounts of data into inputs for analytics, hypothesis testing, and company metrics.
Eneba is building an open, safe, and sustainable marketplace for gamers. Their marketplace supports roughly 20 million active users and prioritizes trust and safety.
Design, build, and maintain scalable data pipelines using Python, Spark, and Airflow.
Collaborate cross-functionally with AI/ML and Product teams to implement new features.
Proactively identify and resolve bottlenecks in our complex ETL processes.
Sayari provides judgment infrastructure for trustworthy AI in economic security and commercial risk. They resolve primary-source records forming the ground truth of global commerce, and are headquartered in Washington, D.C., with offices in London, Singapore, Tokyo, and Tel Aviv.
Be the Analytics Engineering lead within the Sales and Marketing organization.
Be the data steward for Sales and Marketing: architect and improve the collection of underlying data.
Develop and maintain robust data pipelines and workflows for data ingestion, processing, and transformation.
Reddit is a community of communities, built on shared interests, passion, and trust, and is home to the most open and authentic conversations on the internet. With 100,000+ active communities and millions of daily active unique visitors, Reddit is one of the internet’s largest sources of information.
Design and implement batch and real-time ingestion pipelines from internal and external sources.
Implement automated data quality checks, observability, and SLA monitoring.
Optimise datasets and pipelines for analytics, ML training, and API consumption.
Software Mind develops solutions that make an impact for companies around the globe. They build cross-functional engineering teams that take ownership and crave more, always on the lookout for talented people who bring passion and creativity to every project.
Lead architecture and hands-on development of distributed systems supporting healthcare data workflows.
Design and implement scalable data pipelines for large-scale datasets.
Partner with Product and Data teams to translate healthcare requirements into scalable architectures.
Zeta Global is an AI-Powered Marketing Cloud that leverages advanced artificial intelligence (AI) and consumer signals, helping marketers acquire, grow, and retain customers efficiently. Founded in 2007, Zeta is headquartered in New York City with offices around the world, fostering a culture of trust and belonging.
Architect and evolve scalable data ingestion and egress frameworks and pipelines that are well tested and offer strong data quality monitoring.
Architect and evolve our CI/CD processes, enhancing the testing environment and observability.
Enhance our Claude Code / LLM development support capabilities by creating tools, skills, and agents that give our LLMs more context and help us continually improve their ability to debug, write code, and maintain systems.
Life360’s mission is to keep people close to the ones they love. They offer a mobile app, tracking devices, and a pet GPS tracker. Life360 has more than 500 (and growing!) remote-first employees, and its products deliver peace of mind and enhance everyday family life.
Build infrastructure and data automation pipelines to ingest, process, and load data from various sources.
Collaborate with stakeholders and data science teams to develop data products aligned with organizational goals.
Develop data analysis tools to provide insights and capture key metrics.
Columbia General is seeking a Senior Data Engineer to help transform data into actionable insights that drive decision-making. The company fosters a dynamic, collaborative environment that supports growth and innovation.
Work with large-scale datasets and production environments.
Build data pipelines, experimentation frameworks, and analytical solutions.
Use cloud environments and modern data infrastructure.
RYZ Labs is a startup studio founded in 2021 by two lifelong entrepreneurs. Their teams are remote and distributed throughout the US and LATAM, using cutting-edge cloud computing technologies to create scalable and resilient applications.
Build, maintain, and run CI/CD pipelines and infrastructure-as-code for the Smile Digital Health platform.
Provision, configure, and operate cloud-based Spark clusters and distributed data processing environments.
Design and maintain scalable, secure infrastructure templates and deployment automation across cloud environments.
Smile Digital Health makes it easy for healthcare stakeholders to collect and exchange data with our leading FHIR-based data liberation platform. At its heart, the Smile platform enables people and organizations to better manage healthcare data; the company was #19 on Deloitte's Technology Fast 50 Ranking for 2024!
Contribute to the design and implementation of scalable data solutions.
Build and optimize batch and streaming ingestion pipelines.
Ensure data quality, reliability, and performance across pipelines and datasets.
Blend is an AI services provider that co-creates impact for clients through data science, AI, technology, and people. They aim to fuel bold visions by aligning human expertise with artificial intelligence, fostering innovation, and unlocking value for their clients.
Build streaming and batch pipelines that ingest, normalise, and distribute market, trading, and portfolio data.
Build self-serve tooling so other teams can publish, consume, and build on data products without waiting.
Own data contracts and schema evolution; keep schema changes from turning into multi-team coordination events.
Keyrock is a change-maker in the digital asset space, renowned for its partnerships and innovation. They have over 250 team members around the world with diverse backgrounds and hubs in London, Brussels, and Singapore, hosting regular online and offline hangouts.
Query and process large datasets using Trino (SQL).
Work with data in an AWS environment using PySpark.
Build audience segments based on website activity, call data, behavioral patterns and segment rules.
Kyivstar is one of the largest and most beloved telecom companies in Ukraine. They offer opportunities to work with large-scale real-world data in a friendly and collaborative team environment, with possibilities for professional development and career growth.
Own and maintain data pipeline architectures, ensuring reliability and monitoring.
Manage and evolve data modeling environments for analysts and engineers.
Implement observability for data systems, detecting issues early and continuously monitoring data quality.
Voltus unlocks the full value of distributed energy resources for customers and the grid. They are a fast-growing climate-tech company with a bright, gritty, and good team that values innovation, impact, and integrity.
Design, develop, and maintain robust and scalable data pipelines using Apache Spark and cloud-native data services.
Build, optimize, and support ETL/ELT workflows to enable analytics, reporting, and downstream applications.
Implement and manage data solutions using Databricks, Delta Lake, and Unity Catalog.
Onebridge, a Marlabs Company, is a global AI and Data Analytics Consulting Firm that empowers organizations worldwide to drive better outcomes through data and technology. Since 2005, they have partnered with some of the largest healthcare, life sciences, financial services, and government entities across the globe.