Source Job

US Unlimited PTO

  • Design and improve graph-based approaches for identity resolution across devices, households, and identifiers to improve match quality, coverage, and stability.
  • Use Scala, Spark, SQL, and cloud-native tools to analyze large identity datasets, build models, and productionize data science workflows.
  • Leverage LLMs, AI editors, and agentic workflows to accelerate research, prototyping, documentation, testing, and iteration.

Scala Spark SQL Python Machine Learning

20 jobs similar to Senior Data Scientist

Jobs ranked by similarity.

$123,696–$254,667/yr
US

  • Design and implement robust data infrastructure in AWS, using Spark with Scala.
  • Evolve our core data pipelines to efficiently scale for our massive growth.
  • Store data in optimal engines and formats, matching your designs to our performance needs and cost factors.

tvScientific is the first and only CTV advertising platform purpose-built for performance marketers. Our solution combines media buying, optimization, measurement, and attribution in one efficient platform. Our platform is built by industry leaders with a long history in programmatic advertising, digital media, and ad verification.

Global

  • Design and deliver scalable, low-latency streaming data solutions for real-time customer analytics.
  • Analyze business needs, optimize data models, and write clean code using Scala, Python, and SQL.
  • Mentor team members and optimize performance of data platforms like AWS Kinesis, Kafka, and Redshift.

Aircall is an AI-powered customer communications platform used by 22,000+ companies worldwide, unifying voice, SMS, WhatsApp, and AI. The company is a unicorn backed by world-class investors, with 45+ nationalities and a strong, collaborative culture.

Canada

  • Be the Analytics Engineering lead within the Sales and Marketing organization.
  • Be the data steward for Sales and Marketing: architect and improve the collection of underlying data.
  • Develop and maintain robust data pipelines and workflows for data ingestion, processing, and transformation.

Reddit is a community of communities, built on shared interests, passion, and trust, and is home to the most open and authentic conversations on the internet. With 100,000+ active communities and millions of daily active unique visitors, Reddit is one of the internet’s largest sources of information.

US EMEA

  • Design, build, and maintain distributed data pipelines that power Spotify Wrapped data stories and personalized experiences for more than 300M users globally.
  • Partner with Data Scientists to evaluate and operationalize new Wrapped story concepts, balancing personalization, scalability, and eligibility requirements.
  • Build scalable systems that process large-scale listening data and generate insights that celebrate users’ unique listening journeys.

The Personalization team makes deciding what to play next easier and more enjoyable for every listener. They are behind some of Spotify’s most-loved features. Join them and you’ll keep millions of users listening by making great recommendations to each and every one of them.

USA Unlimited PTO

  • Design, build, and scale systems for customer-facing data products and machine learning workloads.
  • Collaborate with product, engineering, and cross-functional partners on complex data challenges.
  • Implement data quality, observability, and governance frameworks for reliable data at scale.

Boulevard provides the first and only client experience platform for appointment-based self-care businesses. The company has an insatiable curiosity and embraces experimentation, with a diverse team that values open communication and equal opportunity.

Global 6w PTO

  • Develop services in Python, including integrations with marketing partners and data collection from various sources.
  • Create and support processes on Airflow.
  • Support the migration of marketing data pipelines and DWH components from MS SQL to Google Cloud Platform (including BigQuery), contributing to architecture decisions and best practices.

Social Discovery Group (SDG) is one of the world's largest groups of social discovery companies, uniting millions of users on dozens of products. Our international team of 1000+ professionals and digital nomads works all over the world and we are proud to be a two-time “Great Place to Work” winner.

US Unlimited PTO

  • Lead and manage a global data engineering team building large-scale data pipelines and production datasets for the Public Investor business.
  • Collaborate with product, research, and operations teams to translate roadmap priorities into scalable technical plans and customer-facing data feeds.
  • Drive operational excellence through data quality frameworks, observability, and AI-assisted development practices.

YipitData is the leading market research and analytics firm for the disruptive economy, providing actionable insights from alternative data. With over $475M raised and offices globally, it has a people-centric culture recognized as a Best Workplace for three consecutive years.

Unlimited PTO

  • Partner with Product and Engineering to identify high-impact opportunities and define success metrics.
  • Lead rigorous experimentation across teams with hypothesis design, A/B testing, and clear readouts.
  • Build and iterate on ML/AI capabilities that ship to production, optimizing for value, latency, and cost.

Traackr is a global SaaS company providing a data-driven influencer marketing platform for marketers to optimize investments and scale programs. They are a remote-first company with offices in San Francisco, New York, Boston, Paris, and London, operating on a culture of mutual respect with core values including trust, diversity, value, ownership, and mutual success.

Canada US UK

  • Build and maintain end-to-end customer data infrastructure including event collection and pipelines.
  • Define instrumentation standards and enforce schema governance and data quality.
  • Lead identity resolution and attribution modeling across fragmented customer journeys.

The company is a fast-growing SaaS provider focused on customer data infrastructure. It offers a remote-first culture emphasizing autonomy, ownership, and trust, with a diverse and inclusive team.

Global Unlimited PTO

  • Partner with cross-functional teams to define high-impact product questions, success metrics, and decision frameworks.
  • Analyze user behavior, creator activity, and content ecosystems to identify opportunities and risks.
  • Design and analyze experiments, including A/B tests and causal measurement, to evaluate product launches and strategic initiatives.

VRChat offers a first-of-its-kind, game-changing platform that provides an endless collection of social VR experiences and gives the power of creation to its robust community. The company has raised around $100M to date and its team includes people from Netflix, Twitter, Meta, Microsoft, Roblox, Google, Amazon, Unity, Spotify, Discord, Uber, eBay, Robinhood, Twitch, Zynga and TikTok.

US Canada

  • Build, maintain, and scale data pipelines integrating internal and external data into the warehouse.
  • Partner with internal stakeholders and engineering teams to understand analysis needs and improve data logging.
  • Participate in architectural decisions and evangelize data engineering best practices.

OXIO is the world’s first telecom-as-a-service platform, democratizing telecom for brands and enterprises to own proprietary mobile networks. The company is a rapidly growing startup with a diverse and inclusive team.

$90,000–$120,000/yr
US 4w PTO

  • Design, build, and maintain scalable data pipelines using Python, Spark, and Airflow.
  • Collaborate cross-functionally with AI/ML and Product teams to implement new features.
  • Proactively identify and resolve bottlenecks in our complex ETL processes.

Sayari provides judgment infrastructure for trustworthy AI in economic security and commercial risk. They resolve primary-source records forming the ground truth of global commerce, and are headquartered in Washington, D.C., with offices in London, Singapore, Tokyo, and Tel Aviv.

Global

  • Architect and scale data systems for analytics, ML/AI products, reporting, and APIs.
  • Own the full data lifecycle including ingestion, transformation, modeling, validation, and serving.
  • Partner with Data Science to productionize models and build reliable data foundations for AI-driven products.

Vidmob is the creative data company that provides scoring software and analytics to help marketers and agencies drive business results through improved creative effectiveness. They partner with top marketers and agencies worldwide and operate the industry's most robustly instrumented human-reinforcement learning model for creativity.

Global Unlimited PTO

  • Architect and maintain cloud-native data platforms (AWS, Snowflake, Databricks) supporting batch and streaming use cases.
  • Design and automate ETL/ELT workflows, optimize data models, and enable self-serve analytics and AI.
  • Manage end-to-end data lifecycles including ingestion, storage, processing, and delivery of structured and unstructured data.

Trustonic makes smartphones affordable for the many, enabling global access to devices and digital finance through secure smartphone locking technology. They partner with mobile carriers, retailers, and financiers across 30+ countries, and pride themselves on a diverse, inclusive culture that values doing the right thing for each other, the community, and the planet.

United States

  • Build and improve scalable, fault-tolerant, self-serve data infrastructure technologies to support ML and analytics workflows.
  • Own the Data Movement Platform for batch and stream data processing, and invest in building new infrastructure for Spark, Flink, and Airflow.
  • Collaborate with teammates on on-call responsibilities and monitoring/alerting to improve reliability, scalability, latency, and efficiency.

Reddit is a community of communities built on shared interests, passion, and trust, hosting the most open and authentic conversations on the internet. With over 100,000 active communities and approximately 126 million daily active unique visitors, Reddit is one of the internet's largest sources of information.

United States

  • Lead the design and evolution of scalable financial data systems supporting commissions, incentives, and payments.
  • Build and maintain robust data pipelines using Python, SQL, Spark, and Terraform for accuracy and performance.
  • Define technical strategy and roadmap for financial data operations in collaboration with stakeholders.

Our partner is a fast-growing technology company building financial data infrastructure for insurance operations. They have a remote-friendly work environment and emphasize engineering excellence and cross-functional collaboration.

US

  • Play a crucial role in helping client organizations transform raw data into reliable, well-modeled assets that drive business decisions.
  • Design, build, and maintain scalable data pipelines and ELT workflows, with Databricks as the primary platform.
  • Collaborate with data engineers, analysts, and clients on end-to-end data requirements and project delivery.

Velir is an established mid-sized agency with a top-tier portfolio of clients, ranging from the world’s largest non-profits to Fortune 500 brands. Our culture is built on a foundation of trust, collaboration, and continued improvement, and we are a remote-first company that offers competitive pay and excellent benefits.

UK

  • Own end-to-end analytical problems from framing through modeling and stakeholder rollout.
  • Build predictive and inferential models to improve product and operational decisions.
  • Partner with Product, Engineering, and Operations to define metrics and drive insights.

Focal Systems is the industry leader in retail AI solutions, using deep learning computer vision to automate and optimize brick and mortar retail. They are a tight-knit team with an ambitious mission, deployed at scale with top retailers worldwide.

  • Design, build, and maintain scalable data pipelines using AWS Glue (PySpark) or equivalent orchestration and transformation tools.
  • Engineer and optimise the ClickHouse warehouse for sub-second query performance across all back-offices.
  • Implement data contracts between back-office and the platform.

Block Labs is a premier technology studio operating at the bleeding edge of Web3, Artificial Intelligence, and iGaming. We are a collective of senior engineers, product strategists, and builders who refuse to compromise on architecture.

US

  • Develop machine learning and AI solutions for forecasting, anomaly detection, and operational intelligence.
  • Design scalable enterprise data architectures and build ETL/ELT pipelines for analytics and AI workloads.
  • Partner with stakeholders to define metrics, lead technical reviews, and mentor team members.

DataDirect Networks (DDN) is a global market leader in AI and high-performance data storage innovation, powering many of the world's most demanding AI data centers across industries like life sciences, healthcare, financial services, and research. With a proven track record of performance and scalability, DDN fosters a culture of innovation, customer-centricity, and passionate professionals committed to excellence.