Source Job

US

  • Design and maintain data pipelines and auto-labeling systems to support ML model training from multimodal data.
  • Write and optimize SQL queries for data extraction, analysis, and ingestion from various sources.
  • Develop and prototype learning-based models using a data-centric approach with techniques like active learning and fine-tuning.

SQL Python TensorFlow PyTorch AWS

20 jobs similar to Data Scientist / Machine Learning Engineer

Jobs ranked by similarity.

Global

  • Lead data architecture, pipeline development, and data integrations on a generative AI platform to automate enterprise workflows.
  • Design and implement multi-zone enterprise data lakes on AWS S3 with batch and streaming ingestion pipelines.
  • Develop and deploy ML models on AWS SageMaker for use cases like lead scoring and predictive maintenance.

Capnexus is a comprehensive services provider specializing in designing, building, and supporting retail software. The company follows a build-as-a-service model with a culture built on outcomes and delivery, employing outstanding professionals across various platforms and verticals.

$120,000–$160,000/yr
US

  • Design, develop, and deploy AI/ML models to automate and improve internal workflows.
  • Build and maintain ML pipelines within an AWS cloud environment.
  • Integrate ML capabilities into existing Java and React application workflows.

Oddball aims to improve daily lives by delivering quality software to the federal space. With a team of experienced engineering, product, and UX professionals, we value learning, growth, and making a big impact in a rapidly growing company.

$145,000–$200,000/yr
US Unlimited PTO

  • Design and build ETL processes in collaboration with software and model development teams.
  • Create and maintain scalable data infrastructure.
  • Own full pipeline and infrastructure lifecycle including performance monitoring and optimization.

OpenTeams builds AI that empowers, with models that are energy-efficient, cost-effective, and fully yours. They are proponents of open source, reinvesting 3% of profits back into the open-source community and value freedom, teamwork, accountability, and uncompromising quality.

$140,000–$175,000/yr
US

  • Deploy new data pipelines.
  • Design & build data observability platforms and metrics.
  • Build metadata-driven pipeline solutions.

Fuze Health puts patients first and tirelessly addresses the most pressing needs in healthcare. They empower millions to digitally connect with care providers, essential health resources and needed treatments. The company is built upon the strategic combination of several proven, technology-powered innovators in the digital health, diagnostics, and pharmacy sectors.

$86,400–$138,600/yr
US

  • Design, develop, and maintain scalable data pipelines and infrastructure.
  • Build and optimize data warehouses, databases, and data models.
  • Implement and maintain data governance and security practices.

Jobgether is a company that uses an AI-powered matching process to ensure applications are reviewed quickly, objectively, and fairly. They connect candidates with companies; their culture is collaborative and inclusive, focused on innovation and growth.

$65,705–$87,606/yr
Canada

  • Design, build, and maintain scalable data infrastructure using modern cloud technologies.
  • Develop robust batch and streaming data pipelines to ingest, process, and serve data.
  • Contribute to the implementation of a modern data lakehouse architecture.

Jobgether uses an AI-powered matching process to ensure applications are reviewed quickly, objectively, and fairly. The system identifies the top-fitting candidates and shares this shortlist with the hiring company.

US

  • Lead workspace architecture, Unity Catalog governance, and cluster policy design for client tenant organizations.
  • Perform tenant discovery, requirements gathering, source profiling, and security classification for new data intake requests.
  • Develop end-to-end technical designs for tenant onboarding, including Data Sharing Agreements and SLA documentation.

M9 Solutions provides IT services and solutions to the Federal Government, mobilizing skilled people and technologies for improved performance and sustainable change. With 15+ years of proven delivery and growth, the company has been recognized as an Inc. 5000 Fastest-Growing Private Company multiple times and values diverse perspectives.

US

  • Design, build, and operate data pipelines for analytics and AI/ML capabilities.
  • Architect ingestion, transformation, and storage pipelines across diverse data sources.
  • Implement data models suitable for analytics and BI consumption.

Jobgether uses an AI-powered matching process to ensure applications are reviewed quickly, objectively, and fairly. They identify the top-fitting candidates and share the shortlist directly with the hiring company.

Global

  • Design and build end-to-end data pipelines across the RAW, Silver, and Gold layers of the Medallion Architecture.
  • Architect data ingestion, transformation, standardization, and serving processes that structure data flows from diverse and heterogeneous sources into a coherent analytical foundation.
  • Model data for analytical consumption following Data Warehouse best practices, including Star Schema design and dimensional modeling suited for business intelligence and AI-readiness.

CI&T is a tech transformation specialist, uniting human expertise with AI to create scalable tech solutions. With over 8,000 CI&Ters around the world, they’ve built partnerships with more than 1,000 clients over their 30-year history, valuing diverse identities and life experiences.

Global 6w PTO

  • Develop Python services that integrate with marketing partners and obtain data from various sources.
  • Create and support processes on Airflow.
  • Support the migration of marketing data pipelines and DWH components from MS SQL to Google Cloud Platform (including BigQuery), contributing to architecture decisions and best practices.

Social Discovery Group (SDG) is one of the world's largest groups of social discovery companies, uniting millions of users on dozens of products. Our international team of 1000+ professionals and digital nomads works all over the world and we are proud to be a two-time “Great Place to Work” winner.

India

  • Design scalable data pipelines and backend systems from the ground up.
  • Leverage AWS and GCP for real-time and batch processing.
  • Manage databases and Data Warehouses, optimizing ETL workflows.

Delivery Solutions, a UPS company, is looking for a Senior Data Engineer to join their team. They are a growing company.

Global 4w PTO

  • Take ownership of the ML API serving NBA recommendations and harden it for low-latency production traffic.
  • Ship your first agent tool contract end-to-end: schema design, handler implementation, and unit tests.
  • Set up the eval foundation for agents with golden transcripts, rubric-based judges, and regression suites.

Clutch is a vertical SaaS company backed by Andreessen Horowitz that helps credit unions become fintech lenders, providing affordable lending solutions to over 130 million Americans. The team is small, ambitious, and shipping fast with a culture that values pragmatism and real autonomy.

$110,000–$125,000/yr
US Unlimited PTO 12w paternity

  • Design, develop, and maintain robust, scalable ETL/ELT data pipelines using Python, SQL, and data processing frameworks.
  • Implement data quality checks, monitoring, and alerting across all data pipelines to ensure data integrity and reliability.
  • Work closely with data analysts, data scientists, and business intelligence engineers to understand their data requirements and deliver reliable, high-quality data access.

InStride Health delivers specialty anxiety and OCD care. They focus on expanding access to insurance-based care, increasing engagement, and improving treatment outcomes by combining clinical care and innovative technology. They are a mission-driven company.

  • Design, build, and maintain scalable data pipelines using AWS Glue (PySpark) or equivalent orchestration and transformation tools.
  • Engineer and optimise the ClickHouse warehouse for sub-second query performance across all back-offices.
  • Implement data contracts between back-office and the platform.

Block Labs is a premier technology studio operating at the bleeding edge of Web3, Artificial Intelligence, and iGaming. We are a collective of senior engineers, product strategists, and builders who refuse to compromise on architecture.

Global

  • Query and process large datasets using Trino (SQL).
  • Work with data in an AWS environment using PySpark.
  • Build audience segments based on website activity, call data, behavioral patterns and segment rules.

Kyivstar is one of the largest and most beloved telecom companies in Ukraine. They offer opportunities to work with large-scale real-world data in a friendly and collaborative team environment, with possibilities for professional development and career growth.

$100,649–$174,459/yr
US 4w PTO

  • Independently deliver analytical projects across the consumer credit lifecycle, including acquisition, account management, and collections.
  • Build statistical and machine learning models through all phases of development, from design through training, evaluation, validation, and implementation.
  • Use a broad set of technologies (SQL, PySpark, Python, AWS, and more) to obtain insights from large volumes of data.

Experian is a global data and technology company, powering opportunities for people and businesses around the world. We operate across a range of markets, from financial services to healthcare, automotive, agribusiness, insurance, and many more. They have an amazing team of 25,200 people in 32 countries.

US

  • Develop and implement scalable AI/ML solutions for generative AI models including large language models and multimodal architectures.
  • Design multi-year vision and shape the direction of crucial generative AI areas such as text generation, image synthesis, and personalized content.
  • Partner with product management and stakeholders to identify use cases, analyze patterns, and maintain compliance in healthcare AI.

Aledade is a healthcare technology company that builds web applications and data pipelines to support primary care. They are a large organization with a culture focused on engineering excellence, observability, and incremental delivery.

LATAM

  • Design, build, and maintain scalable data pipelines.
  • Develop and optimize ETL processes to support data products.
  • Work with structured and unstructured data across SQL and NoSQL systems.

They are seeking a Data Engineer to support the development of data products that power critical business functions. They seem to have a collaborative, cross-functional Agile environment where you'll partner closely with technical and business teams to deliver high-quality data solutions.

LATAM

  • Build and improve AI-native products and data-driven systems.
  • Rapidly prototype and iterate on AI-powered features.
  • Analyze, evaluate, and improve model outputs and system reliability.

FutureProofing is a talent platform focused on embedding high-caliber technical talent into startups building real AI-driven products. They work at the intersection of startup execution and real AI product development, helping companies build and improve production AI systems.

US

  • Design, deploy, and maintain scalable ML infrastructure supporting model training, batch inference, and real-time inference workloads.

National Debt Relief was founded in 2009 with the goal of helping consumers deal with overwhelming debt. They are one of the most-trusted and best-rated consumer debt relief providers in the United States, having helped over 450,000 people settle over $10 billion of debt.