Source Job

US Unlimited PTO 12w maternity 6w paternity

  • Design, implement, and evaluate reinforcement learning algorithms for robotic control and motion planning.
  • Develop sim-to-real pipelines using simulation environments like Isaac Gym and MuJoCo.
  • Collaborate with cross-functional teams to deploy RL policies on physical robots.

Python PyTorch TensorFlow Robotics

20 jobs similar to Senior Machine Learning Engineer, Reinforcement Learning

Jobs ranked by similarity.

US

  • Develop and improve AI/ML solutions for autonomous driving and future sensing objectives.
  • Apply techniques like unsupervised pre-training, imitation learning, and reinforcement learning to perception problems.
  • Collaborate cross-functionally to integrate models into production and evaluate sensing tradeoffs across real-world scenarios.

General Motors is redefining mobility through human-centered design, creating vehicles and experiences that aim to make driving safer, smarter, and more connected. With a global footprint and a culture focused on innovation and inclusion, GM’s diverse team brings collective passion for engineering, technology, and design to deliver on a vision of Zero Crashes, Zero Emissions, and Zero Congestion.

  • Manage the full deployment lifecycle for assigned retail locations.
  • Configure robots for new sites including traversal pattern setup, schedule management, and route mapping.
  • Build scripts, workflow automations, or lightweight internal tools to reduce manual effort in the deployment process.

Simbe Robotics is a leading retail robotics company providing in-store intelligence solutions. They help retailers optimize operations, improve shelf execution, and deliver valuable data insights. Simbe's culture is dynamic, inclusive, and driven by a passion for improving the way retailers operate.

$194,000–$228,000/yr
US

  • Design, build, and ship LLM-powered features and agentic workflows for Gametime users.
  • Build and maintain evaluation frameworks and prompt testing pipelines for AI-powered experiences.
  • Contribute to orchestration layer, including agent routing, tool use, and multi-step workflow coordination.

Gametime helps people connect through shared live experiences. They operate platforms on iOS, Android, mobile web, and desktop, supporting over 60,000 events across the US and Canada, fostering a collaborative and inclusive environment where diverse perspectives are valued.

US

  • Own the full ML lifecycle, taking projects from ideation to production, including feature engineering, model selection, deployment, and model observability and evaluation.
  • Translate business needs into ML solutions by gathering product requirements and converting them into robust ML system design requirements.
  • Build recommendation and ranking systems: architect and launch the infrastructure from scratch, starting with integrated off-the-shelf models and evolving toward targeted, customized solutions over the long term.

Affinity's Relationship Intelligence platform empowers dealmakers to find, manage, and close more deals. It has more than 3,000 customers worldwide and is backed by Silicon Valley firms, with $120M raised, also receiving Inc. and Fortune Best Workplaces awards.

Canada

  • Design and operate core AI platform components for training, deploying, and serving ML models at scale.
  • Own model serving and inference workflows end-to-end, optimizing for reliability, latency, throughput, and cost.
  • Collaborate with product, infrastructure, and security teams to build scalable platform capabilities for AI-powered features.

Mozilla Corporation is the non-profit-backed technology company behind Firefox and Pocket, with over 225 million monthly users. A wholly-owned subsidiary of the Mozilla Foundation, the company is mission-driven, employee-owned, and focused on privacy and open standards.

$120,000–$160,000/yr
US

  • Design, develop, and deploy AI/ML models to automate and improve internal workflows.
  • Build and maintain ML pipelines within an AWS cloud environment.
  • Integrate ML capabilities into existing Java and React application workflows.

Oddball aims to improve daily lives by delivering quality software to the federal space. With a team of experienced engineering, product, and UX professionals, we value learning, growth, and making a big impact in a rapidly growing company.

US Unlimited PTO

  • Developing machine learning pipelines and custom analytics for image, video, text, geospatial, time series, and structured data.
  • Orchestrating and automating complex data engineering and analytic pipelines.
  • Envisioning, specifying, designing, and implementing core product functionality and conducting mission-critical fieldwork.

Striveworks helps organizations harness AI to solve national security and business challenges by serving as a command center between data, models, and outcomes. Founded by data scientists and engineers, the company values a high-trust work environment with individual responsibility for collective results.

Canada US

  • Own customer solutions end-to-end, rapidly prototyping and deploying solutions in live operational environments.
  • Build trusted relationships from IC level to executive sponsor, becoming the technical face of the company.
  • Operate as part of a tight, multi-disciplinary unit with focus and urgency, handing tasks to whoever has the closest-matching skills.

Kinaxis is a global leader in modern supply chain orchestration, powering complex global supply chains with an AI-infused platform. With over 2000 employees worldwide and 6 global offices, it has been recognized with several Top Employer awards and fosters a culture focused on technology, customers, and innovation.

United States 6w PTO

  • Train, fine-tune, and optimize large language models powering AI companion and conversational systems at scale.
  • Design and maintain agentic frameworks and LLM orchestration systems, including reasoning loops and chat orchestration.
  • Research state-of-the-art NLP techniques and implement alignment methods such as RLHF and DPO to improve model quality.

We are an AI-powered job matching platform that connects candidates with hiring companies through objective, fair review processes. As a globally distributed, innovation-focused company, we foster a collaborative engineering culture with continuous learning opportunities.

US

  • Design, build, and deploy AI/ML solutions from prototype to production for client business problems.
  • Apply generative AI and LLMs, establishing MLOps best practices including CI/CD and model monitoring.
  • Serve as a trusted technical advisor, translating ambiguous problems into well-scoped solutions and presenting to stakeholders.

DevIQ builds modern cloud and data solutions for mid-market companies focused on energy reduction, healthcare, education, and smart cities. The company offers competitive benefits, a strong team culture, and opportunities to work on end-to-end solutions with multi-disciplinary teams.

Global 6w PTO

  • Train and fine-tune language models powering AI companions and own agent harnesses, agentic loops, and chat interface algorithms.
  • Build and maintain the full LLM stack from model training to production deployment while tracking cutting-edge NLP research.
  • Collaborate with validation, content, and dataset preparation teams to design experiments and measure model quality.

Social Discovery Group is one of the world's largest groups of social discovery companies, solving loneliness and disconnection through social entertainment platforms like DateMyAge and Dating.com. The international team of 1000+ professionals works remotely worldwide and is a two-time 'Great Place to Work' winner.

Global

  • Lead the AI and Machine Learning strategy, aligning with business and product objectives.
  • Oversee the research, development, and deployment of AI/ML solutions from concept to production.
  • Mentor teams and educate stakeholders on AI capabilities, limitations, and practical applications.

Smart Working is a remote-first company that connects skilled professionals with global teams for full-time roles. They are one of the highest-rated workplaces on Glassdoor, valuing growth and well-being.

US

  • Design and maintain data pipelines and auto-labeling systems to support ML model training from multimodal data.
  • Write and optimize SQL queries for data extraction, analysis, and ingestion from various sources.
  • Develop and prototype learning-based models using a data-centric approach with techniques like active learning and fine-tuning.

Serve Robotics is reimagining urban delivery with sidewalk robots, aiming to reduce congestion and support local businesses. The team is an agile, diverse group of tech industry veterans focused on robotics, machine learning, and end-to-end user experience.

Switzerland

  • Conduct advanced research on agentic AI systems trained on real-world interaction data.
  • Design and experiment with learning frameworks such as RAG, fine-tuning, RLHF, DPO, and GRPO.
  • Develop multimodal representation learning approaches across text, audio, logs, and structured data.

Our partner is a global AI research organization focused on developing cutting-edge agentic and multimodal AI systems. It offers a collaborative environment with top-tier engineering and product teams.

  • Design, develop, test, and deploy AI/ML models and applications including NLP pipelines, predictive models, recommendation engines, and intelligent automation workflows.
  • Build and integrate large language model (LLM) powered features using APIs such as OpenAI, Azure OpenAI, or Anthropic; implement retrieval-augmented generation (RAG) patterns and AI agent workflows.
  • Develop and maintain data pipelines that support model training, fine-tuning, evaluation, and real-time or batch inference.

ExtensisHR is a Professional Employer Organization (PEO) in the U.S. with client employees in all fifty states. They deliver personalized services covering HR, employee benefits, payroll and taxes, employer risk, compliance, and employee management.

$180,000–$240,000/yr
US

  • Be part of the alignment research team, working on projects selected for their high upside potential and under-resourced status.
  • Do real alignment research with real autonomy, in directions most organizations aren’t set up to pursue.
  • Break complex problems into concrete experiments and execute on them, independently or with a team.

AE Studio is a 160-person, fully bootstrapped ML consultancy that has spent over a decade building and shipping AI systems for clients. Without outside investors, we put money into alignment research through the AI Alignment Foundation, a nonprofit we founded to scale this work.

Global 4w PTO

  • Take ownership of the ML API serving NBA recommendations and harden it for low-latency production traffic.
  • Ship your first agent tool contract end-to-end: schema design, handler implementation, and unit tests.
  • Set up the eval foundation for agents with golden transcripts, rubric-based judges, and regression suites.

Clutch is a vertical SaaS company backed by Andreessen Horowitz that helps credit unions become fintech lenders, providing affordable lending solutions to over 130 million Americans. The team is small, ambitious, and shipping fast with a culture that values pragmatism and real autonomy.

US

  • Responsible for full software development lifecycle including algorithm development, design, implementation, and testing.
  • Collaborate with machine learning engineers and cross-functional teams to build robust production systems.
  • Drive technical innovation and mentor engineers within the group.

Torc is a leader in autonomous driving technology, developing software for automated trucks. We are part of the Daimler family and have a collaborative, energetic culture.

Global 16w maternity 16w paternity

  • Design, train, evaluate, and ship ML systems for governance and security, starting with prompt injection detection and behavioral anomaly detection.
  • Build supporting infrastructure including data pipelines, feature stores, model serving, and evaluation harnesses.
  • Set technical direction for ML work, own architecture, evaluation methodology, and model lifecycle.

Docker provides developer tools for building, sharing, and running applications across Docker Desktop, Docker Hub, and Docker Scout. With over 20 million monthly users and a globally distributed remote-first team, Docker is trusted by solo founders to the world's largest companies.

US

  • Design, build, and maintain production-grade AI systems and customer-facing AI features.
  • Develop agentic workflows using LLMs, retrieval systems, tools, APIs, and backend services.
  • Design and implement retrieval-augmented generation (RAG) systems, including ingestion pipelines, embeddings, semantic retrieval, and context assembly.

Givzey is a fast-growing and innovative technology company serving the nonprofit sector, on a mission to unlock more generosity through AI-powered donor engagement. In just three years, Givzey’s platform has already helped organizations raise $10M+ through autonomous engagement.