Source Job

Europe US

  • Design and build training pipelines, fine-tuning workflows, and RL infrastructure.
  • Implement data ingestion and curation systems, inference services, and scalability and backend architecture.
  • Own the platform that turns models into production systems.

PyTorch GPU

20 jobs similar to AI Platform Engineer

Jobs ranked by similarity.

India 7w PTO

  • Design and implement production-scale AI agent systems and orchestration frameworks.
  • Deploy and optimize LLMs/SLMs in production with fine-tuning techniques.
  • Build data pipelines for training data curation, synthetic generation, and PII masking.

FourKites is the leader in AI-driven supply chain transformation for global enterprises and a pioneer of advanced real-time visibility. They turn supply chain data into automated action, helping 1,600+ global brands prevent disruptions. They provide competitive compensation with stock options, outstanding benefits, and a collaborative culture for all employees around the globe.

Canada 3w PTO

  • Own model serving: Design, build, and maintain low-latency, highly available serving stacks for in-house ML models, and integrate with LLM serving partners.
  • Automate training pipelines: Orchestrate data prep, training, evaluation, and registry workflows on Kubernetes with solid MLOps practices.
  • Optimize at scale: Profile and tune throughput, memory, and cost; introduce caching, sharding, batching, and GPU/CPU autoscaling where it pays off.

Cresta aims to turn every customer conversation into a competitive advantage by unlocking the true potential of the contact center. Their platform combines AI and human intelligence to help contact centers discover customer insights and automate conversations.

$135,000–$175,000/yr
US

  • Architect and scale the core intelligence behind our platform.
  • Design, build, and optimize the pipelines and agent systems that drive live customer interactions.
  • Build real-time and batch pipelines for ingestion, training, and inference.

Raynmaker is building RaynBrain, an agentic AI platform for complex conversations grounded in machine learning, neuroscience, and forensic linguistics. They empower autonomous systems that interpret, adapt, and act in real time, turning raw leads into revenue without scripts or human handoffs. Raynmaker is a small team helping other small teams move faster and convert more leads.

$150,000–$180,000/yr
US

  • Design and implement reliable, maintainable, and scalable GenAI systems.
  • Serve as a subject matter expert for machine learning systems owned by the team.
  • Mentor junior and mid-level engineers through code reviews and design collaboration.

Trajector specializes in medical evidence services, guiding clients through the complexities of disability benefits. They are a global team of over 1,800 dedicated individuals, streamlining the path to benefits and ensuring access to rightful compensation for those with disabilities.

Europe Unlimited PTO

  • Design, build, and maintain the inference infrastructure that powers Sword Health's AI products.
  • Own the end-to-end deployment pipeline for AI models, from real-time computer vision to large language models.
  • Architect and scale Kubernetes clusters for GPU-accelerated workloads, including autoscaling strategies and resource scheduling.

Sword Health is shifting healthcare from human-first to AI-first through its AI Care platform. They make world-class healthcare available anytime, anywhere, while significantly reducing costs. Sword Health has over 1,000 enterprise clients and has raised more than $500 million from leading investors.

Global

  • Fine-tuning pre-trained LLMs on small to medium datasets.
  • Implementing parameter-efficient fine-tuning (e.g., LoRA-style methods).
  • Optimising training for cost and performance.

They deliver cutting-edge ML and GenAI solutions across diverse industries and collaborate with global organizations. The company solves real-world challenges at scale with dynamic, high-impact projects.

US Unlimited PTO

  • Architect and deploy autonomous AI agents and multi-agent workflows.
  • Design strict-source-following Retrieval-Augmented Generation (RAG) systems.
  • Build scalable backend services using FastAPI.

Osano is an innovative B-Corporation focused on giving modern enterprises the ability to innovate quickly and earn customer trust by respecting data privacy and complying with consent guidelines. We are scaling fast with a multi-year runway and ambitious growth plans.

US

  • Build and deploy end-to-end AI/ML solutions, from data pipelines and feature engineering to model training and inference
  • Develop and maintain data pipelines for ingesting, transforming, and preparing data for analytics and machine learning
  • Write clean, modular, and maintainable code to support scalable AI applications

Eimagine fosters a remote-enabled environment where their people can thrive. They are a team of professionals who take pride in their craft, continuously learn, and support one another, helping clients navigate technology and business change while delivering meaningful outcomes.

US

  • Set up and manage GPU cluster infrastructure on major cloud providers.
  • Build and operate job orchestration and scheduling systems.
  • Integrate and maintain ML training frameworks and post-training pipelines.

Snorkel AI helps enterprises transform expert knowledge into specialized AI at scale. They started as a research project in the Stanford AI Lab and work with some of the world’s largest organizations to empower scientists, engineers, financial experts, product creators, journalists, and more to build custom AI with their data faster than ever before.

Europe

  • Build and manage the full ML lifecycle—from experiment tracking to model deployment and retraining.
  • Implement ML-specific CI/CD (e.g., CML, Kubeflow Pipelines) to automate the promotion of models to production.
  • Architect distributed systems for large-scale model inference.

Deutsche Telekom IT Solutions is a subsidiary of the Deutsche Telekom Group, recognized as Hungary’s most attractive employer in 2025. They provide IT and telecommunications services with more than 5300 employees, serving hundreds of large customers in Germany and other European countries.

Europe

  • Serve as Zencore’s senior-most technical authority on the practical application of advanced artificial intelligence and machine learning.
  • Partner with the sales and business development teams in a pre-sales capacity to scope opportunities, design solutions for proposals, and act as the senior technical voice in client pitches.
  • Lead the architecture and design of sophisticated, secure, and scalable AI solutions for our clients, moving beyond standard API integrations to create genuine competitive advantages.

Zencore is a fast-growing company founded by former Google Cloud leaders, architects, and engineers. Our engagements eliminate obstacles, reduce risk, and accelerate timelines for customers adopting and scaling modern AI solutions.

$35–$50/hr
Global

  • Design and implement LLM-powered application workflows
  • Architect retrieval-augmented generation pipelines
  • Collaborate with backend architects to integrate AI services into APIs

They are seeking a hands-on AI Engineer with deep expertise in Large Language Model integration and production AI systems. The company's culture appears innovative and collaborative, with a focus on building scalable and secure AI applications.

Europe Unlimited PTO

  • Contribute to designing, evaluating, and shipping our mental health AI Agent and its supporting infrastructure.
  • Develop and maintain robust data pipelines to power model training and evaluation.
  • Partner with AI Research, Product, and Engineering teams to define new features.

Sword Health is shifting healthcare from human-first to AI-first through its AI Care platform. They aim to make world-class healthcare available anytime, anywhere, while significantly reducing costs. Backed by clinical studies and patents, Sword Health has raised more than $500 million from leading investors.

US

  • Design and implement production-grade RAG pipelines and agentic workflows using Python.
  • Evaluate new models and prototype approaches for SBIR/government deliverables.
  • Document architectures and contribute to technical reports for contract deliverables.

Unstructured is focused on transforming unstructured data into a format usable by LLMs. Their Public Sector team works on high-impact contracts and seeks to bridge the gap between custom builds and a scalable product roadmap.

US

  • Work with customers to develop requirements and scope for new AI/ML projects.
  • Develop computer vision and machine learning based solutions for inspection platforms.
  • Analyze large datasets to extract meaningful insights and drive business decisions.

Loram provides advanced insights into inspection data collected for customers worldwide. The company has a small, collaborative team managing the entire project lifecycle, offering employees an outsized impact on inspections and maintenance recommendations.

Europe

  • Design, implement, and maintain frontend and graph-level compiler components using MLIR
  • Develop and optimize graph-level transformations such as operator fusion, constant folding, operator sinking, graph partitioning, and other performance-critical optimizations
  • Extend and maintain MLIR dialects, passes, and infrastructure to support AI workloads

Axelera AI creates the next-generation AI platform. It is a team of 220+ employees (including 49+ PhDs with more than 40,000 citations), with offices in multiple European countries and headquarters at the High Tech Campus in Eindhoven, Netherlands.

Europe

  • Build and ship AI-powered product and internal solutions using LLMs, RAG, tool calling, workflows, and agentic patterns
  • Design quality and evaluation frameworks for AI systems, including offline evals, online signals, failure analysis, and continuous improvement loops
  • Contribute to AI platform and tooling decisions that improve reuse, speed, and consistency across teams

Finom is a European tech startup headquartered in Amsterdam, revolutionizing the financial landscape for entrepreneurs. They develop an all-in-one financial B2B solution integrating banking, accounting, financial management, and invoicing into a mobile-first platform, and they nurture innovation in an inspiring work environment.

$1,200,000–$1,500,000/yr
US

  • Develop and productionize machine learning (ML) solutions in the fields of Document understanding, Search and QnA, GenAI, Virtual Agents, etc.
  • Develop and maintain backend services using Python, focusing on AI-driven applications.
  • Design and implement APIs for seamless integration with AI models and services.

Judi Health is an enterprise health technology company providing a comprehensive suite of solutions for employers and health plans. They are rebuilding trust in healthcare in the U.S. and deploying the infrastructure we need for the care we deserve.

$70,000–$87,000/yr
Argentina Mexico Unlimited PTO

  • Architect the ML Ecosystem.
  • Productionize Innovation.
  • Engineer Feature Intelligence.

TrueML is a mission-driven financial software company that aims to create better customer experiences for distressed borrowers. The TrueML team includes inspired data scientists, financial services industry experts and customer experience fanatics building technology to serve people.

Global

  • Design and build the infrastructure layer powering AI agent systems in production
  • Develop high-performance Rust services that handle model inference, orchestration, and execution
  • Architect scalable systems capable of supporting millions of users and high request throughput

Kraken is a mission-focused company rooted in crypto values, aiming to accelerate the global adoption of crypto so that everyone can achieve financial freedom and inclusion. As a fully remote company, Kraken has employees in 70+ countries and is committed to industry-leading security, crypto education, and client support.