Jobs Similar to ML Platform Engineer

Senior MLOps Engineer - Hudl Focus

Hudl 14 days ago

Europe

Build scalable Edge infrastructure, designing and maintaining delivery systems for model deployment.
Work with cross-functional teams to integrate complex features, translating research into hardware realities.
Drive automation and reliability by implementing infrastructure to test models and monitor performance.

Hudl builds great teams and hires the best to ensure employees are working with people they can constantly learn from. They provide a culture where everyone feels supported and have been named one of Newsweek's Top 100 Global Most Loved Workplaces.

ML Ops Engineer

Pragmatike 6 days ago

Europe

Build and operate production-grade model serving infrastructure using frameworks such as vLLM, TGI, Triton, or equivalent
Design and implement robust deployment pipelines with blue/green and canary rollout strategies for ML models
Develop and maintain auto-scaling systems, multi-model serving architectures, and intelligent request routing layers

Pragmatike is recruiting on behalf of a fast-scaling, well-funded distributed cloud infrastructure startup building next-generation AI-native cloud services. The company is redefining how compute is delivered by providing GPU-powered infrastructure for AI/ML workloads, secure storage, and high-speed data transfer through a decentralized architecture that significantly reduces environmental impact compared to traditional cloud providers.

Platform Engineer

Vectara 29 days ago

US

Build and maintain infrastructure-as-code for our AWS EKS and GCP GKE clusters, plus on-premises deployments.
Own CI/CD pipelines and drive GitOps adoption.
Deploy, scale, and optimize ML/NLP inference workloads.

Vectara is the Enterprise Agent Platform that enables businesses to build and deploy governed, grounded, auditable AI agents across SaaS, VPC, and on-prem. We’re a passionate team that’s hyper-focused on solving enterprise-level technology and business problems with AI.

Machine Learning Platform Engineer

Buzz Solutions 1 day ago

US

Design, build, and maintain scalable training infrastructure for computer vision workloads
Implement and manage distributed training pipelines to support large-scale model training and hyperparameter tuning
Build and maintain robust data pipelines for ML development

Buzz is revolutionizing the analytics and maintenance of power grid infrastructure through their advanced AI solutions. Their computer vision systems analyze critical infrastructure to enhance safety, reliability, and operational efficiency across the power grid network.

Lead MLOps Engineer

NexGen Cloud 2 days ago

5 weeks PTO

Own the design, implementation, and evolution of core MLOps systems across Hyperstack.
Build and improve systems that orchestrate model training, fine-tuning, evaluation, and deployment.
Define and embed strong MLOps practices across teams.

NexGen Cloud is the company behind Hyperstack, a full-stack AI cloud serving tens of thousands of customers from AI researchers to enterprises running the world's most compute-intensive workloads. They deliver on-demand and private GPU infrastructure to teams who treat performance as a requirement, not a feature.

Platform Engineer

Vantage 21 days ago

$175,000–$210,000/yr

US

Design and build the core data infrastructure powering Vantage's platform.
Own architecture decisions for systems built on ClickHouse, Temporal, Kubernetes, and Postgres.
Drive reliability, performance, and scalability initiatives across the platform as data volume and customer load grow.

Vantage is the FinOps platform built for modern engineering teams. They are a high-output team of ~50 employees based in New York City with a remote-friendly culture.

Platform Engineer

P-1 AI 17 days ago

$200,000–$250,000/yr

US Canada Unlimited PTO

Design the BYOC deployment model for Archie across customer environments.
Build and own Kubernetes-based infrastructure that runs reliably across multiple clouds and customer setups.
Create deployment tooling using Helm, GitOps, or similar approaches to make installation and operations repeatable.

P-1 AI is building an engineering AGI with their first product, Archie, an AI engineer. They closed a $23 million seed round and aim to put an Archie on every engineering team at every industrial company on earth.

Staff Machine Learning Engineer

Striveworks 9 days ago

$200,000–$250,000/yr

US Unlimited PTO

Work with customers, engineers, and other stakeholders to define clear requirements that solve the customers’ problems and leverage the capabilities of our AI operations platform.
Translate requirements into a technical approach, design, scoping estimate, and execution plan.
Lead execution teams to achieve on-time completion of project deliverables mapped to customer business value while making key individual contributions throughout the process.

Striveworks helps organizations harness the power of artificial intelligence to solve real-world national security and business challenges. Founded by data scientists and engineers, they set out to make the journey from deployment to ongoing optimization simple and effective.

Software Engineer, Backend

Aura 20 days ago

US

Develop and enhance backend features, ensuring system reliability and scalability.
Collaborate with stakeholders to define requirements and improve system performance.
Manage infrastructure using Terraform and other infrastructure-as-code tools.

Aura is on a mission to create a safer internet, offering a suite of intelligent digital safety products that help millions of customers protect themselves against digital threats. With over 400 employees worldwide, Aura is guided by experienced leadership and fosters an inclusive community.

Senior or Staff ML Systems Engineer

Quilter 18 days ago

US Unlimited PTO

Design, build, and maintain ML infrastructure across training, evaluation, serving, and monitoring
Own data pipelines including generation, cleaning, validation, and versioning
Build and improve experiment tracking, orchestration, and reproducibility tooling

Quilter is helping electrical engineers save time and accomplish more by automating the tedious and time-consuming task of designing printed circuit boards (PCBs). Their small team is composed of experts in electrical engineering, electromagnetic simulation, ML/AI, and high-performance computing (HPC).

Senior Platform Engineer

PerfectServe 16 days ago

$130,000–$160,000/yr

US

Design, build, and optimize cloud platform capabilities.
Tackle complex infrastructure challenges and raise engineering quality.
Apply AI and AIOps to make the platform smarter and more resilient.

PerfectServe offers Best in KLAS clinical communication and physician scheduling solutions and is a Leader in the Gartner Magic Quadrant for Clinical Communication and Collaboration. We focus on optimizing provider schedules and dynamically routing messages to advance patient care and clinical workflows, valuing growth, transparency, and innovation.

Senior AI Infrastructure Engineer (Europe based - Remote)

Sword Health 1 day ago

$71,884–$112,939/yr

Europe Unlimited PTO

Design, build, and maintain the inference infrastructure that powers Sword Health's AI products.
Own the end-to-end deployment pipeline for AI models.
Architect and scale Kubernetes clusters for GPU-accelerated workloads.

Sword Health is shifting healthcare from human-first to AI-first through its AI Care platform, making healthcare available anytime, anywhere, and reducing costs. They have over 1,000 enterprise clients and have raised more than $500 million from leading investors.

Forward Deployed Engineer (FDE)

Striim 12 days ago

$205,000–$220,000/yr

US

Partner with Sales and Field Engineering to design and architect complex, enterprise-grade solutions tailored to customer needs.
Lead the implementation of custom solutions within customer environments across multi-cloud and hybrid architectures.
Optimize solutions for performance, scalability, and reliability in production environments.

Striim is a unified data integration and streaming platform that connects clouds, data, and applications. We expect all of our employees to operate as one, with unlimited potential and dignity.

Site Reliability Engineer, Production Reliability

Yelp 17 days ago

$135,000–$185,000/yr

Canada

Work with engineers across Yelp to support new features and services.
Integrate tools to monitor platform stability and performance.
Help scale our Kubernetes clusters and AWS-based infrastructure while maintaining our platform's SLOs.

Yelp's engineering culture values individual authenticity and encourages creative solutions. They focus on helping users, growing as engineers, and having fun in a collaborative environment.

Senior Software Engineer, AI Platform

JumpCloud 1 hour ago

Turkey

Lead the strategy and architecture for a scalable AI platform that integrates model orchestration, tool integration, and real-time decision systems.
Design, develop, and maintain the platform with full ownership from ideation to deployment, ensuring reliability, observability, and security.
Mentor engineers and collaborate across teams to evangelize AI best practices and drive the integration of AI throughout the product development lifecycle.

JumpCloud is an AI-powered unified IT management platform designed to secure the modern workforce by consolidating identity, device, and access management. The company is remote-first with teams in over 15 countries, fostering a culture that values building connections, out-of-the-box thinking, and passionate collaboration on challenging technical problems.

Senior Site Reliability Engineer

SSV Labs 14 days ago

Global

Design and implement infrastructure and tools that empower our product teams to rapidly and securely iterate, emphasizing reliability and automation.
Influence the strategic direction of our infrastructure and operational practices, ensuring that we are well-positioned to scale and support our growing organization.
Take a proactive role in the resolution of production issues, ensuring that we are well-prepared to handle incidents and that we learn from them in a blameless manner.

SSV Labs is the core team behind the SSV Network - pioneering decentralized infrastructure for Ethereum staking. They are building tools, protocols, and standards to make staking more secure, scalable, and trustless.

AI Operations Engineer

Newsela 24 days ago

LATAM

Design and maintain CI/CD pipelines for ML model training, packaging, and deployment across our microservices.
Manage containerized services on AWS ECS, optimizing for cost, latency, and availability.
Automate infrastructure provisioning and service configuration with Terraform.

Newsela takes authentic, real-world content from trusted sources and makes it instruction-ready for K-12 classrooms. Each text is published at five reading levels, so content is accessible to every learner; over 3.3 million teachers and 40 million students have registered.

Machine Learning Engineer - Inference Platform (AU remote)

Canva 17 days ago

Australia

You’ll design, build, and maintain scalable systems for serving machine learning models in production.
You’ll optimise inference performance, including latency, throughput, and cost efficiency.
You’ll collaborate with ML researchers and engineers to productionise models.

Canva is a design platform that enables users to create a variety of visual content. They have campuses in Sydney and Melbourne, with co-working spaces in other Australian cities, and promote a flexible work environment.

VP of Engineering

ScienceLogic 6 days ago

US Unlimited PTO

Own the Skylar AI Engineering Vision.
Build & Lead a World-Class Engineering Organization.
Drive Cross-Functional Alignment & Executive Partnership.

ScienceLogic is redefining IT operations for the modern enterprise. Their AIOps platform empowers organizations to achieve Autonomic IT, helping enterprises and service providers gain unified visibility across hybrid and multi-cloud environments.

Staff/Principal Platform Engineer

P2P.org 6 days ago

Global

Own the architecture and evolution of P2P.org's internal developer platform—Kubernetes, monitoring, secrets management, and delivery infrastructure.
Design and build scalable, fault-tolerant platform components—including capacity planning, multi-tenancy, networking topology, and storage architecture.
Use AI tooling as a core part of how you work and champion its adoption across the infrastructure team and wider engineering organization.

P2P.org is the largest institutional staking provider with a TVL of over $10B and a market share exceeding 20% in restaking. They unite talented individuals globally and prioritize customer satisfaction, developing innovative solutions.
