About the Role:

You will own meaningful subsystems of the inference platform from design through production.
You will be the go-to engineer for areas such as model onboarding, serving APIs, and performance optimization.

Responsibilities:

Design and implement robust API layers and developer SDKs for seamless inference.
Optimize inference performance across the entire system stack.
Decompose ambiguous work and raise the engineering bar through mentoring and code quality.

Qualifications:

Bachelor's or Master's degree in computer science or a related field, with 4+ years of experience in backend distributed systems.
Strong fundamentals in data and ML systems, with hands-on experience running inference services on GPUs.
Proficiency in C++, Go, Rust, or Python, and familiarity with inference engines such as TensorRT or vLLM.

Stack AV

Stack develops revolutionary AI and autonomous systems to enhance safety and efficiency in trucking. The team has decades of experience deploying real-world systems and is committed to inclusion, entrepreneurship, and innovation.

Apply for This Position

Senior Software Engineer, Machine Learning Inference Platform