About the Role:
- Own meaningful subsystems of the inference platform from design through production.
- Serve as the go-to engineer for areas like model onboarding, serving APIs, and performance optimization.
Responsibilities:
- Design and implement robust API layers and developer SDKs for seamless inference.
- Optimize inference performance across the entire system stack.
- Decompose ambiguous work and raise the engineering bar through mentoring and code quality.
Qualifications:
- Bachelor's or Master's degree in CS or a related field, with 4+ years of experience in backend distributed systems.
- Strong data and ML systems fundamentals with hands-on inference service experience on GPUs.
- Proficiency in C++, Go, Rust, or Python and familiarity with inference engines like TensorRT or vLLM.
Stack AV
Stack develops revolutionary AI and autonomous systems to enhance safety and efficiency in trucking. The team has decades of experience deploying real-world systems and is committed to inclusion, entrepreneurship, and innovation.