ESSENTIAL DUTIES AND RESPONSIBILITIES:
- Own the end-to-end lifecycle of ML model deployment—from training artifacts to production inference services.
- Design, build, and maintain scalable inference pipelines using modern orchestration frameworks (e.g., Kubeflow, Airflow, Ray, MLflow).
- Implement and optimize model serving infrastructure for latency, throughput, and cost efficiency across GPU and CPU clusters.
QUALIFICATIONS:
- 5+ years of experience in applied ML or ML infrastructure engineering.
- Proven expertise in model serving and inference optimization (TensorRT, ONNX, vLLM, Triton, DeepSpeed, or similar).
- Strong proficiency in Python, with experience building APIs and pipelines using FastAPI, PyTorch, and Hugging Face tooling.
MARA
MARA is building a modular platform that unifies IaaS, PaaS, and SaaS, enabling governments, enterprises, and AI innovators to deploy, scale, and govern workloads across data centers, edge environments, and sovereign clouds. They are redefining the future of sovereign, energy-aware AI infrastructure.