Own the end-to-end lifecycle of ML model deployment, from training artifacts to production inference services.
Design, build, and maintain scalable inference pipelines using modern orchestration frameworks (e.g., Kubeflow, Airflow, Ray, MLflow).
Implement and optimize model serving infrastructure for latency, throughput, and cost efficiency across GPU and CPU clusters.

QUALIFICATIONS:

5+ years of experience in applied ML or ML infrastructure engineering.
Proven expertise in model serving and inference optimization (TensorRT, ONNX, vLLM, Triton, DeepSpeed, or similar).
Strong proficiency in Python, with experience building APIs and pipelines using FastAPI, PyTorch, and Hugging Face tooling.

MARA

MARA is building a modular platform that unifies IaaS, PaaS, and SaaS, enabling governments, enterprises, and AI innovators to deploy, scale, and govern workloads across data centers, edge environments, and sovereign clouds. The company is redefining the future of sovereign, energy-aware AI infrastructure.

Senior ML Engineer – ML/Inference
