$82,300–$140,580/yr
- Deploy and optimize ML/LLM models on platforms like NVIDIA Triton and vLLM within Kubernetes clusters.
- Integrate models with Rackspace’s Unified Inference API and API Gateway for multi-tenant routing.
- Configure telemetry for GPU utilization, request tracing, and error monitoring.
We combine our expertise with the world’s leading technologies — across applications, data and security — to deliver end-to-end solutions.