Job Description
This role involves:
- Architecting, implementing, and optimizing end-to-end AI inference services and agentic pipelines using Python.
- Designing autonomous AI agents that can interpret, reason about, and act on video and multi-modal inputs.
- Integrating Vision Language Models (e.g., GPT-4o, Gemini Pro Vision, LLaVA) into production-grade workflows.
You will be expected to:
- Use LLM/agent orchestration frameworks (e.g., LangGraph, AutoGen, Semantic Kernel) to manage complex visual AI tasks.
- Deploy and operate AI services on Kubernetes or similar platforms, ensuring reliability and scalability under heavy workloads.
- Optimize workloads for modern NVIDIA GPU architectures (Ampere, Hopper, Blackwell), with a focus on real-time, high-throughput media applications.
The ideal candidate should:
- Have extensive professional experience designing and shipping AI/ML systems in production, with strong Python expertise.
- Have a proven track record of taking AI/ML models from prototype to robust, low-latency inference services.
- Have hands-on experience building agentic systems, especially with computer vision or multi-modal inputs.
About Jobgether
Jobgether uses an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against a role's core requirements.