Job Description

This role involves:

Architecting, implementing, and optimizing end-to-end AI inference services and agentic pipelines using Python.
Designing autonomous AI agents that can interpret, reason about, and act on video and multi-modal inputs.
Integrating Vision Language Models (e.g., GPT-4o, Gemini Pro Vision, LLaVA) into production-grade workflows.

You will be expected to:

Use LLM/agent orchestration frameworks (LangGraph, AutoGen, Semantic Kernel, etc.) to manage complex visual AI tasks.
Deploy and operate AI services on Kubernetes or similar platforms, ensuring reliability and scalability under heavy workloads.
Optimize workloads for modern NVIDIA GPU architectures (Ampere, Hopper, Blackwell), with a focus on real-time, high-throughput media applications.

The ideal candidate should:

Have extensive professional experience designing and shipping AI/ML systems in production, with strong Python expertise.
Possess a proven track record of taking AI/ML models from prototype to robust, low-latency inference services.
Have hands-on experience building agentic systems, especially with computer vision or multi-modal inputs.

About Jobgether

Jobgether uses an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against a role's core requirements.
