Job Description

This role involves:

  • Architecting, implementing, and optimizing end-to-end AI inference services and agentic pipelines using Python.
  • Designing autonomous AI agents that can interpret, reason about, and act on video and multi-modal inputs.
  • Integrating Vision Language Models (e.g., GPT-4o, Gemini Pro Vision, LLaVA) into production-grade workflows.

You will be expected to:

  • Use LLM/agent orchestration frameworks (LangGraph, AutoGen, Semantic Kernel, etc.) to manage complex visual AI tasks.
  • Deploy and operate AI services on Kubernetes or similar platforms, ensuring reliability and scalability under heavy workloads.
  • Optimize workloads for modern NVIDIA GPU architectures (Ampere, Hopper, Blackwell), with a focus on real-time, high-throughput media applications.

The ideal candidate should:

  • Have extensive professional experience designing and shipping AI/ML systems in production, with strong Python expertise.
  • Possess a proven track record of taking AI/ML models from prototype to robust, low-latency inference services.
  • Have hands-on experience building agentic systems, especially with computer vision or multi-modal inputs.

About Jobgether

Jobgether uses an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against a role's core requirements.
