Source Job

Global

  • Contribute to the development of the Everywhere Inference platform, a Kubernetes-based solution.
  • Design and implement APIs and developer tools to simplify deployment, management, and monitoring of AI applications.
  • Optimize serverless container workflows for AI workloads, ensuring performance, scalability, and seamless autoscaling.

Python Kubernetes AI/ML Docker MLOps

14 jobs similar to Software Engineer (Python, Kubernetes, AI/ML)

Jobs ranked by similarity.

Europe

  • Design, build, and maintain scalable services that support the AI lifecycle.
  • Develop tools for pre- and post-processing data for AI and other uses.
  • Design scalable pipelines for data collection, processing, and transformation.

Planner 5D is a global hub for home design, uniting over 100 million users. They simplify the home renovation process with their cutting-edge software, fostering a vibrant community of enthusiastic and product-oriented professionals.

US

  • Own technical direction for high-impact AI products.
  • Work across teams to turn big ideas into shipped systems.
  • Help raise the bar for how we build, evaluate, and operate AI in production.

Rula is dedicated to treating the whole person, not just the symptoms, and aims to create a world where mental health is no longer stigmatized. They are a remote-first company that hires in most U.S. states, and they are passionate about making a positive impact on mental healthcare.

Canada

  • Define, drive, design, and build/ship end-to-end solutions that solve real customer problems.
  • Contribute to the end-to-end AI/ML software development lifecycle, ensuring reproducible research.
  • Drive architecture, design, and delivery of advanced ML systems in the Product R&D team.

Kinaxis is a global leader in modern supply chain orchestration. Known for its AI-infused platform and transparency across end-to-end supply chains, Kinaxis helps customers make faster, better decisions. The company has over 2000 employees worldwide and is recognized with Top Employer awards.

Global

  • Own and operate GPU and accelerator clusters for AI training, inference, and experimentation, ensuring reliability and cost-efficiency.
  • Build and optimize scheduling, orchestration, and serving systems using frameworks like vLLM and Triton to improve latency, throughput, and memory efficiency.
  • Partner with ML engineers to remove workflow bottlenecks and build observability for GPU utilization, capacity, and incident response.

Kraken is a crypto exchange platform building premium financial products for traders and institutions, accelerating global crypto adoption. It is a mission-driven, fully remote company with a world-class team of crypto experts spread across more than 70 countries.

Europe

  • Define and evolve the architecture and roadmap for enterprise‑scale Data and AI platforms.
  • Design and build multi‑tenant, multi‑region, highly available AI platforms with governance.
  • Lead capacity planning and cost optimization strategies for GPU and CPU workloads.

NEORIS accelerates growth in Ibero‑America, combining global engineering with regional expertise. With over 60,000 professionals across 55+ countries, they offer technical specialization career paths and value responsibility, collaboration, creativity, and commitment.

SRE

Fal
$180,000–$250,000/yr
US

  • Own and operate our Kubernetes infrastructure.
  • Build and maintain CI/CD pipelines and deployment infrastructure.
  • Leverage AI to automate analysis and resolution of production issues.

Fal is the generative media ecosystem powering the next generation of AI products. They build the infrastructure, tools, and model access that teams need to move from idea to production.

$81,112–$92,025/yr
Europe

  • Empower ML Engineers with the tools, infrastructure, and frameworks they need to iterate fast autonomously.
  • Accelerate time-to-market for production-ready ML products with seamless integration and access to data and resources.
  • Own ML CI/CD in close collaboration with the DevExp team, adapting existing frameworks to ML-specific needs.

Dailymotion is a video platform designed to broaden users' horizons with a unique algorithm. They foster inclusivity and aim to build a better and safer Internet with cutting-edge solutions for video hosting and advertising. With 400 employees in France, New York, and Singapore, Dailymotion is shaking up the global video platform ecosystem.

Poland

  • Design and deploy GPU cluster architectures using tools like Ansible, Terraform, Kubernetes, and Slurm.
  • Lead technical deep-dives, workshops, and present solutions to stakeholders, translating complex concepts.
  • Automate provisioning and monitoring with Infrastructure as Code, and produce documentation and training materials.

Gcore is a global provider of infrastructure and software solutions for AI, cloud, network, and security, powering digital experiences worldwide. The company collaborates with leading technology partners and employs over 550 professionals building foundational technologies.

$95,786–$119,733/yr
Canada 4w PTO

  • Design and develop full-stack features for our AI-powered web application using Python, React, Redux, PostgreSQL, and AWS.
  • Integrate AI models and APIs for tasks such as document processing, data extraction, and recommendation engines.
  • Construct scalable, secure, and observable cloud architectures for our core AI services.

PolicyMe is Canada’s leading digital insurance solution, offering straightforward and affordable financial protection for families. They operate with a remote-first culture, attracting top talent from across Canada, and have sold over $10 billion in insurance coverage.

  • Design, develop, test, and deploy AI/ML models and applications including NLP pipelines, predictive models, recommendation engines, and intelligent automation workflows.
  • Build and integrate large language model (LLM) powered features using APIs such as OpenAI, Azure OpenAI, or Anthropic; implement retrieval-augmented generation (RAG) patterns and AI agent workflows.
  • Develop and maintain data pipelines that support model training, fine-tuning, evaluation, and real-time or batch inference.

ExtensisHR is a Professional Employer Organization (PEO) in the U.S. with client employees in all fifty states. They deliver personalized services for HR, employee benefits, payroll and taxes, employer risk, compliance, and employee management.

US

  • Design, build, and maintain the core infrastructure layer supporting GenAI products.
  • Implement secure access controls and authentication mechanisms integrated by default into the AI platform components.
  • Develop and manage observability, monitoring, and logging solutions for GenAI workloads and infrastructure.

PointClickCare is a healthcare technology company. This team will serve as the product owner for GenAI capabilities, closely integrated with key horizontal partners to ensure delivery of safe, scalable, and high-impact AI products.

US Unlimited PTO

  • Maintain, improve, and extend an AI platform already running in production.
  • Handle a mix of backend development, data pipelines, DevOps, and infrastructure work.
  • Translate business and product requirements into technical decisions independently.

Provectus is an AI consultancy and solutions provider. They help businesses adopt AI technologies, offering development and integration services, and they seem to foster a flexible, autonomous, and tech-forward culture.

  • Shaping the Python language ecosystem with a strong product and platform mindset.
  • Architecting, building and delivering high-impact solutions that uplift the Python developer experience.
  • Advocating for Python engineering best practices across the organization.

Canva is a design platform that empowers users to create professional-quality graphics. They offer an inclusive culture with employees across multiple locations.

$120,000–$160,000/yr
US

  • Design, develop, and deploy AI/ML models to automate and improve internal workflows.
  • Build and maintain ML pipelines within an AWS cloud environment.
  • Integrate ML capabilities into existing Java and React application workflows.

Oddball aims to improve daily lives by delivering quality software to the federal space. With a team of experienced engineering, product, and UX professionals, they value learning, growth, and making a big impact in a rapidly growing company.