Source Job

LATAM

  • Design and maintain CI/CD pipelines for ML model training, packaging, and deployment across our microservices.
  • Manage containerized services on AWS ECS, optimizing for cost, latency, and availability.
  • Automate infrastructure provisioning and service configuration with Terraform.

Python Docker Terraform AWS GCP

20 jobs similar to AI Operations Engineer

Jobs ranked by similarity.

Europe

  • Manage cloud infrastructure and optimize costs, particularly in AWS environments using Terraform and Python.
  • Design, develop, and maintain CI/CD pipelines and infrastructure for AI model training and deployment.
  • Ensure platform scalability and efficient resource utilization.

NEORIS, now part of EPAM Systems, is a Digital Accelerator that helps companies step into the future. With more than 20 years of experience as Digital Partners to some of the world’s leading organizations, they have over 4,000 professionals across 11 countries and foster a multicultural, startup-minded culture that promotes innovation, continuous learning, and the delivery of high-impact solutions for their clients.

US

  • Build and maintain infrastructure-as-code for our AWS EKS and GCP GKE clusters, plus on-premises deployments.
  • Own CI/CD pipelines and drive GitOps adoption.
  • Deploy, scale, and optimize ML/NLP inference workloads.

Vectara is the Enterprise Agent Platform that enables businesses to build and deploy governed, grounded, auditable AI agents across SaaS, VPC, and on-prem. We’re a passionate team that’s hyper-focused on solving enterprise-level technology and business problems with AI.

Latin America

  • 3+ years of coding experience with Python.
  • Advanced knowledge of AWS services including ML services (AWS SageMaker and AWS Step Functions).
  • Experience with ML monitoring and automation tools (MLflow, SageMaker Pipelines).

Bluelight is a leading software consultancy dedicated to designing and developing innovative technology that enhances users' lives. With a presence across the United States and Central/South America, Bluelight is in an exciting phase of expansion.

Europe

  • Build and manage the full ML lifecycle—from experiment tracking to model deployment and retraining.
  • Implement ML-specific CI/CD (e.g., CML, Kubeflow Pipelines) to automate the promotion of models to production.
  • Architect distributed systems for large-scale model inference.

Deutsche Telekom IT Solutions is a subsidiary of the Deutsche Telekom Group, recognized as Hungary’s most attractive employer in 2025. They provide IT and telecommunications services with more than 5,300 employees, serving hundreds of large customers in Germany and other European countries.

$70,000–$87,000/yr
Argentina Mexico Unlimited PTO

  • Architect the ML Ecosystem.
  • Productionize Innovation.
  • Engineer Feature Intelligence.

TrueML is a mission-driven financial software company that aims to create better customer experiences for distressed borrowers. The TrueML team includes inspired data scientists, financial services industry experts and customer experience fanatics building technology to serve people.

$150,000–$180,000/yr
US

  • Design and implement reliable, maintainable, and scalable GenAI systems.
  • Serve as a subject matter expert for machine learning systems owned by the team.
  • Mentor junior and mid-level engineers through code reviews and design collaboration.

Trajector specializes in medical evidence services, guiding clients through disability benefits complexities. They are a global team of over 1,800 dedicated individuals, streamlining the path to benefits and ensuring access to rightful compensation for those with disabilities.

Canada 3w PTO

  • Own model serving: Design, build, and maintain low-latency, highly available serving stacks for in-house ML models and for integration with LLM serving partners.
  • Automate training pipelines: Orchestrate data prep, training, evaluation, and registry workflows on Kubernetes with solid MLOps practices.
  • Optimize at scale: Profile and tune throughput, memory, and cost; introduce caching, sharding, batching, and GPU/CPU autoscaling where it pays off.

Cresta aims to turn every customer conversation into a competitive advantage by unlocking the true potential of the contact center. Their platform combines AI and human intelligence to help contact centers discover customer insights and automate conversations.

$155,000–$170,000/yr
US Canada Europe UK

  • Design, deploy, and manage scalable and highly available cloud infrastructure on AWS.
  • Design reusable Terraform/OpenTofu modules following DRY principles and organizational standards.
  • Implement AIOps practices, leveraging AI tools to enhance monitoring, incident response, and predictive alerting.

DistroKid is the world’s largest distributor of music to Spotify, Apple Music, YouTube, and beyond, empowering millions of independent artists to get their music into streaming services and keep 100% of their earnings. They move fast, stay curious, and build tools that directly impact how artists share their music with the world.

US Global

  • Automate infrastructure provisioning and configuration using Infrastructure-as-Code (Terraform)
  • Develop, implement, and optimize CI/CD pipelines (GitHub Actions, ArgoCD)
  • Manage Kubernetes clusters (EKS, GKE)

Verve For Advertisers empowers brands and agencies to connect moments of discovery and drive measurable outcomes across screens. They bring together the largest on-site search intent dataset outside of walled gardens, direct SDK integrations with top apps, alongside data partnerships with 3M+ websites and LLMs.

Asia Australia Japan South Korea

  • Deploy, configure, and manage blockchain networks (e.g., Bitcoin, Ethereum, Solana)
  • Design and implement cloud infrastructure on AWS in line with best practices.
  • Administer and scale Kubernetes clusters (EKS) for deploying blockchain nodes and related services.

Binance is a leading global blockchain ecosystem behind the world’s largest cryptocurrency exchange by trading volume and registered users. Trusted by 300+ million people in 100+ countries, they offer trading, finance, education, research, payments, institutional services, Web3 features, and more.

Global

  • Build Reliable Cloud Infrastructure: Implement and maintain AWS infrastructure using Terraform across EKS, Lambda, EC2, and S3.
  • Improve Developer Workflows: Contribute to CI/CD pipelines, starter kits, and internal tooling that reduce manual effort and improve deployment confidence.
  • Strengthen Observability & Operations: Add monitoring, logging, and alerting (DataDog) to platform services and participate in an on-call rotation.

Spreetail helps brands increase their ecommerce market share globally while improving operational costs. They are building one of the fastest-growing ecommerce companies in history with a focus on innovation.

$165,000–$195,000/yr
US

  • Support and operate Legion’s AWS-based cloud platform and Kubernetes (EKS) environments.
  • Build and maintain infrastructure-as-code using Terraform.
  • Improve CI/CD pipelines to increase deployment safety and velocity.

Legion Technologies delivers the industry’s most innovative workforce management platform. The AI-driven Legion WFM platform maximizes labor efficiency and employee engagement. They are a remote, mission-driven team that embraces a collaborative, fast-paced, and entrepreneurial culture.

US

  • Architect and lead end-to-end ML/AI pipelines.
  • Design and own CI/CD pipelines for ML/AI workflows.
  • Build and maintain scalable ML/AI infrastructure on cloud platforms.

Equip is a virtual, evidence-based eating disorder treatment program. They aim to ensure everyone can access effective treatment, operating in all 50 states and partnering with major health insurance plans, and are recognized for their engaged culture and sustainable treatment.

US

  • Implement cloud infrastructure, automation, and DevOps best practices.
  • Support platform and engineering teams specializing in AWS Bedrock AgentCore.
  • Contribute to building & maintaining CI/CD pipelines using Bitbucket Pipelines.

Nagarro is a Digital Product Engineering company scaling rapidly. They build products, services, and experiences that inspire, excite, and delight, operating across all devices and digital mediums with over 17,000 experts across 39 countries, fostering a dynamic and non-hierarchical work culture.

$137,000–$180,000/yr
US

  • Design, develop, and maintain high-performance, scalable, and secure backend services, primarily using Python and frameworks like FastAPI
  • Translate ambiguous business and technical requirements into concrete software designs and actionable tasks for cross-functional teams
  • Operate and maintain production applications at scale, ensuring high availability, performance, and reliability

SmartAsset is an online destination for consumer-focused financial information and advice, whose mission is helping people make smart financial decisions, reaching over an estimated 59 million people each month. Valued at over $1 billion, SmartAsset has earned recognition on the Inc. 5000 and Deloitte Technology Fast 500 lists.

US

  • Work with customers to develop requirements and scope for new AI/ML projects.
  • Develop computer vision and machine learning based solutions for inspection platforms.
  • Analyze large datasets to extract meaningful insights and drive business decisions.

Loram provides advanced insights into inspection data collected for customers worldwide. The company has a small, collaborative team managing the entire project lifecycle, offering employees an outsized impact on inspections and maintenance recommendations.

$3,490–$4,363/mo

  • Design, build, and operate scalable and high performance cloud infrastructure on AWS
  • Manage infrastructure as code using Terraform, Terragrunt, and CloudFormation
  • Develop and maintain CI/CD pipelines using GitLab CI/CD

BlueAlly's mission is to make technology more accessible, more certain, and more impactful for every organization. They specialize in cloud, cybersecurity, infrastructure, and application modernization, thriving on cutting-edge technologies and services.

Global

  • You will plan and execute infrastructure deployments, using automation to ensure a stable platform.
  • You will manage operations, troubleshoot, and optimize workflows to maintain high availability.
  • You will own backend features supporting our platforms and interface with users for feedback.

Trust Wallet is the leading non-custodial cryptocurrency wallet, trusted by over 200 million people worldwide to securely manage and grow their digital assets. They aim to give individuals the opportunity to own their assets and participate in the future economy.

$205,000–$270,000/yr
US Unlimited PTO

  • Partner with engineers to build dev tools that empower developer workflows and deployment infrastructure.
  • Ensure reliability of multi-cloud Kubernetes clusters and pipelines.
  • Focus on automation so we can spend energy where it matters.

Cresta is on a mission to turn every customer conversation into a competitive advantage by unlocking the true potential of the contact center. Their platform combines the best of AI and human intelligence to help contact centers discover customer insights and behavioral best practices.

US

  • Design and maintain scalable cloud environments using tools like Terraform, CloudFormation, or Ansible.
  • Build and optimize automated deployment pipelines to ensure rapid and reliable software delivery.
  • Implement robust monitoring, logging, and alerting frameworks to ensure 24/7 system health.

CodeRoad offers end-to-end software development services, helping businesses scale with infrastructure solutions. They provide staff augmentation, dedicated IT teams, and software engineering to empower businesses in a digital landscape.