We are seeking a DevOps (MLOps) Engineer to pioneer and mature our machine learning operations capabilities, focusing on building robust infrastructure and deployment pipelines in AWS. This critical role will own the infrastructure underlying all of our productionized ML models, from deployment to monitoring, and will foster seamless collaboration between our machine learning and broader engineering teams. The MLOps Engineer will work closely with ML engineers to keep state-of-the-art, mission-critical ML models up to date, scalable, and observable.
You will design and implement scalable CI/CD pipelines for machine learning models within our AWS infrastructure, driving automated builds, deployments, and engineering excellence. You will establish and evolve ML operational best practices in a greenfield environment, defining the standards for model versioning, reproducibility, and MLOps maturity. You will implement and manage comprehensive monitoring and observability solutions for deployed ML models using tools such as Datadog, ensuring high performance, accuracy, and quick issue resolution. You will also maintain and optimize ML model repositories to ensure efficient versioning and management of all model artifacts throughout their lifecycle.