airSlate is looking for a Senior Machine Learning Ops Engineer to help build the systems behind its smart automation. The team uses AWS SageMaker to build and deploy traditional AI models, while also developing novel applications powered by large language models. The role involves driving ML service development, reviewing architecture, establishing MLOps practices, and consulting teams.
Job listings
Assist in Observability Implementation: Support the development and maintenance of monitoring, logging, and tracing solutions. Monitor & Manage Observability Tools: Help deploy and manage observability platforms. Support Distributed Tracing & Telemetry: Work with OpenTelemetry to collect and export telemetry data for better system insights and debugging. Optimize Logging & Metrics Collection: Assist in implementing structured logging and improving system performance monitoring.
We are seeking an experienced Cloud Engineering Architect to design, build, and manage cloud solutions that align with business needs and industry best practices. The ideal candidate will have a deep understanding of AWS architecture, infrastructure automation, and cloud strategy development, ensuring high-performance, scalable, and secure cloud environments. This role requires strong technical expertise, problem-solving skills, and the ability to collaborate with cross-functional teams.
Responsible for ensuring the Reliability, Performance, Efficiency and Resilience of your team's systems and services, as well as working to ensure that the experience of your customers (other internal engineering teams) steadily improves. Includes implementing and maintaining monitoring systems, collaborating with cross-functional teams to address performance bottlenecks and continuously improving the reliability and scalability of our systems to meet the evolving needs of our users.
As a Solutions Architect at Sauce Labs, you will serve as a technical expert and trusted advisor to our customers, driving successful adoption of our products and enabling long-term success. You'll partner cross-functionally with Sales, Customer Success, Product, and Engineering to ensure our solutions align with customer technical and business goals. This role blends deep technical knowledge with strong business acumen and consulting skills.
Optimize critical engineering applications to ensure reliability, scalability and security. Establish tooling and automation processes for infrastructure as code, service recovery and service monitoring. Provide guidance and standards on application capabilities and usage. Develop and maintain infrastructure and security guidelines, procedures, and documentation to streamline operations and promote knowledge sharing.
Shape the future of AI with infrastructure. You'll use your expertise in cloud platforms, CI/CD, and infrastructure automation to ensure scalable, secure, and efficient environments for building cutting-edge AI applications. Design and maintain cloud-based infrastructure (AWS, GCP, or Azure) for AI development pipelines.
Manage production incidents and conduct post-mortems to improve system stability and reliability. Collaborate closely with development teams to ensure smooth application and system deployments. Maintain and optimize cloud infrastructure (e.g., AWS, Alicloud) for performance, cost-efficiency, and availability. Administer and scale Kubernetes clusters (primarily EKS) to support web service deployments. Design, build, and maintain CI/CD pipelines using tools such as GitHub Actions, ArgoCD, and AI/LLM-based automation. Automate infrastructure provisioning and management with tools like Terraform and Python.
Seeking a Senior Site Reliability Engineer with experience in AWS, system monitoring, and infrastructure automation to maintain and improve a cloud-based lending platform. The role involves ensuring reliability, monitoring systems, building software, and optimizing performance. Partner with development teams to improve services through rigorous testing and release procedures.
Design, implement and maintain suitable infrastructure and applications on AWS public cloud environments using a DevOps mindset. Bring world-class cloud-native infrastructure and automation expertise to implement automated solutions for deployment, monitoring and remediation.