Design and implement solutions to problems of scale for multi-site deployment and management of CoreWeave's global server hardware fleet. Build and maintain backend services and APIs (gRPC/REST) in Go or Python to interact with Kubernetes and other infrastructure systems. Develop provisioning services, automation workflows, and fleet management tools that span from bare metal to container orchestration.
Job listings
We are looking for a skilled and motivated Lead Infrastructure Engineer to lead our Platform Engineering team. As the team leader, you will direct the planning, design, development, and implementation of our platform architecture, ensuring it meets the needs of our growing product portfolio. You will guide a talented team of engineers, driving best practices and fostering a culture of excellence and innovation.
The Reliability Engineering team helps realize our vision by supporting Coinbase engineering teams to build software that is world-class in terms of its reliability. As a core service team, Coinbase Reliability Engineers work closely with the rest of engineering. Improve observability, reliability and availability by defining and measuring key metrics. Build automation and improve systems to eliminate toil and operations work.
Implementation and maintenance of scalable solutions following DevOps best practices at a leading gaming-industry company. Process automation using Terraform or AWS CloudFormation, enabling creation and management of AWS infrastructure and applications in an internal on-prem cluster. Building and maintaining CI/CD pipelines, automation of tests, deployments, and rollback strategies.
As a Platform engineer, MLOps, you will be critical to deploying and managing cutting-edge infrastructure crucial for AI/ML operations, and you will collaborate with AI/ML engineers and researchers to develop a robust CI/CD pipeline that supports safe and reproducible experiments. Your expertise will also extend to setting up and maintaining monitoring, logging, and alerting systems to oversee extensive training runs and client-facing APIs.
Be part of a dynamic team that is shaping the future of energy and technology. Build and maintain backend systems and data pipelines for AI-based software platforms, integrating SQL/NoSQL databases and collaborating with engineering teams to enhance performance. Design, deploy, and optimize cloud infrastructure on Google Cloud Platform, including Kubernetes clusters, virtual machines, and cost-effective scalable architecture.
As a Platform engineer, MLOps, you will be critical to deploying and managing cutting-edge infrastructure crucial for AI/ML operations, collaborating with AI/ML engineers and researchers to develop a robust CI/CD pipeline that supports safe and reproducible experiments. Your expertise will extend to setting up and maintaining monitoring, logging, and alerting systems to oversee extensive training runs and client-facing APIs. You will ensure that training environments are optimally available and efficiently managed across multiple clusters, enhancing our containerization and orchestration systems with advanced tools like Docker and Kubernetes.
As a Senior Infrastructure Engineer, you'll drive the technological advancement of our Infrastructure Platform within Raya. Leveraging best-in-class cloud technologies and container orchestration, your work will be central to delivering a reliable, secure, and scalable foundation that our product teams depend on daily. You can design infrastructure that maximizes application efficiency and resource utilization. You're enthusiastic about leveraging AI to enhance your own infrastructure workflows.
As a DevOps Engineer Principal at Sagent, you will play a pivotal role in shaping and executing our cloud strategy. We seek an exceptional individual with a passion for cloud technologies and proven expertise in designing and implementing complex, scalable, and secure cloud solutions. You will collaborate with senior leadership to develop and refine the company's cloud strategy, ensuring alignment with business goals.
As a Cloud Engineer, you will utilize your extensive Cloud/DevOps knowledge and experience to design, develop and deploy solutions utilizing the Cloud/DevOps platforms. You will work with customer service owners, process owners and various service delivery groups and participate in demos and meetings in a professional and courteous manner. The Senior Cloud Engineer is a highly experienced subject matter expert on the Cloud/DevOps platforms with strong experience designing, developing and deploying integrations with external third-party tools.