Design and implement solutions to problems of scale for multi-site deployment and management of CoreWeave's global server hardware fleet. Build and maintain backend services and APIs (gRPC/REST) in Go or Python to interact with Kubernetes and other infrastructure systems. Develop provisioning services, automation workflows, and fleet management tools that span from bare metal to container orchestration.
Job listings
Lead the team responsible for the operational reliability of our bare metal infrastructure, networking, and system configuration that powers our product offerings in this hands-on "player/coach" role. You will help shape a critical function in a growing company, evolving the Network Operations Center (NOC) into a modern, proactive SRE function that leverages automation, data science, and reliability engineering principles.
Play a key role in building our Developer Experience team and owning critical infrastructure and services that support engineering across the organization. Define best practices, shape internal guidelines, and lead efforts to improve developer workflows, tooling, and system reliability. This role involves mentoring engineers, conducting code reviews, and delivering high-impact projects that power our core systems and services, ultimately enabling faster, safer, and more scalable product development.
Be part of a dynamic team that is shaping the future of energy and technology. Build and maintain backend systems and data pipelines for AI-based software platforms, integrating SQL/NoSQL databases and collaborating with engineering teams to enhance performance. Design, deploy, and optimize cloud infrastructure on Google Cloud Platform, including Kubernetes clusters, virtual machines, and cost-effective scalable architecture.
As a Cloud Engineer III (CEIII), you will provide engineering and transformation expertise to modernize the environment and be responsible for the solution design of the modernized cloud environment. You will work with a team to plan, review requirements, design, and develop optimal and secure cloud-based solutions. The engineer must stay current with best practices and industry standards and make recommendations as needed.
As a key contributor, you will design and support resilient systems, prioritizing high performance, availability, and throughput, with a focus on minimizing service disruptions, downtime, and latency. You'll work with engineering teams ranging from product development, developer experience, and backend infrastructure to collaboratively build Thumbtack's ecosystem of platform services that have the right impact at the right time.
As a DevOps Engineer / Site Reliability Engineer on the Veradigm Payer Dev Ops team, you'll work closely with Business and Technical Leaders from across the organization to manage and monitor our Azure-based cloud solutions, focusing on automation, scalability, availability, and security. You will be responsible for designing and building our Cloud infrastructure in Azure, creating a foundation for success in managing systems in a public cloud setting.
Cribl Inc is seeking a Principal Site Reliability Engineer to join their mission to unlock the value of all observability data, providing users a new level of observability, intelligence, and control over their real-time data. This role is remote, and you will be part of the engineering organization, where you will contribute to their efforts to envision, create, deploy, test, and ship Cribl products.
As a Forward Deployed Engineer, you will be driving one of the most critical outcomes for our business by doing the complex engineering work required to deploy a complex Kubernetes microservice architecture onto classified cloud environments. You'll be a cross-functional expert on software development, DevSecOps, cloud infrastructure, and government accreditation processes. You'll work closely with fellow Software Engineers, Product Managers, Cybersecurity professionals, and end users to deliver cutting-edge capability to the warfighter.
EasyPost is seeking a highly experienced and skilled Senior Engineer to work with our DevOps team. This role will be involved in designing, building, and optimizing our cloud infrastructure, ensuring scalability, reliability, and high availability in a multi-cloud environment. The ideal candidate will have deep expertise in cloud platforms and a strong background in DevOps and automation.