Remote DevOps Jobs • Go

28 results

Job listings

GPU Cluster Architect

Nebius
$150,000–$180,000
USD/year
US 12w paternity

Drive the design of our next-generation AI infrastructure. In this high-impact, hands-on role, you will make end-to-end architectural decisions across compute, networking, and storage, ensuring our platforms can meet the massive scale, performance, and reliability requirements of modern AI workloads. You’ll define how tens of thousands of GPUs are interconnected and optimized across multiple data center sites.

Improve Scalable's cloud infrastructure and automation using tools such as AWS and Terraform and languages like Python or Go. Design, maintain, and operate multiple cloud networks on AWS, providing a secure and highly available infrastructure. You will also maintain financial middleware systems, ensuring high-availability connectivity for our most critical applications and partner integrations. Strengthen our DevOps culture.

Senior Cloud Performance Engineer

ClickHouse

Looking for a Senior Cloud Performance Engineer to build the cloud-native ClickHouse Cloud platform. The ideal candidate will have experience with database benchmarking, test automation, system engineering, performance analysis, and capacity management. This role offers the opportunity to make a significant impact on our elastic, highly scalable, high-performance, serverless ClickHouse Cloud.

Team Lead, Site Reliability Engineering

Pythian

As a Team Lead, you’ll be responsible for leading a team of site reliability engineers who design, deploy, and operate large-scale distributed systems across compute, storage, networking, and AI/ML environments. You will act as the primary technical escalation point, oversee day-to-day operational delivery, mentor and coach team members, and ensure adherence to SLAs and quality standards.

Senior Site Reliability Engineer, Environment Automation

GitLab

As a Senior Site Reliability Engineer (SRE) at GitLab, you’ll help keep all user-facing services and production systems reliable, scalable, and efficient. Our SREs combine a pragmatic operations mindset with strong software engineering practices to drive automation, reduce toil, and improve resilience across our platform. This position centers on automating the lifecycle of many tenant environments, ensuring they remain secure, consistent, and reliable at scale.

Site Reliability Engineer

Virta Health
$167,249–$216,000
USD/year

Build the foundation that will help our company move as fast as possible while meeting security and compliance requirements. You will be one of the key people defining and driving the future vision of what reliability and observability should look like. Responsibilities include shipping automation and tooling that reduces toil, written as high-quality, well-structured code, and designing and codifying self-healing workflows and guardrails that improve reliability.

Senior Fullstack Engineer, Platform

CapIntel

CapIntel is looking for a Senior Fullstack Engineer to strengthen the architecture, reliability, and scalability of their systems. This role focuses on platform and tooling, improving developer experience and supporting secure, efficient product delivery. You'll collaborate closely with product teams and may contribute to service design or integration when needed.

Infrastructure Engineer

Docker

The Infrastructure Engineering team is the backbone of Docker’s cloud-native platform, powering products like Docker Hub and Docker Build Cloud for millions of developers worldwide. The team designs, builds, and operates the infrastructure services and platforms that make Docker fast, reliable, and secure at global scale. They own core building blocks like compute, networking, observability, deployment, security, and cloud infra provisioning.