Remote Software Engineering Jobs · Kubernetes

Job listings

  • Manage and optimise Kubernetes clusters in GKE through Terraform.
  • Design and implement automation strategies that empower developers to self-serve.
  • Serve as the technical point-of-contact for GCP and Kubernetes-related queries.

Prolific builds the human data infrastructure that reshapes AI development by enabling the collection of high-quality, ethically sourced human behavioral data. They are a mission-driven company offering a competitive salary and benefits, with remote working and a culture focused on impact and innovation.

Unlimited PTO

  • Own the full lifecycle of customer issues for ClickHouse strategic accounts on Alibaba Cloud, from intake through resolution.
  • Diagnose and resolve issues including SQL query performance, schema design, replication, and cluster operations.
  • Advise customers on best practices, capacity planning, and configuration tuning for cloud-based workloads.

ClickHouse is a real-time analytics and data warehousing platform recognized on the 2025 Forbes Cloud 100 list. With over 3,000 customers and ARR growing over 250% year over year, the company has raised $400M in Series D financing and fosters a globally distributed, remote-friendly culture.

Unlimited PTO

  • Design, implement, destroy, and rebuild next-generation micro-frontend-monoliths.
  • Process exabytes of un-indexed, corrupted JSON, XML, YAML, and CSV files simultaneously.
  • Maintain, refactor, and pray over a COBOL codebase written in 1974.

This is a stress-test job description for an enterprise-level architect role. The company is likely large and focuses on cutting-edge technology, with a culture of high challenge and academic-style rigor.

UK 6w PTO

  • Act as a trusted technical partner, guiding organizations through onboarding, implementation, and expansion with white-glove support and best practices.
  • Deliver high-impact training, jumpstart engagements, and provide tailored technical consulting to help customers succeed.
  • Identify recurring issues, monitor support needs, and advocate for product improvements in close collaboration with internal teams.

Grafana Labs is the company behind Grafana, the open observability platform. With over 1,600 team members across 40+ countries, we are a 100% remote company backed by leading investors and trusted by more than 35 million users and 7,000+ customers.

  • Act as the primary NVIDIA AI Enterprise and vector database expert for HyperPOD customer environments, owning end-to-end triage across GPU, NVAIE services, and storage.
  • Author and maintain support triage runbooks, diagnostics bundles, and collaborate on observability dashboards for platform health and RAG metrics.
  • Build hands-on labs, PoCs, and reusable technical assets to accelerate support readiness and partner success.

DataDirect Networks (DDN) is a global market leader in AI and high-performance data storage, powering many of the world's most demanding AI data centers across industries like life sciences, healthcare, financial services, and research. They are a global company with strong innovation, customer-centricity, and a team of passionate professionals committed to shaping the future of AI and data management.

  • Design and develop scalable, high-performance, low-latency backend services using Java and AWS technologies.
  • Collaborate with product, frontend, and QA teams to define technical requirements and ensure smooth integration with other platform components.
  • Optimize existing services for maximum performance, reliability, and maintainability while implementing CI/CD best practices.

Oscilar builds an advanced AI Risk Decisioning Platform that helps banks, fintechs, and digital organizations manage fraud, credit, and compliance risk. The team includes industry veterans from Meta, Uber, Citi, and Confluent, and operates with a mission-driven culture emphasizing ownership and innovation.

  • Partner with Sales to support strategic enterprise opportunities, leading technical discovery and solution design.
  • Help customers evaluate Firefox Enterprise for deployment, policy management, identity integration, security controls, and operational fit.
  • Create reusable technical assets such as reference architectures, deployment guidance, and validation plans.

Mozilla Corporation is a non-profit-backed technology company that has shaped the internet for the past 25 years, known for the privacy-focused Firefox browser used by over 225 million people monthly. Wholly owned by the non-profit Mozilla Foundation and accountable to its mission, they build open-source software that empowers users, alongside thousands of contributors.

  • Build and operate the internal engineering platform that provides application engineers with the tools, systems, and Kubernetes clusters they need to deploy and run their workloads.
  • Focus on cloud infrastructure, capacity management, security, engineering productivity, monitoring, and US Federal compliance across squads.
  • Participate in on-call rotations to ensure the health of the system and understand how people use our products.

Grafana Labs, the company behind the open observability cloud, is founded on the principles of open source, open standards, open ecosystems, and open culture. We are a 100% remote company with 1,600+ team members across 40+ countries, backed by leading investors including Lightspeed Venture Partners, Sequoia Capital, GIC, Coatue, J.P. Morgan, CapitalG, and Lead Edge Capital.

  • Lead the design, development, and implementation of microservices for the next-generation enterprise campaign management platform.
  • Architect scalable and reliable solutions while collaborating with cross-functional teams including Product Managers and Quality Engineers.
  • Provide technical guidance, conduct code reviews, and stay current with emerging technologies to ensure high code quality and performance.

Miratech is a global IT services and consulting company that helps visionaries change the world. With nearly 1,000 full-time professionals and over 25% annual growth, the company operates across 5 continents and 25+ countries, maintaining a culture of Relentless Performance with a 99% project success rate since 1989.

  • Lead and manage a team of engineering managers and software engineers, supporting their growth and performance through regular feedback and coaching.
  • Create a long-term technical roadmap, establish OKRs, and drive continuous improvement in engineering processes.
  • Collaborate across teams to ensure technical sustainability, manage priorities, and build a strong engineering culture.

Affirm is reinventing credit to make it more honest and friendly, giving consumers the flexibility to buy now and pay later without any hidden fees or compounding interest. They are a large remote-first financial technology company with a focus on innovation and people.