Source Job

Global Unlimited PTO

  • Design and implement the CRD API, network configuration patterns, and observability integrations for a Kubernetes operator in a production Rust environment.
  • Harden the operator's security posture and ensure it is deployable across all major enterprise Kubernetes distributions, including air-gapped and OpenShift environments.
  • Build operational tooling for support at scale and partner with customer-facing teams to understand and reflect real-world enterprise deployment requirements.

Rust, Kubernetes, DevOps, Distributed Systems, Security

20 jobs similar to Senior Software Engineer - Kubernetes Operator

Jobs ranked by similarity.

Spain 6w PTO

  • Operating and evolving 100+ multi-cloud streaming clusters and related database infrastructure.
  • Diagnosing and eliminating cross-layer failure modes.
  • Designing safe upgrade and rollout strategies at scale.

Grafana Labs is a remote-first, open-source powerhouse with over 20M users of Grafana, its open source visualization tool. Grafana Labs helps more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, and its team thrives in an innovation-driven environment.

Europe 5w PTO

  • Define and drive the roadmap for deployment, configuration, infrastructure, and operational tooling across cloud and on-premise environments.
  • Work closely with engineering, design, customer-facing teams, and customers to identify and resolve deployment friction.
  • Improve how enterprise customers install, configure, upgrade, secure, and operate Rasa in production.

Rasa is a leader in generative conversational AI, enabling enterprises to build and deliver next-level AI assistants. The company was founded in 2016 and is remote-first with a global presence.

Global

  • Design and implement infrastructure and tools that empower our product teams to rapidly and securely iterate, emphasizing reliability and automation.
  • Influence the strategic direction of our infrastructure and operational practices, ensuring that we are well-positioned to scale and support our growing organization.
  • Take a proactive role in the resolution of production issues, ensuring that we are well-prepared to handle incidents and that we learn from them in a blameless manner.

SSV Labs is the core team behind the SSV Network - pioneering decentralized infrastructure for Ethereum staking. They are building tools, protocols, and standards to make staking more secure, scalable, and trustless.

$65,000/yr
Europe

  • Build and develop our operator-based platform on Kubernetes.
  • Work on existing operators and design new ones as we extend the platform.
  • Create self-service solutions across multiple Kubernetes clusters.

REWE Group Austria's IT department develops innovative IT products and services for its corporate divisions in Austria and abroad, setting the tone for modern trade. They have over 700 employees.

Global

  • Design, architect, implement, review, and test frameworks, libraries, tools, and services primarily using Go.
  • Participate in requirement, design, planning, and retrospective meetings as an integral part of an Agile software development team.
  • Contribute to successful sprints by implementing items contributing to overall team goals.

Mirantis is the Kubernetes-native AI infrastructure company, enabling organizations to build and operate scalable, secure, and sovereign infrastructure for modern AI, machine learning, and data-intensive applications. By combining open source innovation with deep expertise in Kubernetes orchestration, Mirantis empowers platform engineering teams to deliver composable, production-ready developer platforms across any environment.

US India

  • Operate and improve platform tools so product teams can ship reliably, triaging tickets, fixing build issues, and handling routine service requests.
  • Maintain and extend self-service workflows by updating docs, examples, and guardrails under guidance from senior engineers.
  • Perform day-to-day Kubernetes operations: deploy/update Helm charts, manage namespaces, diagnose rollout issues, and follow runbooks for incident response.

ISHIR is a digital innovation and enterprise AI services provider. They work with startups and enterprises to shape the future through accelerated innovation, deep technical expertise, access to global digital talent and a passion for complex problem-solving. ISHIR attracts proactive individuals who thrive on challenges and promote self-reliance, open communication, and collaboration.

Global

  • Design and implement a shared core layer used across desktop, mobile, and backend.
  • Contribute to architecture decisions around CRDT and distributed systems.
  • Collaborate closely with frontend, mobile, and backend engineers to integrate Rust components.

Termius is redefining how engineers interact with remote systems. Millions of engineers and thousands of companies rely on Termius worldwide. As a leading cross-platform SSH client, their mission is to boost productivity and foster collaboration by rebuilding the Terminal for the modern era.

$160,000–$200,000/yr
US Unlimited PTO

  • Maintain, optimize, and enhance on-premises and cloud computing environments.
  • Execute technical aspects of implementation projects, ensuring seamless software integration and customization.
  • Automate Infrastructure-as-Code (IaC) to manage virtual machines and deploy containers, services, and other infrastructure.

Striveworks helps organizations harness AI to solve national security and business challenges, acting as a command center for data and models. Founded by data scientists and engineers, they aim to simplify the deployment and optimization of AI systems, ensuring reliability and scalability.

Global

  • Design and implement high-throughput, low-latency services and libraries in Rust and TypeScript.
  • Manage the full lifecycle of on-chain wallets across blockchains, from derivation to ownership, at the scale of a global exchange.
  • Build and maintain the systems that ensure liquidity for clients while maintaining strict compliance with regulatory requirements.

Kraken is a mission-focused company rooted in crypto values, aiming to accelerate the global adoption of crypto for financial freedom and inclusion. As a fully remote company with Krakenites in 70+ countries, they develop premium crypto products for experienced traders, institutions, and newcomers.

$151,038–$234,109/yr
US

  • Design, implement, and maintain our organization's cloud infrastructure, including CI/CD pipelines, automation tools, and monitoring systems in AWS.
  • Work closely with development teams to ensure that our applications are reliable and scalable in a secure and compliant manner.
  • Own the deployment and maintenance of Kubernetes clusters, ensuring efficient resource utilization, scalability, and high availability.

Triumph is a financial and technology company focused on payments, factoring, intelligence and banking, pioneering solutions for the transportation industry. As a member of the Triumph team, you’re at the heart of an innovative, forward-thinking company that values collaboration, creativity and continuous learning.

$180,000–$240,000/yr
US

  • Develop and maintain a multi-platform implant written in Rust.
  • Build and extend C2 server infrastructure, task dispatch, and communications protocols.
  • Research and implement AV/EDR evasion techniques to keep tooling operational against modern defenses.

Horizon3.ai is a remote cybersecurity company dedicated to enabling organizations to proactively find, fix, and verify exploitable attack vectors. They are a fusion of former U.S. Special Operations cyber operators and startup engineers committed to solving common security problems.

Australia 5w PTO

  • Own the design, deployment and operation of OpenStack and Kubernetes environments.
  • Build and improve infrastructure using infrastructure-as-code and GitOps practices.
  • Optimise GPU workload scheduling using Kubernetes and NVIDIA tooling.

NexGen Cloud is building next-generation GPU cloud infrastructure, and is the company behind Hyperstack, a high-performance cloud platform designed for compute-intensive workloads. We're a scale-up by design, solving complex infrastructure challenges at pace, with real-world impact.

Canada EMEA Unlimited PTO

  • Evolve Argo CD GitOps standards across environments.
  • Build reusable Terraform modules and practices for safe, repeatable cloud infrastructure provisioning and drift detection.
  • Lead the operation and evolution of production-grade Kubernetes clusters across cloud environments.

GitLab is the intelligent orchestration platform for DevSecOps. More than 50 million registered users and more than 50% of the Fortune 100 trust GitLab to ship better, more secure software faster.

$150,000–$180,000/yr
US

  • Own the architecture, development, and operation of scalable, secure, and fault-tolerant cloud services.
  • Drive technical design and architectural decisions for distributed systems, influencing patterns, standards, and long-term platform evolution.
  • Lead complex initiatives end-to-end, from design through deployment and ongoing optimization.

ExtraHop is a company focused on reinventing Network Detection and Response (NDR) to offer enterprises unparalleled visibility, context, and control against emerging threats. They integrate NDR with Network Performance Management (NPM), Intrusion Detection Systems (IDS), and forensics, providing a single, comprehensive solution.

US

  • Design, build, and operate core cloud infrastructure across compute, storage, databases, and networking layers.
  • Own and improve the reliability, scalability, and security of Valon’s production systems as we scale to support major enterprise deployments.
  • Evaluate, adopt, and operationalize new infrastructure technologies (e.g., Vitess, ClickHouse, Redis) to meet evolving product and scale requirements.

Valon is building the AI-native operating system for regulated finance, starting with mortgage servicing. They are a Series C company backed by a16z, transforming industries that others have written off as too complex to innovate.

Europe

  • Design, build, and maintain scalable, highly available and fault-tolerant infrastructures.
  • Implement and improve monitoring, alerting, and incident response systems to ensure optimal system performance and minimize downtime.
  • Drive continuous improvement in infrastructure automation, deployment, and orchestration.

Mistral AI is dedicated to democratizing AI through high-performance, optimized, open-source models, products, and solutions designed to integrate seamlessly into daily working life. They are a dynamic, collaborative team, dedicated to innovation and passionate about AI and its potential to transform society.

Global

  • Build security tools and controls that are deployed across the company.
  • Design, develop, and deploy new core security features to public Chainlink products like the Chainlink core node.
  • Define new processes and systems that make attacks on our networks hard to execute and easy to detect.

Chainlink Labs is the industry-standard oracle platform bringing the capital markets onchain and powering the majority of decentralized finance (DeFi). Many of the world’s largest financial services institutions have also adopted Chainlink’s standards and infrastructure.

US

  • Rackner is seeking a DevSecOps (Kubernetes) Engineer SME to support a US Air Force program called Platform One.
  • Big Bang provides the tooling for mission application owners to create a Platform as a Service in their own Kubernetes cluster running in a cloud or datacenter.
  • We're looking for a DevSecOps Engineer who has deep experience in Kubernetes, Terraform and CI/CD Pipelines to join our team.

Rackner is a software consultancy that builds cloud-native solutions for startups, enterprises, and the public sector. They are an energetic, growing consultancy with a passion for solving big problems for both startups and enterprises.

Global

  • Build scalable, high-quality applications using Rust.
  • Work on cutting-edge data and cloud technologies.
  • Contribute to a fully remote and globally distributed team.

OpenObserve is a rapidly expanding open-source observability platform. We unify logs, metrics, and traces into one deployable system and are supported by leading investors.

Europe 5w PTO

  • Work with other Engineering teams to design sustainable infrastructure and microservice solutions.
  • Automate tools and infrastructure to reduce manual work.
  • Monitor applications and participate in an on-call rotation as required.

Bloomreach is building the world’s premier agentic platform for personalization, revolutionizing how businesses connect with their customers by building and deploying AI agents to personalize the entire customer journey. They power personalization for more than 1,400 global brands.