Provide and own automation of the provisioning of CSP resources, including networking, Kubernetes clusters and specific CSP resources required by our application teams.
Work with users (Grafana Cloud application teams) to help understand their needs and ensure investment in the right capabilities.
Participate in the Platform department Infrastructure wing on-call rotation.
Grafana Labs is a remote-first, open-source powerhouse with more than 20M users of Grafana around the globe. The team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything that they do.
Design and implement high-quality, scalable services to be consumed by multiple Grafana Cloud products.
Support the technical direction and vision of the team, contributing to strategic discussions and future development of observability solutions
Be a part of your team’s follow-the-sun on-call rotations and take ownership of the services you’re running
Grafana Labs is a remote-first, open-source powerhouse that provides the leading open source visualization tool. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, which can be run fully managed with Grafana Cloud or self-managed with the Grafana Enterprise Stack. The team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything they do.
Design and implement new features that make our container images even more secure, performant, and scalable.
Develop robust internal systems that scale with us—everything from build pipelines to container registries to custom tooling that only we could dream up.
Define standards, write tests, and ship tooling that helps us stay high-quality and high-velocity at the same time.
Chainguard is the secure foundation for software development and deployment. They provide guarded open source software, have built the largest library of open source software that is secure by default, and focus on hiring “Chainguardians” with unique backgrounds, perspectives, and experiences.
Contribute to the core product, working across the stack on services that power their applications.
Design and refine technical systems, sharing ownership of customer use cases and the systems that power them.
Work directly with customers, providing pointers to documentation and debugging issues.
Humanitec is leading the transformation in how enterprises build and run their cloud-native setups, helping teams build Internal Developer Platforms (IDPs) that unlock true developer self-service. They value humility, drive, and intelligence in their fully remote team.
Develop, test, and deploy code and configuration to support internal deployment tooling infrastructure
Write tickets, spikes, and runbooks for the team, as well as internal product documentation for Twilio
Operationalize Harness as a unified deployment platform, supporting standardized pipeline templates
Twilio is shaping the future of communications. They deliver innovative solutions to hundreds of thousands of businesses and empower millions of developers worldwide. Twilio has a strong culture of connection and global inclusion, making a global impact each day.
Make deployments boring (in the best way possible)
Own CI/CD pipelines: optimize build times, improve caching, reduce flakiness
Evolve our Kubernetes (EKS) deployment strategy for reliability and speed
Obvious is building an AI-native workspace, an operating system for work that puts co-intelligence at the center. They are a small and talent-dense team with world-class builders, former founders, and leaders from companies like Netflix, Google, and Meta.
Maintain the Field Engineering infrastructure, including the pre-sales Demo Kit application and infrastructure.
Design, develop, and deliver compelling product demos to add to the demo kit library.
Create and deliver training materials and product workshops to the SEs, customers, and the community.
Grafana Labs is a remote-first, open-source powerhouse whose open source visualization tool has more than 20M users. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack and thrive in an innovation-driven environment.
Partner closely with product engineering squads (embedded model)
Own production reliability for high-SLA and complex customer environments
Design and implement automation to scale our reliability practices
Grafana Labs is a remote-first, open-source powerhouse that helps more than 3,000 companies manage their observability strategies. They are scaling fast and staying true to what makes them different: an open-source legacy, a global collaborative culture, and a passion for meaningful work.
Cultivate and advocate for standard DevOps practices.
Implement standards to meet security requirements.
Provide DevOps services to alleviate the workload of development teams.
Appfire creates software that empowers teams to break silos and collaborate seamlessly. They are a remote-first company with 850+ employees across 28 countries, fostering an environment where everyone is respected.
Lead end-to-end delivery of large, cross-functional projects.
Own architecture, reliability, performance and cost for critical systems.
Grafana Labs provides an open source observability platform that integrates metrics, logs, traces, and profiles with Grafana. They have a global collaborative culture and a passion for meaningful work. Their team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything they do.
Own the entire Laboratory Operations Software release process execution, ensuring smooth and timely software releases with minimal downtime.
Act as an internal consultant and subject matter expert, coaching individual product teams on best-in-class DevOps practices.
Continuously improve and automate infrastructure provisioning, configuration management, application deployment, and testing using tools like Terraform, Kubernetes and CI/CD.
Natera is a global leader in cell-free DNA (cfDNA) testing, dedicated to oncology, women’s health, and organ health, aiming to make personalized genetic testing standard. The Natera team consists of statisticians, geneticists, doctors, laboratory scientists, business professionals, software engineers, and many other professionals from world-class institutions, who care deeply for the work and each other.
Extend and automate the existing container orchestration platform, ensuring its scalability, reliability, and performance
Work closely with SREs from different teams to reduce their cognitive load related to the orchestration platform
Implement and maintain security best practices for the orchestration platform, ensuring the security and availability of our systems
Kraken is a mission-focused company rooted in crypto values. They aim to accelerate the global adoption of crypto, so that everyone can achieve financial freedom and inclusion. As a fully remote company, Kraken has employees in 70+ countries who speak over 50 languages.
Partner with engineers to build dev tools that empower developer workflows and deployment infrastructure.
Ensure reliability of multi-cloud Kubernetes clusters and pipelines.
Implement metrics, logging, analytics, and alerting for performance and security across all endpoints and applications.
Cresta is on a mission to turn every customer conversation into a competitive advantage by unlocking the true potential of the contact center. Their platform combines the best of AI and human intelligence to help contact centers discover customer insights and behavioral best practices.
Contribute to building and operating the infrastructure that supports the HackerOne platform.
Improve the reliability, security, and scalability of our systems.
Design and operate highly available cloud systems and apply best practices for reliability, observability, and security.
HackerOne is a global leader in Continuous Threat Exposure Management (CTEM). The HackerOne Platform unites agentic AI solutions with the ingenuity of the world’s largest community of security researchers to continuously discover, validate, prioritize, and remediate exposures across code, cloud, and AI systems. The platform is trusted by the world’s top organizations.
Design, deploy and maintain a cloud infrastructure to support a Dataiku SaaS offering, mainly on AWS, Azure, and GCP
Continuously improve the infrastructure, deployment and configuration to deliver more reliable, resilient, scalable and secure services
Automate all technical operations as much as possible
Dataiku is The Universal AI Platform™, giving organizations control over their AI talent, processes, and technologies to unleash the creation of analytics, models, and agents. They connect many data science technologies and integrate the best of data and AI tech.
Ensure we continuously deliver and improve our infrastructure platform
Guide the team through a massive reduction of manual work by employing an AI-driven engineering approach. You will reach a zero-touch, scalable infrastructure where AI handles dependency management and assists with incident resolution, cloud cost optimization, and developer support, ensuring the team can handle increasing scope and scale without having to grow in size.
Quality surge: You will prepare, define and ensure execution of the strategy for our CI/CD and artifact distribution systems to scale without increasing engineering toil
Camunda is the leader in enterprise agentic automation, orchestrating complex business processes across agents, people, and systems. As a fully remote, global company, they're rewriting the rules of modern business and growing fast, looking for top talent to join their team.
Work with your team to build and roll out new features, then use the results to iterate and improve.
Drive projects from initial ideation all the way to operations once they are in the hands of customers.
Maintain critical systems, and own their reliability, performance, and availability.
Grafana Labs is a remote-first, open-source powerhouse with over 20M users. They provide observability strategies for over 3,000 companies, featuring scalable metrics, logs, and traces, and thrive in an innovation-driven environment with transparency, autonomy, and trust.
Collaborate with teams to design innovative services and features.
Develop robust tools and services to improve the image build system as it scales.
Own high-impact, deeply technical components of the Chainguard stack.
Chainguard is the secure foundation for software development and deployment. Founded by industry experts, they provide guarded open source software, built from source and updated continuously, and have built the largest library of open source software that is secure by default.
Design, operate, and continuously improve the cloud infrastructure that powers our systems using infrastructure-as-code, monitoring, and observability.
Own critical parts of the platform: identify bottlenecks, propose and implement improvements, and drive reliability and performance at scale.
Run Kubernetes in production and evolve how we operate it.
Dune is on a mission to make crypto data accessible. They’re a collaborative multi-chain analytics platform used by thousands of developers, analysts, and investors to understand the on-chain world and the frontiers of finance. They are a team of ~60 employees working together across European and eastern US time zones.
Extend and improve our container-based infrastructure running in Kubernetes, and continue to automate away manual processes.
Partner with your colleagues from engineering to embrace the idea of self-service tooling, making the best way the easiest way.
Contribute to resilient CI/CD pipelines and processes so that our engineering team can deploy features faster while remaining compliant with regulations.
Ada envisions a world where everyone gets the healthcare they need, using AI to help people get answers faster. With a team of physicians and clinical scientists, Ada identifies those at risk and guides them to the right care to transform healthcare and ensure no one goes undiagnosed.