Build and operate the internal engineering platform that provides application engineers with the tools, systems, and Kubernetes clusters they need to deploy and run their workloads.
Focus on cloud infrastructure, capacity management, security, engineering productivity, monitoring, and US Federal compliance across squads.
Participate in on-call rotations to ensure the health of the system and understand how people use our products.
Grafana Labs, the company behind the open observability cloud, is founded on the principles of open source, open standards, open ecosystems, and open culture. We are a 100% remote company with 1,600+ team members across 40+ countries, backed by leading investors including Lightspeed Venture Partners, Sequoia Capital, GIC, Coatue, J.P. Morgan, CapitalG, and Lead Edge Capital.
Work with your team to deliver new functionality, then use results to iterate and improve.
Take an active role in influencing our roadmap and your career objectives.
Mentor and support other team members, participate in design discussions, and collaborate with the team.
Grafana Labs is the company behind the open observability cloud, Grafana Cloud, built on open-source principles. With over 1,600 team members across 40+ countries, we foster a global collaborative culture backed by leading investors.
Own and operate 100+ multi-cloud streaming clusters and related database infrastructure in production.
Diagnose and eliminate cross-layer failure modes such as object storage latency, noisy neighbors, and query performance regressions.
Design safe upgrade and rollout strategies at scale, improving observability, automation, and operational ergonomics.
Grafana Labs is the company behind the open observability cloud, providing a fully managed observability platform built for scale. With over 35 million users and 7,000+ customers, we are a 100% remote company of 1,600+ team members across 40+ countries, backed by leading investors.
Design, build, and operate distributed systems powering observability across ClickHouse Cloud.
Own reliability, performance, and cost-efficiency of the telemetry pipeline and storage systems.
Take part in the on-call rotation and drive root-cause resolution and long-term fixes.
ClickHouse is a real-time analytics and data warehousing company recognized on the 2025 Forbes Cloud 100 list. With over 3,000 customers and rapid growth, the company fosters an innovative and fast-paced culture.
Design and implement high-quality, scalable integrations for observability solutions.
Collaborate with cross-functional teams to deliver features aligned with product strategy.
Participate in on-call rotations and contribute to open-source communities.
Grafana Labs provides an open-source observability platform, Grafana Cloud, that integrates metrics, logs, and traces. With over 1,600 team members across 40+ countries, they maintain a remote-first, collaborative culture backed by leading investors.
Act as a trusted technical partner, guiding organizations through onboarding, implementation, and expansion with white-glove support and best practices.
Deliver high-impact training, jumpstart engagements, and provide tailored technical consulting to help customers succeed.
Identify recurring issues, monitor support needs, and advocate for product improvements in close collaboration with internal teams.
Grafana Labs is the company behind Grafana, the open observability platform. With over 1,600 team members across 40+ countries, we are a 100% remote company backed by leading investors and trusted by more than 35 million users and 7,000+ customers.
Define and own the technical architecture for the Identity squad’s authentication systems, including edge authentication redesign.
Lead the design and implementation of large-scale, multi-quarter initiatives spanning multiple services and teams.
Mentor and develop senior engineers, providing technical guidance and support for their professional growth.
Grafana Labs is the company behind the open observability cloud, providing a fully managed platform for monitoring and analyzing data. With over 1,600 team members across 40+ countries, we are a 100% remote company fostering a collaborative, open-source culture.
Define and lead the end-to-end observability strategy covering logging, metrics, tracing, and alerting.
Architect and evolve a unified observability platform ensuring scalability and reliability.
Build and lead a high-performing observability engineering team with strong technical standards.
The company operates a high-scale developer-facing platform focused on reliability and performance. It is a remote-first organization with a globally distributed engineering team committed to building best-in-class developer infrastructure.
Anticipate the needs of the Solutions Engineering team and support it by designing technical presentations, demos, and white papers.
Create and deliver training materials, product workshops, and webinars for internal teams and customers.
Partner with Product, Marketing, and Engineering to enable the field with deep technical expertise and strategic support.
Grafana Labs is the company behind the open-source observability platform, providing a fully managed cloud service for monitoring and analytics. With over 1,600 team members across 40+ countries, they foster a global collaborative culture rooted in open source, transparency, and autonomy.
Partner with product engineering squads to own production reliability for high-SLA customer environments, designing automation and defining per-tenant SLOs.
Serve as a primary escalation point for incidents, leading response, post-incident reviews, and reducing SLO burn to prevent repeats.
Influence feature design for scalability and operability, improve alert quality, and eliminate toil through automation.
Grafana Labs is the company behind the open observability cloud, providing a fully managed observability platform for organizations to see, understand, and act on their data. With over 35 million users, 7,000+ customers, and 1,600+ team members across 40+ countries, we foster a remote, collaborative culture rooted in open-source values.
Design and operate enterprise-grade observability platforms across metrics, logs, traces, and events.
Build scalable monitoring stacks with Prometheus, Grafana, Loki, Tempo, OpenTelemetry, and Datadog.
Define SLOs, SLIs, error budgets, and alerting strategies to ensure system reliability.
Our partner is a technology company focused on building scalable observability platforms for distributed systems. They are an engineering-driven organization with a strong emphasis on automation, scalability, and developer experience.
Partner with the Sales team to articulate the Grafana value proposition and own technical engagements with customers.
Deliver product presentations, technical evaluations, and enablement to drive customer success and close opportunities.
Collaborate with internal teams to enhance documentation and blog posts, and provide feedback on products and the competitive landscape.
Grafana Labs is the company behind the open observability cloud, providing a fully managed observability platform built for scale. With over 1,600 team members across 40+ countries and 7,000+ customers, we are a 100% remote company backed by leading investors.
Lead the design, development, and operation of large-scale, secure observability systems to keep services online and performant.
Deploy and scale Prometheus, Elasticsearch clusters, and high-throughput Kafka data pipelines for millions of customer devices.
Collaborate with the Observability team to build alerting systems, APIs, and self-service monitoring tools using Terraform and multiple languages.
ItD is a new-generation consulting and software development company that blends diversity, innovation, and integrity with real business results. It is a woman- and minority-led firm with a global community, empowering employees and offering benefits like medical, dental, vision, 401(k), and career development.
Design and build large-scale distributed systems and high-throughput data pipelines using Go and cloud-native technologies.
Lead system-wide architectural decisions focusing on data flow, performance, and resilience.
Champion best engineering practices, code quality, testing, and maintainability while mentoring junior engineers.
DoiT is a global technology company that helps organizations leverage the cloud for business growth, combining data, technology, and human expertise. With thousands of customers worldwide, DoiT fosters a remote-first culture that values entrepreneurship, knowledge pursuit, and fun.
Set technical direction for the Athena clearing house, making architectural calls on data validation pipelines and workflow orchestration.
Scale the team and product area, driving the transition from rapid prototyping to a sustainable, production-grade product stack.
Lead design of systems processing unstructured vulnerability reports, deduplicating findings, and surfacing clean signals to remediation teams.
Chainguard secures the open source software supply chain, delivering hardened, production-ready builds of open source software. They are venture-backed by leading investors and serve Fortune 500 enterprises and global industry leaders.
Own an end-to-end domain within the clearing house: customer onboarding, entitlements, or data validation.
Drive architecture and implementation of backend systems in Go on GCP, ensuring production readiness.
Establish engineering best practices and collaborate with the principal engineer on technical planning.
Chainguard secures the open source software supply chain by providing hardened, secure builds of open source software. It is a venture-backed startup with a remote-first culture, trusted by Fortune 500 enterprises.
Lead technical conversations with developers and architects to understand systems and business objectives.
Build tailored demos and proof-of-concepts that showcase Temporal's solutions for customer problems.
Deliver enablement sessions and create reusable technical assets to accelerate the sales cycle.
Temporal is an open source programming model that simplifies code and makes applications more reliable. The company is a growing, values-driven team with a culture of curiosity, collaboration, and humility.
Collaborate proactively with a globally distributed team to write, test, and document high-quality code.
Debug issues and interact with a vibrant community, reviewing code produced by other engineers.
Attend conferences to represent Canonical and the Charmed Observability Stack, with 2 to 4 weeks of travel per year.
Canonical is a leading provider of open source software and operating systems, known for Ubuntu. The company is a pioneer of global distributed collaboration, with 1,200+ colleagues in 75+ countries, and is founder-led, profitable, and growing.
Design and build the control plane for provisioning, scaling, and maintaining Postgres clusters.
Develop high availability, disaster recovery, and data protection mechanisms for production systems.
Build automation for database operations and contribute to distributed, fault-tolerant systems.
PlanetScale builds a next-generation managed database platform powering mission-critical applications at global scale. They are a remote-first engineering team with a collaborative culture focused on technical excellence and knowledge sharing.