20 jobs similar to Senior Software Engineer - Grafana Cloud Observability Provider

Jobs ranked by similarity.

Canada Unlimited PTO

  • Design, build, and operate distributed systems powering observability across ClickHouse Cloud.
  • Own reliability, performance, and cost-efficiency of the telemetry pipeline and storage systems.
  • Take part in on-call rotation and drive root-cause resolution and long-term fixes.

ClickHouse is a real-time analytics and data warehousing company recognized on the 2025 Forbes Cloud 100 list. With over 3,000 customers and rapid growth, the company fosters an innovative and fast-paced culture.

UK

  • Design, build, and scale backend services powering a large-scale observability platform, including telemetry ingestion, storage systems, query engines, and alerting pipelines.
  • Develop and optimize distributed systems that process logs, metrics, and traces at high volume with a strong focus on reliability and performance.
  • Collaborate with cross-functional engineering teams to improve system architecture, scalability, and developer experience.

United States 6w PTO

  • Build and operate the internal engineering platform that provides application engineers with the tools, systems, and Kubernetes clusters they need to deploy and run their workloads.
  • Focus on cloud infrastructure, capacity management, security, engineering productivity, monitoring, and US Federal compliance across squads.
  • Participate in on-call rotations to ensure the health of the system and understand how people use our products.

Grafana Labs, the company behind the open observability cloud, is founded on the principles of open source, open standards, open ecosystems, and open culture. We are a 100% remote company with 1,600+ team members across 40+ countries, backed by leading investors including Lightspeed Venture Partners, Sequoia Capital, GIC, Coatue, J.P. Morgan, CapitalG, and Lead Edge Capital.

US Canada 6w PTO

  • Earning the trust of our large-scale operator customers to further Grafana's "big tent" philosophy of data accessibility and to meet clear business objectives.
  • Designing and leading the development of backend services, distributed systems, and enterprise features at scale.
  • Driving continuous improvement of our engineering culture through words and actions.

Grafana Labs is a remote-first, open-source powerhouse with more than 20M users of Grafana, the open source visualization tool, around the globe. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, which can be run fully managed with Grafana Cloud or self-managed with the Grafana Enterprise Stack. The Grafana team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything they do.

Germany 6w PTO

  • Anticipate the needs of the Solutions Engineering team and support it by designing technical presentations, demos, and white papers.
  • Create and deliver training materials, product workshops, and webinars for internal teams and customers.
  • Partner with Product, Marketing, and Engineering to enable the field with deep technical expertise and strategic support.

Grafana Labs is the company behind the open-source observability platform, providing a fully managed cloud service for monitoring and analytics. With over 1,600 team members across 40+ countries, they foster a global collaborative culture rooted in open source, transparency, and autonomy.

Germany 6w PTO

  • Own and operate 100+ multi-cloud streaming clusters and related database infrastructure in production.
  • Diagnose and eliminate cross-layer failure modes such as object storage latency, noisy neighbors, and query performance regressions.
  • Design safe upgrade and rollout strategies at scale, improving observability, automation, and operational ergonomics.

Grafana Labs is the company behind the open observability cloud, providing a fully managed observability platform built for scale. With over 35 million users and 7,000+ customers, we are a 100% remote company of 1,600+ team members across 40+ countries, backed by leading investors.

$116,449–$139,531/yr
Europe 6w PTO

  • Take an active role in influencing our roadmap and your own career objectives.
  • Drive projects from initial ideation all the way to operations once they are in the hands of customers.
  • Design, build, operate, and maintain critical systems, owning the reliability, performance, and availability.

Grafana Labs is behind the open observability cloud, and is founded on the principles of open source, open standards, open ecosystems, and open culture. They are a 100% remote company with 1,600+ team members across 40+ countries.

UK

  • Lead the design, development and operation of large-scale, secure observability systems to keep services online and performant.
  • Deploy and scale Prometheus, Elasticsearch clusters, and high-throughput Kafka data pipelines for millions of customer devices.
  • Collaborate with the Observability team to build alerting systems, APIs, and self-service monitoring tools using Terraform and multiple languages.

ItD is a new generation consulting and software development company that blends diversity, innovation, and integrity with real business results. It is a woman- and minority-led firm with a global community, empowering employees and offering benefits like medical, dental, vision, 401(k), and career development.

UK Germany Spain Ireland Sweden 6w PTO

  • Work with your team to deliver new functionality, then use results to iterate and improve.
  • Take an active role in influencing our roadmap and your career objectives.
  • Mentor and support other team members, participate in design discussions, and collaborate with the team.

Grafana Labs is the company behind the open observability cloud, Grafana Cloud, built on open-source principles. With over 1,600 team members across 40+ countries, we foster a global collaborative culture backed by leading investors.

UK 6w PTO

  • Act as a trusted technical partner, guiding organizations through onboarding, implementation, and expansion with white-glove support and best practices.
  • Deliver high-impact training, jumpstart engagements, and provide tailored technical consulting to help customers succeed.
  • Identify recurring issues, monitor support needs, and advocate for product improvements in close collaboration with internal teams.

Grafana Labs is the company behind Grafana, the open observability platform. With over 1,600 team members across 40+ countries, we are a 100% remote company backed by leading investors and trusted by more than 35 million users and 7,000+ customers.

US

  • Design and operate enterprise-grade observability platforms across metrics, logs, traces, and events.
  • Build scalable monitoring stacks with Prometheus, Grafana, Loki, Tempo, OpenTelemetry, and Datadog.
  • Define SLOs, SLIs, error budgets, and alerting strategies to ensure system reliability.

Our partner is a technology company focused on building scalable observability platforms for distributed systems. They are an engineering-driven organization with a strong emphasis on automation, scalability, and developer experience.

  • Co-own the architecture of cloud infrastructure on Azure and Kubernetes clusters for high throughput and availability.
  • Drive resilience strategy for global scaling, zero-downtime deployments, and disaster recovery.
  • Evolve observability stack with LGTM (Loki, Grafana, Tempo, Mimir) and lead incident response.

Flip is an AI-powered employee experience platform for frontline workers in retail, manufacturing, and logistics. It is a young, rapidly growing tech company with a remote-first culture and offices in Berlin and Stuttgart.

Global

  • Make high-quality, data-driven decisions on building the next generation of our production platform and deliver results.
  • Own how we test, build, and deploy code in a high-scale PaaS environment, collaborating across the company on design and technology choices.
  • Blaze a trail as part of a small platform engineering team, driving reliability practices and directly influencing what we work on and how we work.

Astronomer empowers data teams to bring mission-critical software, analytics, and AI to life with its unified DataOps platform Astro, built on Apache Airflow. Trusted by over 800 enterprises, the company is a leader in data orchestration and innovation.

US Unlimited PTO

  • Provide frontline technical expertise to help developers deploy and scale Temporal in cloud-native environments.
  • Troubleshoot complex infrastructure issues, optimize performance, and develop automation solutions.
  • Collaborate with engineering and product teams to influence platform improvements and enhance developer experience.

Temporal provides an open source programming model that simplifies code and makes applications more reliable. The company is a growing team driven by values of curiosity, collaboration, and humility, focused on improving developer experience.

US Canada UK Unlimited PTO 18w maternity 12w paternity

  • Own moderately complex backend features and services in Go on GCP end-to-end from design through production.
  • Write clean, tested, production-ready code and improve it continuously.
  • Contribute to infrastructure, observability tooling, and on-call preparation.

Chainguard is the trusted source for open source software, delivering hardened and secure builds to eliminate risk. The company is venture-backed by leading investors and serves Fortune 500 enterprises.

India

  • Design and deliver robust, high-scale routing experiences for Data Pipelines for Twilio Segment.
  • Operate always-available, complex distributed systems in cloud environments.
  • Collaborate cross-functionally with design, product, and other engineers to define solutions.

Twilio is shaping the future of communications, delivering innovative solutions to hundreds of thousands of businesses and empowering millions of developers worldwide. The company is remote-first with a strong culture of connection and global inclusion, and employs a diverse team of Twilions.

United States

  • Own and evolve observability strategy including monitoring, alerting, dashboards, logging, and distributed tracing.
  • Define and manage SLIs, SLOs, and reliability metrics, improving MTTD and MTTR through automation.
  • Build and maintain reliable cloud infrastructure on AWS and Kubernetes while mentoring engineers on SRE best practices.

Filevine is a Legal AI company delivering Legal Operating Intelligence for legal work. Fueled by a team of exceptional collaborators and innovators, Filevine’s rapid growth has earned AI awards and recognition from Deloitte and Inc. as one of the most innovative and fastest-growing technology companies in the country.

US Unlimited PTO

  • Design, develop, and maintain scalable, reliable, and secure software systems that support Humata's healthcare platform.
  • Build and enhance backend services, APIs, workflows, and distributed systems that process healthcare data at scale.
  • Collaborate with product managers, designers, clinicians, and other engineers to deliver impactful features and improvements.

Humata Health creates frictionless prior authorization for providers and payers through proprietary AI and automation technology. They are a physician-led company backed by strategic healthcare investors, focusing on improving patient care and operational efficiency in the healthcare ecosystem.

EMEA Americas

  • Collaborate proactively with a globally distributed team to write, test, and document high-quality code.
  • Debug issues and interact with a vibrant community, reviewing code produced by other engineers.
  • Attend conferences to represent Canonical and the Charmed Observability Stack, with 2 to 4 weeks of travel per year.

Canonical is a leading provider of open source software and operating systems, known for Ubuntu. The company is a pioneer of global distributed collaboration with 1,200+ colleagues in 75+ countries, is founder-led, profitable, and growing.

EMEA

  • Design and build large-scale distributed systems and high-throughput data pipelines using Go and cloud-native technologies.
  • Lead system-wide architectural decisions focusing on data flow, performance, and resilience.
  • Champion best engineering practices, code quality, testing, and maintainability while mentoring junior engineers.

DoiT is a global technology company that helps organizations leverage the cloud for business growth, combining data, technology, and human expertise. With thousands of customers worldwide, DoiT fosters a remote-first culture that values entrepreneurship, knowledge pursuit, and fun.