Source Job

US 6w PTO

  • Build and scale a strong culture of operational excellence by defining standards and coaching teams to own reliability and availability.
  • Drive mature DevOps/SRE practices, including incident response and PIRs, on-call readiness, runbooks, alerting, observability, and release/change management.
  • Guide teams in the design, development, evolution, and operation of large-scale, distributed cloud systems.

Python Go DevOps SRE Cloud

20 jobs similar to Staff Software Engineer - Grafana Cloud k6

Jobs ranked by similarity.

$116,943–$140,233/yr
UK 6w PTO

  • Design and implement high-quality, scalable services to be consumed by multiple Grafana Cloud products.
  • Support the technical direction and vision of the team, contributing to strategic discussions and future development of observability solutions.
  • Be a part of your team’s follow-the-sun on-call rotations and take ownership of the services you’re running.

Grafana Labs is a remote-first, open-source powerhouse that provides the leading open source visualization tool. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, which can be run fully managed with Grafana Cloud or self-managed with the Grafana Enterprise Stack. The team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything they do.

$113,082–$175,725/yr
Canada

  • Operate and maintain large-scale data systems, ensuring stability and performance.
  • Design, implement, and optimize deployment processes using virtualization.
  • Monitor system health, analyze failures, and identify instability sources.

Jobgether is a platform that uses AI-powered matching to connect candidates with companies. They ensure applications are reviewed quickly, objectively, and fairly, then share a shortlist of top candidates directly with the hiring company.

  • Maximize the velocity of our product engineering team.
  • Ensure platform scalability, reliability, and security.
  • Champion best practices and shape the engineering culture.

They are building a robust, scalable trading platform to serve high-traffic, latency-sensitive applications. They leverage state-of-the-art technologies to support real-time trading while providing unparalleled reliability and performance.

Global

  • Partner with engineers to build dev tools that empower developer workflows and deployment infrastructure.
  • Ensure reliability of multi-cloud Kubernetes clusters and pipelines.
  • Provide metrics, logging, analytics, and alerting for performance and security across all endpoints and applications.

Cresta is on a mission to turn every customer conversation into a competitive advantage by unlocking the true potential of the contact center. Their platform combines the best of AI and human intelligence to help contact centers discover customer insights and behavioral best practices.

$120,000–$180,000/yr
US

  • Develop automation code to provision and operate infrastructure at scale.
  • Build resilient, scalable, secure, and observable services with cost optimization.
  • Proactively identify and address security concerns across systems and infrastructure.

Globality uses AI to transform enterprise spending into a more efficient and inclusive process. They aim to revolutionize enterprise procurement with AI and have a culture built on trust, collaboration, and innovation, fostering an environment where every individual feels valued and included.

Europe 6w PTO

  • Partner closely with product engineering squads (embedded model)
  • Own production reliability for high-SLA and complex customer environments
  • Design and implement automation to scale our reliability practices

Grafana Labs is a remote-first, open-source powerhouse that helps more than 3,000 companies manage their observability strategies. They are scaling fast and staying true to what makes them different: an open-source legacy, a global collaborative culture, and a passion for meaningful work.

US 5w maternity

  • Support teammates with goal-setting, professional development, and mentoring.
  • Ensure delivery of maintainable, high-quality platform systems.
  • Build and sustain a healthy team culture where ownership and collaboration are the norm.

onX is a pioneer in digital outdoor navigation solutions through its suite of apps. With over 400 employees, they foster a fast-paced, tech-forward environment valuing ownership, accountability, and teamwork.

US Canada Europe Asia

  • Automate the provisioning of all of Juniper Square’s infrastructure in code.
  • Partner with our Platform Engineering team on building developer tooling / improving developer experiences via joint initiatives and enhancements.
  • Partner with our Data Engineering team on improving our data posture and driving operational excellence.

Juniper Square's mission is to unlock the full potential of private markets by digitizing them to bring efficiency, transparency, and access. They are a values-driven organization with a hybrid workplace strategy, allowing employees to collaborate effectively across multiple countries and offering physical offices in several major cities.

Europe 6w PTO

  • Design and implement high-quality, scalable integrations for various infrastructure components, applications, and data ingestion pipelines.
  • Create middleware components and libraries that simplify development and maintenance of observability solutions.
  • Lead the technical direction and vision of the team, contributing to strategic discussions and future development of observability solutions.

Grafana Labs is a remote-first, open-source powerhouse with more than 20M users of Grafana. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, featuring scalable metrics, logs, and traces, and thrive in an innovation-driven environment.

Americas

  • Work in Python and Golang to design and deliver open source software operations code
  • Shape high quality open source monitoring and alerting infrastructure
  • Grow a healthy, collaborative engineering culture in line with the company values

Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. As the company that publishes Ubuntu, one of the most important open-source projects and the platform for AI, IoT, and the cloud, it is changing the world of software. The company has 1200+ colleagues in 75+ countries and a globally distributed collaboration culture.

Nigeria

  • Detect and triage service and reliability issues.
  • Develop automation to eliminate manual and repetitive operational tasks.
  • Investigate and resolve customer complaints escalated beyond L1 and L2 support.

Moniepoint is an all-in-one financial services platform for emerging markets. Since 2019, Moniepoint’s technology has powered over 3 million people, offering personal and business banking, payment, credit and business management tools to help them succeed.

LATAM

  • Monitor production systems, dashboards, logs, and alerts to ensure high availability and performance across distributed environments.
  • Assist in incident detection, triage, escalation, and resolution, following structured on-call rotations with mentorship support.
  • Maintain, follow, and continuously improve runbooks, operational procedures, and incident response workflows.

Jobgether is a platform that helps job seekers find the right opportunities. They use an AI-powered matching process to ensure applications are reviewed quickly and fairly.

$150,000–$167,000/yr
US

  • Lead reliability-focused design and readiness reviews.
  • Build, operate, and continuously improve our observability stack.
  • Own and evolve incident management practices.

Transcend is building the privacy platform that easily embeds privacy into your entire tech stack. They are growing quickly, backed by top-tier investors and are proud to serve some of the world's most iconic brands.

North America Europe

  • Build distributed systems that support reliability, resiliency, and safe operation at scale.
  • Design and operate traffic control mechanisms: circuit breakers, rate limiting, admission control, backpressure, and graceful degradation.
  • Develop tooling that improves incident detection, response, and automated mitigation.

Whatnot is the largest live shopping platform in North America and Europe to buy, sell, and discover the things you love. They are a remote co-located team, inspired by innovation and anchored in their values.

$130,906–$157,374/yr
EMEA 6w PTO

  • Maintain the Field Engineering infrastructure, including the pre-sales Demo Kit application and infrastructure.
  • Design, develop, and deliver compelling product demos to add to the demo kit library.
  • Create and deliver Training Materials and Product workshops to the SEs, customers, and the community.

Grafana Labs is a remote-first, open-source powerhouse whose open source visualization tool has more than 20M users. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack and thrive in an innovation-driven environment.

Europe Middle East Africa

  • Design, deploy and maintain a cloud infrastructure to support a Dataiku SaaS offering, mainly on AWS, Azure and GCP
  • Continuously improve the infrastructure, deployment and configuration to deliver more reliable, resilient, scalable and secure services
  • Automate all technical operations as much as possible

Dataiku is The Universal AI Platform™, giving organizations control over their AI talent, processes, and technologies to unleash the creation of analytics, models, and agents. They connect many data science technologies and integrate the best of data and AI tech.

Europe 6w PTO

  • Drive technical strategy and roadmap.
  • Lead end-to-end delivery of large, cross-functional projects.
  • Own architecture, reliability, performance and cost for critical systems.

Grafana Labs provides an open source observability platform that integrates metrics, logs, traces, and profiles with Grafana. They have a global collaborative culture and a passion for meaningful work. Their team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything they do.

Europe

  • Own the reliability, scalability, and performance of Peec AI’s core systems and infrastructure
  • Design, build, and maintain the tooling, automation, and monitoring that keep our services fast, secure, and highly available
  • Partner closely with product and engineering teams to ensure new features are reliable, observable, and easy to operate from day one

Peec AI is one of Europe’s fastest-growing Series A startups. They provide exciting and challenging work in the AI space.

Mexico

  • Collaborate with engineers in supporting new features and services.
  • Build tools to monitor site stability and performance.
  • Troubleshoot site issues using industry-leading tools like Splunk, Prometheus and OpenTelemetry.

Yelp's engineering culture is cooperative and values individual authenticity. They encourage engineers to find creative solutions to problems, help users, grow as engineers, and have fun in a collaborative environment.

US 6w PTO

  • Operate and evolve multi-cloud streaming clusters and related database infrastructure, diagnosing and eliminating cross-layer failure modes.
  • Design safe upgrade and rollout strategies at scale, improving observability, automation, and operational ergonomics.
  • Partner closely with database and platform teams to ensure safe scaling, partitioning, consumer fan-out, and query performance.

Grafana Labs is a remote-first, open-source powerhouse with more than 20M users of Grafana. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, which can be run fully managed with Grafana Cloud or self-managed with the Grafana Enterprise Stack.