Source Job

20 jobs similar to Senior Software Engineer - Grafana Databases, Managed Services

Jobs ranked by similarity.

Europe 6w PTO

  • Operate and evolve multi-cloud streaming clusters and related database infrastructure, diagnosing and eliminating cross-layer failure modes.
  • Define and evolve the technical direction for operating shared database systems at scale, leading complex initiatives and reliability investments.
  • Mentor and support engineers, improve systems toil with automation, and partner with database and platform teams to align on strategy.

Grafana Labs is a remote-first, open-source powerhouse with more than 20M users of Grafana. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, featuring scalable metrics, logs, and traces and thrive in an innovation-driven environment where transparency, autonomy, and trust fuel everything.

Global

  • Design and implement infrastructure and tools that empower our product teams to rapidly and securely iterate, emphasizing reliability and automation.
  • Influence the strategic direction of our infrastructure and operational practices, ensuring that we are well-positioned to scale and support our growing organization.
  • Take a proactive role in the resolution of production issues, ensuring that we are well-prepared to handle incidents and that we learn from them in a blameless manner.

SSV Labs is the core team behind the SSV Network - pioneering decentralized infrastructure for Ethereum staking. They are building tools, protocols, and standards to make staking more secure, scalable, and trustless.

US Canada

  • Own SLI/SLO/SLA definitions for the Akuity SaaS platform and drive continuous improvement.
  • Participate in an on-call rotation and act as incident commander for high-severity production events.
  • Partner with engineering teams to build reliability into new features before they ship to production

Akuity helps enterprises ship software faster and more reliably with modern GitOps best practices. The Akuity Platform enables teams to manage the development and deployment across hundreds – if not thousands – of Kubernetes clusters from a single control plane.

  • Designing, building, and operating Kubernetes infrastructure across multiple cloud providers.
  • Building and maintaining automation for cluster lifecycle management, node provisioning, and provider onboarding.
  • Developing platform tooling and abstractions that enable other Canva engineers to deploy and scale workloads.

Canva is a design platform redefining how the world experiences design. They have campuses in Sydney and Melbourne, along with co-working spaces in Brisbane, Perth and Adelaide, offering a flexible and inclusive work environment.

Europe

  • Design, build, and maintain scalable, highly available and fault-tolerant infrastructures.
  • Implement and improve monitoring, alerting, and incident response systems to ensure optimal system performance and minimize downtime.
  • Drive continuous improvement in infrastructure automation, deployment, and orchestration.

Mistral AI is dedicated to democratizing AI through high-performance, optimized, open-source models, products, and solutions designed to integrate seamlessly into daily working life. They are a dynamic, collaborative team passionate about AI and its potential to transform society dedicated to innovation.

Europe 5w PTO

  • Work with other Engineering teams to design sustainable infrastructure and microservice solutions.
  • Automate tools and infrastructure to reduce manual work.
  • Monitor applications and participate in an on-call rotation as required.

Bloomreach is building the world’s premier agentic platform for personalization, revolutionizing how businesses connect with their customers by building and deploying AI agents to personalize the entire customer journey. They power personalization for more than 1,400 global brands.

$165,000–$195,000/yr
US

  • Support and operate Legion’s AWS-based cloud platform and Kubernetes (EKS) environments.
  • Build and maintain infrastructure-as-code using Terraform.
  • Improve CI/CD pipelines to increase deployment safety and velocity.

Legion Technologies delivers the industry’s most innovative workforce management platform. The AI-driven Legion WFM platform maximizes labor efficiency and employee engagement. They are a remote, mission-driven team that embraces a collaborative, fast-paced, and entrepreneurial culture.

$170,000–$240,000/yr
US 4w PTO

  • Own our fundamental cloud services and tooling.
  • Own our application platform.
  • Own our developer experience.

Propel builds technology that strengthens the social safety net. They are a passionate team of ~100 Propellers who envision a future where every American has the tools and resources they need to thrive, offering a remote-first working environment with headquarters in Brooklyn.

Europe

  • Lead the Infrastructure Engineering team, taking full ownership of cloud infrastructure, Kubernetes platforms, DevOps tooling, and CI/CD pipelines.
  • Drive reliability, scalability, and security across the production environment while maintaining a sharp focus on developer velocity and business impact.
  • Mentor and guide engineers across SRE, DevOps, and Database Reliability functions, fostering a culture of operational excellence and pragmatic problem-solving.

Finom is a European tech startup headquartered in Amsterdam, revolutionizing financial services for entrepreneurs with an all-in-one B2B platform. They have raised $346 million, are expanding across key EU markets, and foster innovation, prioritizing research and solutions that benefit users, employees, partners, and the business.

Unlimited PTO

  • Develop and maintain observability solutions using platforms like Datadog, Prometheus and Grafana
  • Take a leading role in incident management, including coordinating response efforts, troubleshooting issues, and identifying follow-up actions
  • Partner with product engineering teams to architect reliable systems, recover from incidents, and learn from mistakes

Ditto is redefining how data moves at the edge, aiming to make resilient, real-time applications seamless for developers, regardless of network conditions. It's a globally distributed and fast-growing startup with over $145 million in funding that is committed to building a diverse and inclusive team.

Americas

  • Manage and support infrastructure for Growth teams, including Nomad, Hashistack, databases, and any other underlying systems
  • Maintain and troubleshoot GitLab CI pipelines, ensuring reliable and fast build, test, and deployment cycles
  • Provide operational support across Onboarding, Acquire, and Engage teams, helping debug issues in staging and production environments

Kraken is a mission-focused company rooted in crypto values, aiming to accelerate the global adoption of crypto, so that everyone can achieve financial freedom and inclusion. As a fully remote company, they have Krakenites in 70+ countries who speak over 50 languages.

Global

  • Deliver a scalable internal infrastructure platform on public cloud environments.
  • Establish and evolve Kubernetes-based platform capabilities to support high-availability, production-grade workloads at scale.
  • Build a secure and reliable foundation that supports CI/CD pipelines and minimizes operational risk across engineering teams

Chainlink is the industry-standard oracle platform bringing the capital markets onchain and powering the majority of decentralized finance (DeFi). Since inventing decentralized oracle networks, Chainlink has enabled tens of trillions in transaction value and now secures the vast majority of DeFi.

US Canada

  • Build platforms that scale; Design and operate foundational infrastructure that handle billions of events and enable company to grow with minimal friction.
  • Enable product velocity; Create tooling that let engineers ship faster and more reliably without becoming infrastructure experts themselves.
  • Drive technical direction; Shape Metronome's infrastructure strategy, make platform-level architectural decisions, and mentor engineers across the organization.

Metronome is the leading usage-based billing platform built for modern software companies. They compute millions of invoices per billing period and are scaling rapidly to accommodate new customers, saving them hours of development time and manual invoicing. They've raised over $128M from leading investors including NEA, Andreessen Horowitz, General Catalyst, Elad Gil, and Workday Ventures.

Global

  • Leading a team focused on designing, building, and evolving cloud-native, containerized infrastructure.
  • Driving complex technical initiatives and ensuring the availability, security, scalability, and reliability of our data ecosystem.
  • Guiding and developing engineering talent, setting priorities, driving execution, and partnering across teams.

Pismo, founded in 2016, provides a comprehensive processing platform for banking, card issuing and financial market infrastructure. Pismo has 500+ employees located in more than 10 countries around the world and was acquired by Visa in 2024.

US

  • Collaborate with application engineering teams on platform infrastructure.
  • Enhance observability and spearhead the adoption of SRE best practices.
  • Build and maintain reliable CI/CD pipelines, tooling, and infrastructure.

Rula strives to provide quality, evidence-based, compassionate mental healthcare and aims to create a world where mental health is no longer stigmatized. They are a remote-first company operating in most U.S. states, and are dedicated to having a culture of inclusion that supports their employees.

Global

  • Build and own the foundational infrastructure that our products run upon.
  • Work directly on our products' golang code base to implement SRE related objectives.
  • Take a data driven approach to quantifying system performance and reliability.

LiveKit provides the network infrastructure for multimodal AI interfaces, enabling seamless audio and visual interactions. Founded in 2021, LiveKit supports over 3 Billion calls annually, with 100,000+ developers and industry giants like OpenAI, Spotify, and Meta.

$198,025–$287,952/yr

  • Building tools and applications to extends Calendly’s infrastructure platform
  • Evaluating and deploying cloud native open source tools
  • Exercising expertise in cloud infrastructure concepts and patterns

Calendly's product powers connections for millions through impactful innovation. They are in the midst of exciting growth and desire people that want to learn, grow, and do their best work.

US

  • Design, build, and operate core cloud infrastructure across compute, storage, databases, and networking layers.
  • Own and improve the reliability, scalability, and security of Valon’s production systems as we scale to support major enterprise deployments.
  • Evaluate, adopt, and operationalize new infrastructure technologies (e.g., Vitess, Clickhouse, Redis) to meet evolving product and scale requirements.

Valon is building the AI-native operating system for regulated finance, starting with mortgage servicing. They are a Series C company backed by a16z, transforming industries that others have written off as too complex to innovate.

  • Work directly with enterprise customers to deploy and configure OpenTelemetry instrumentation across their environments.
  • Build custom integrations, dashboards, and tooling to help customers realize the full value of Dash0.
  • Troubleshoot complex issues in distributed systems, Kubernetes clusters, and observability pipelines.

Dash0 is building an AI-centric platform that eliminates vendor lock-in and meaningless toil and is OpenTelemetry-native. They are backed by top-tier investors including Balderton Capital, Accel and Cherry Ventures and led by a founding team with decades of experience in observability.

US EMEA

  • Design and implement the complex distributed infrastructure that powers our core AI engine and distributed analysis systems.
  • Tune and optimize cloud services across compute, storage, networking, and observability to drive performance and reliability.
  • Develop our core services, written in TypeScript, Kotlin and Go to support our unique deployment and infrastructure requirements.

XBOW is building the future of offensive security. They create the platform that puts security ahead in the arms race, using AI to autonomously discover, validate, and exploit vulnerabilities. Founded by Oege de Moor, the company is backed by Sequoia, Altimeter, and other leading investors.