Design, build, and operate reconciliation systems that track desired stack state and detect and repair drift across stack templates, grafana.com state, Hosted Grafana, and actual customer stack configuration.
Collaborate across SSS, grafana.com, and deployment configurations to ensure stack lifecycle workflows remain reliable, observable, and resilient.
Improve operational efficiency by reducing deployment complexity and contributing to the Stack Config Reconciliation project.
Grafana Labs is a remote-first, open-source powerhouse with more than 20M users of Grafana around the globe. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack. Their team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything they do.
Earning the trust of our large-scale operator customers to further Grafana's "big tent" philosophy of data accessibility and to meet clear business objectives.
Designing and leading the development of backend services, distributed systems, and enterprise features at scale.
Driving continuous improvement of our engineering culture through words and actions.
Grafana Labs is a remote-first, open-source powerhouse with more than 20M users of Grafana, the open source visualization tool, around the globe. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, which can be run fully managed with Grafana Cloud or self-managed with the Grafana Enterprise Stack. The Grafana team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything they do.
Manage and grow a distributed team of engineers, providing feedback and supporting career development.
Partner with product management to shape the Usage squad's roadmap, ensuring alignment with company mission and customer impact.
Guide the team through the full project lifecycle, ensuring high-quality and timely outcomes within the Usage domain.
Grafana Labs is a remote-first, open-source powerhouse with over 20M users globally. Their team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything they do.
Take an active role in influencing our roadmap and your own career objectives.
Drive projects from initial ideation all the way to operations once they are in the hands of customers.
Design, build, operate, and maintain critical systems, owning their reliability, performance, and availability.
Grafana Labs is behind the open observability cloud, and is founded on the principles of open source, open standards, open ecosystems, and open culture. They are a 100% remote company with 1,600+ team members across 40+ countries.
Take an active role in influencing our roadmap and your own career objectives.
Design, build, operate, and maintain critical systems, owning their reliability, performance, and availability.
Mentor and support other team members, participate in design discussions and collaborate with the team.
Grafana Labs is a remote-first, open-source powerhouse that provides visualization tools. They help companies manage their observability strategies. Grafana Labs has a global collaborative culture and a passion for meaningful work.
Designing, building, and maintaining scalable applications across the stack.
Contributing to high-performance backend services.
Working on complex frontend experiences and data-processing workflows.
Coderio designs and delivers scalable digital solutions for global companies. With a strong technical foundation and a product-oriented mindset, its teams lead complex software projects from architecture to execution.
Help define architecture to support millions of daily API requests, build and scale infrastructure for sending tens of millions of emails daily, and improve high availability across distributed applications.
Scale databases like Postgres and ClickHouse for performance, enhance observability using tools like Datadog, and refine disaster recovery plans for quick and reliable service recovery.
Build infrastructure with IaC frameworks like CDK and Terraform, work with TypeScript and Golang, design and operate async pipelines handling tens of millions of messages daily, and join the on-call rotation for critical services.
Resend is building the modern email sending platform for developers, focusing on quality, craft, and developer experience. It is a fully remote team of about 40 people spanning 11 countries, backed by investors like a16z and Y Combinator, and values honesty, low ego, and autonomy.
Manage, hire, and develop a team of engineers, providing regular feedback.
Act as project manager and work with product owners to ensure the product roadmap is up-to-date.
Engage in technical conversations and challenge teams to arrive at strong technical decisions.
Grafana Labs is a remote-first, open-source powerhouse that provides visualization tools and helps companies manage their observability strategies. We value transparency, autonomy, and trust.
Architect future iterations of core systems, addressing scaling requirements.
Design and implement developer tools to enhance deployment safety and reproducibility.
Drive excellence in monitoring and guide incident response for quick issue resolution.
Found provides tools for self-employed individuals, offering a business bank account that automates taxes and expense tracking. They aim to give self-employed people the security and peace of mind historically available only at large corporations and are looking for kind, resourceful, and passionate people.
Build and scale a strong culture of operational excellence by defining standards and coaching teams to own reliability and availability.
Drive mature DevOps/SRE practices, including incident response and PIRs, on-call readiness, runbooks, alerting, observability, and release/change management.
Guide teams in the design, development, evolution, and operation of large-scale, distributed cloud systems.
Grafana Labs is a remote-first, open-source powerhouse with more than 20M users of Grafana around the globe. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, and their team thrives in an innovation-driven environment.
Own the technical direction of Remote's SRE/Platform domain.
Define and drive the reliability strategy across the platform.
Identify and lead AI enablement initiatives across the engineering organisation.
Remote is solving modern organizations’ biggest challenge – navigating global employment compliantly with ease. With our core values at heart and a future-focused work culture, our team works tirelessly on ambitious problems, asynchronously, around the world.
Design and operate our Kubernetes ecosystem with a focus on high availability and zero-downtime operations.
Own and evolve our PaaS strategy, using GitOps and CI/CD to empower domain teams to deploy independently.
Define and implement our observability strategy across metrics, logs, and tracing.
Finom is a European tech startup headquartered in Amsterdam, revolutionizing financial services for entrepreneurs. They offer an all-in-one financial B2B solution integrating banking, accounting, financial management, and invoicing into a mobile-first platform, with about 346 million in funding.
Design, develop, and deliver high-quality backend services and APIs primarily using Go (Golang) and deploy them in Kubernetes environments.
Build and maintain automated tests for backend services, participate in code reviews, and monitor service performance in production to debug issues.
Provide technical guidance on backend architecture and integration challenges, sharing knowledge and supporting continuous improvement of processes and documentation.
Applied Systems provides innovative software and services for the insurance industry. They are an established insurtech company with 40+ years of experience and focus on creating a collaborative, value-driven culture for their team.
Drive the stability and reliability of Epic's GCP infrastructure.
Manage and harden our Docker and GKE container platform.
Maintain and improve CI/CD pipelines.
Epic is the leading digital reading platform for kids ages 12 and under, used by millions of children, families, and educators around the world. As Epic continues to grow, we are reimagining what reading can be through thoughtful technology, data, and global collaboration to make learning more engaging, accessible, and impactful.
Participate in the development and maintenance of high-performance backend services and applications using Golang.
Architect, implement, and optimize microservices-based applications, ensuring scalability, reliability, and maintainability.
Collaborate with the DevOps team to deploy and manage Golang applications in Kubernetes clusters using Helm.
Ruby Labs creates and operates innovative consumer products across health, education, and entertainment. They foster innovation and look for passionate individuals to join their fast-growing teams.
Develop and maintain features as part of Observability solutions in Grafana Cloud.
Contribute to the design and implementation of high-quality, scalable integrations for various infrastructure components, databases, and applications.
Build prototypes and present your ideas as part of a cross-functional team.
Grafana Labs is a remote-first, open-source powerhouse with more than 20M users of Grafana. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, and thrive in an innovation-driven environment with a global collaborative culture.
Design, build, and maintain scalable cloud infrastructure services in AWS and GCP.
Contribute production-quality Go and Python code to existing cloud services.
Develop and own automation and software deployment pipelines with maximum efficiency.
Dragos is dedicated to arming customers with best-in-class technology, threat intelligence, and services to protect their systems. They embody core values of authenticity, transparency, and trust and are a remote-first culture with operations in North America, Europe, the Middle East, and APAC.
Design, build, and maintain scalable, reliable systems on GCP.
Develop automation for infrastructure provisioning using Terraform, Ansible, or Deployment Manager.
Manage incident response, conduct postmortems, and implement improvements to reduce recurrence.
SupplyHouse.com is an industry-leading e-commerce company specializing in HVAC, plumbing, heating, and electrical supplies since 2004. They value every individual team member and cultivate a community where people come first with Generosity, Respect, Innovation, Teamwork, and GRIT.
Build internal tooling to help other engineers and the rest of the company understand and operate our system.
Design and implement security best practices for our team and infrastructure.
Reduce toil through automation, including building and maintaining CI/CD infrastructure.
Openly is rebuilding insurance from the ground up by re-envisioning and enhancing every aspect of the customer experience. They are a rapidly growing team of exceptional, curious, empathetic people with a wide range of skill sets, spanning many departments.
Design, build, and maintain infrastructure using Infrastructure as Code tools such as Terraform.
Improve system reliability, scalability, resilience, and performance across the Mast platform.
Build systems and tooling that automate infrastructure management and operational workflows wherever possible.
Mast is on a mission to make complex lending simple by building modern, cloud-native lending technology purpose-built for specialist lenders. It is a high-performance team of engineers and lending experts that values radical honesty, transparency, and speed.