Design, build, and operate reconciliation systems that track desired stack state and detect and repair drift across stack templates, grafana.com state, Hosted Grafana, and actual customer stack configuration.
Collaborate across SSS, grafana.com, and deployment configurations to ensure stack lifecycle workflows remain reliable, observable, and resilient.
Improve operational efficiency by reducing deployment complexity and contributing to the Stack Config Reconciliation project.
Grafana Labs is a remote-first, open-source powerhouse with over 20M users of Grafana. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, featuring scalable metrics (Grafana Mimir), logs (Grafana Loki), and traces (Grafana Tempo).
Own and operate end-to-end infrastructure for backend services, frontend systems and databases.
Build and maintain reliable deployment workflows including CI/CD pipelines and rollback procedures.
Improve system-wide observability through metrics, logging, alerting, and monitoring to ensure uptime.
Jito Labs builds a high-performance trading terminal on Solana. They are a lean, high-output team building something that sits at the intersection of execution quality, user experience, and on-chain infrastructure.
Lead architecture, system design and engineering efforts for high-scale, data-intensive B2B systems.
Design and implement batch and real-time processing architectures that are reliable, observable, and performant.
Mentor and coach engineers at all levels, and actively contribute to Omada’s engineering community.
Omada Health is a digital care provider that empowers people to achieve their health goals through sustainable behavioral change. They have served more than two million members and strive to build an inclusive culture where differences are celebrated.
Design, build, and operate reconciliation systems that track desired stack state and detect and repair drift across stack templates, grafana.com state, Hosted Grafana, and actual customer stack configuration.
Collaborate across SSS, grafana.com, and deployment configurations to ensure stack lifecycle workflows remain reliable, observable, and resilient.
Improve operational efficiency by reducing deployment complexity and contributing to the Stack Config Reconciliation project.
Grafana Labs is a remote-first, open-source powerhouse with more than 20M users of Grafana around the globe. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack. Their team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything they do.
Earn the trust of our large-scale operator customers to further Grafana's "big tent" philosophy of data accessibility and to meet clear business objectives.
Design and lead the development of backend services, distributed systems, and enterprise features at scale.
Drive continuous improvement of our engineering culture through words and actions.
Grafana Labs is a remote-first, open-source powerhouse with more than 20M users of Grafana, the open source visualization tool, around the globe. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, which can be run fully managed with Grafana Cloud or self-managed with the Grafana Enterprise Stack. The Grafana team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything they do.
Drive major technical initiatives from design through production, improving scalability, reliability, and correctness across critical systems.
Design and evolve backend services, APIs, event-driven workflows, and data models that support complex business processes at scale.
Improve the operational foundations of the platform through better observability, testing, deployment safety, and incident reduction.
Tem is rebuilding the energy transaction, making it transparent and fair. They aim to put power back in the hands of customers and tackle the critical problem of access to low-cost electricity, leveraging AI-driven infrastructure for efficient and sustainable energy markets.
Architect and deliver end-to-end systems, from internal developer tooling and platform services to scalable infrastructure components.
Improve and extend core platforms and developer experience tooling, making it easier for teams to build, test, deploy, and operate quality software.
Rapidly prototype and iterate on new workflows, automation, and platform capabilities that improve engineering productivity and operational excellence.
Tem is rebuilding the energy transaction, making it transparent and fair. After extraordinary growth, they closed a $75 million Series B and are positioning themselves for global expansion, deeper product innovation and category leadership.
Improve the reliability, performance, and scalability of our production platform.
Operate reliable infrastructure, improve observability, and drive incident response.
Use data-driven reliability practices such as SLIs, SLOs, SLAs, and DORA metrics.
VRChat is a game-changing platform that provides an endless collection of social VR experiences. They empower their community to bring their imaginations to life and help shape the metaverse. Their team includes people from Netflix, Twitter, Meta, and Microsoft.
Lead architectural design and technical discovery for complex, distributed systems across our platform.
Define and evolve system boundaries, service interactions, and data flow within our event-driven ecosystem.
Guide the design of scalable, fault-tolerant systems leveraging asynchronous communication patterns (e.g., RabbitMQ, Kafka, SNS/SQS).
Fanatics is building a leading global digital sports platform that ignites the passions of global sports fans. We offer products and services across Fanatics Commerce, Fanatics Collectibles, and Fanatics Betting & Gaming. Our more than 22,000 employees are committed to relentlessly enhancing the fan experience and delighting sports fans globally.
Build internal tooling to help other engineers and the rest of the company understand and operate our system.
Design and implement security best practices for our team and infrastructure.
Reduce toil through automation, including building and maintaining CI/CD infrastructure.
Openly is rebuilding insurance from the ground up by re-envisioning and enhancing every aspect of the customer experience. They are a rapidly growing team of exceptional, curious, empathetic people with a wide range of skill sets, spanning many departments.
Help define architecture to support millions of daily API requests, build and scale infrastructure for sending tens of millions of emails daily, and improve high availability across distributed applications.
Scale databases like Postgres and ClickHouse for performance, enhance observability using tools like Datadog, and refine disaster recovery plans for quick and reliable service recovery.
Build infrastructure with IaC frameworks like CDK and Terraform, work with TypeScript and Golang, and design and operate async pipelines handling tens of millions of messages daily, with an on-call rotation for critical services.
Resend is building the modern email sending platform for developers, focusing on quality, craft, and developer experience. It is a fully remote team of about 40 people spanning 11 countries, backed by investors like a16z and Y Combinator, and values honesty, low ego, and autonomy.
Wynd builds infrastructure that delivers massive amounts of web data to the companies training the world’s most powerful AI models. They are a lean, technical team that moves fast and expands what’s possible for open web data and AI.
Design, build, and operate core cloud infrastructure across compute, storage, databases, and networking layers.
Own and improve the reliability, scalability, and security of Valon’s production systems as we scale to support major enterprise deployments.
Evaluate, adopt, and operationalize new infrastructure technologies (e.g., Vitess, ClickHouse, Redis) to meet evolving product and scale requirements.
Valon is building the AI-native operating system for regulated finance, starting with mortgage servicing. They're a Series C company backed by a16z, transforming industries that others have written off as too complex to innovate.
Own the technical direction of Remote's SRE/Platform domain.
Define and drive the reliability strategy across the platform.
Identify and lead AI enablement initiatives across the engineering organisation.
Remote is solving modern organizations’ biggest challenge – navigating global employment compliantly with ease. With our core values at heart and a future-focused work culture, our team works tirelessly on ambitious problems, asynchronously, around the world.
Design, build, and maintain infrastructure using Infrastructure as Code tools such as Terraform.
Improve system reliability, scalability, resilience, and performance across the Mast platform.
Build systems and tooling that automate infrastructure management and operational workflows wherever possible.
Mast is on a mission to make complex lending simple by building modern, cloud-native lending technology purpose-built for specialist lenders. It is a high-performance team of engineers and lending experts that values radical honesty, transparency, and speed.
Design, build, and maintain scalable, reliable systems on GCP.
Develop automation for infrastructure provisioning using Terraform, Ansible, or Deployment Manager.
Manage incident response, conduct postmortems, and implement improvements to reduce recurrence.
SupplyHouse.com is an industry-leading e-commerce company specializing in HVAC, plumbing, heating, and electrical supplies since 2004. They value every individual team member and cultivate a community where people come first with Generosity, Respect, Innovation, Teamwork, and GRIT.
Improve architecture across backend services and platform infrastructure, and define long-term architectural strategy and technical standards across teams.
Establish engineering standards that improve consistency, maintainability, reliability, and operational readiness.
Mentor Senior and Staff engineers through architecture reviews, technical coaching, and project guidance.
Truffle Security is a cybersecurity company on a mission to make secrets easier to detect, verify, and remediate across modern software environments. The company's enterprise solution gives security and engineering teams everything they need to find exposed credentials. They are trusted by organizations including NVIDIA, Chick-fil-A, and OpenAI.
Set technical strategy for your team on a year-long scale and tie it to business-impacting projects.
Collaborate across product management, design, and analytics to ensure technical sustainability and manage risks.
Foster a culture of quality and ownership by setting code review standards and developing team talent.
Affirm is reinventing credit to make it more honest and friendly, giving consumers the flexibility to buy now and pay later without any hidden fees or compounding interest. They are a remote-first company that provides competitive benefits anchored to their core value of people coming first.
Design and build scalable infrastructure to support rapid growth in data volume, service usage, and engineering velocity.
Implement and maintain core security infrastructure and controls, including service-to-service authentication, secrets management, and application security primitives.
Partner closely with Security Engineering to implement infrastructure that supports best-in-class security and compliance practices.
Vanta helps businesses earn and prove trust by providing a platform that continuously monitors and verifies security. They empower companies to practice better security and prove it with ease. Vanta has a kind and talented team with offices in SF, NYC, London, Dublin, Tel Aviv, and Sydney.
Lead the design, implementation, and ongoing improvement of reliable, scalable, performant, and secure production platforms and services.
Work closely with cross-functional teams to build and maintain resilient infrastructure and deployment patterns.
Provide technical leadership and mentorship to engineers across the organisation, promoting strong engineering standards and operational best practice.
Cision empowers individuals to make an impact and values diverse perspectives. They foster curiosity, collaboration, and innovation while driving meaningful contributions to brands. They have offices in 24 countries throughout the Americas, EMEA, and APAC.