Design and operate enterprise-grade observability platforms across metrics, logs, traces, and events.
Build scalable monitoring stacks with Prometheus, Grafana, Loki, Tempo, OpenTelemetry, and Datadog.
Define SLOs, SLIs, error budgets, and alerting strategies to ensure system reliability.
Our partner is a technology company focused on building scalable observability platforms for distributed systems. They are an engineering-driven organization with a strong emphasis on automation, scalability, and developer experience.
Anticipate the needs of the Solutions Engineering team and support it by designing technical presentations, demos, and white papers.
Create and deliver training materials, product workshops, and webinars for internal teams and customers.
Partner with Product, Marketing, and Engineering to enable the field with deep technical expertise and strategic support.
Grafana Labs is the company behind Grafana, the open-source observability platform, and provides a fully managed cloud service for monitoring and analytics. With over 1,600 team members across 40+ countries, they foster a global collaborative culture rooted in open source, transparency, and autonomy.
Lead client discovery, architecture workshops, and solution design across observability, telemetry, reliability, and operational intelligence initiatives.
Define scalable standards for telemetry onboarding, naming, tagging, RBAC, service ownership, dashboards, alert governance, runbooks, and operational handoff.
AHEAD builds platforms for digital business by weaving together cloud infrastructure, automation, analytics, and software delivery to help enterprises achieve digital transformation. The company prioritizes a culture of belonging and is an equal opportunity employer that values diversity and inclusion.
Act as a trusted technical partner, guiding organizations through onboarding, implementation, and expansion with white-glove support and best practices.
Deliver high-impact training and jumpstart engagements, and provide tailored technical consulting to help customers succeed.
Identify recurring issues, monitor support needs, and advocate for product improvements in close collaboration with internal teams.
Grafana Labs is the company behind Grafana, the open observability platform. With over 1,600 team members across 40+ countries, we are a 100% remote company backed by leading investors and trusted by more than 35 million users and 7,000+ customers.
Design, build, and operate distributed systems powering observability across ClickHouse Cloud.
Own reliability, performance, and cost-efficiency of the telemetry pipeline and storage systems.
Take part in on-call rotation and drive root-cause resolution and long-term fixes.
ClickHouse is a real-time analytics and data warehousing company recognized on the 2025 Forbes Cloud 100 list. With over 3,000 customers and rapid growth, the company fosters an innovative and fast-paced culture.
Own and evolve observability strategy including monitoring, alerting, dashboards, logging, and distributed tracing.
Define and manage SLIs, SLOs, and reliability metrics, improving MTTD and MTTR through automation.
Build and maintain reliable cloud infrastructure on AWS and Kubernetes while mentoring engineers on SRE best practices.
Filevine is a Legal AI company delivering Legal Operating Intelligence for legal work. Fueled by a team of exceptional collaborators and innovators, Filevine’s rapid growth has earned AI awards and recognition from Deloitte and Inc. as one of the most innovative and fastest-growing technology companies in the country.
Design and implement high-quality, scalable integrations for observability solutions.
Collaborate with cross-functional teams to deliver features aligned with product strategy.
Participate in on-call rotations and contribute to open-source communities.
Grafana Labs provides an open-source observability platform, Grafana Cloud, that integrates metrics, logs, and traces. With over 1,600 team members across 40+ countries, they maintain a remote-first, collaborative culture backed by leading investors.
Design and maintain Grafana dashboards and telemetry visualizations to monitor system performance and platform health.
Develop and maintain modular Ansible playbooks to automate infrastructure provisioning and configuration.
Configure observability solutions with Prometheus monitoring and alerting, and participate in Agile ceremonies.
Miratech is a global IT services and consulting company that helps visionaries change the world by supporting digital transformation for large enterprises. With nearly 1,000 full-time professionals across 5 continents and 25 countries, the company has a culture of Relentless Performance with a 99% project success rate and over 25% annual growth.
Lead the design, development, and operation of large-scale, secure observability systems to keep services online and performant.
Deploy and scale Prometheus, Elasticsearch clusters, and high-throughput Kafka data pipelines for millions of customer devices.
Collaborate with the Observability team to build alerting systems, APIs, and self-service monitoring tools using Terraform and multiple languages.
ItD is a new-generation consulting and software development company that blends diversity, innovation, and integrity with real business results. It is a woman- and minority-led firm with a global community, empowering employees and offering benefits like medical, dental, vision, 401(k), and career development.
Build and operate the internal engineering platform that provides application engineers with the tools, systems, and Kubernetes clusters they need to deploy and run their workloads.
Focus on cloud infrastructure, capacity management, security, engineering productivity, monitoring, and US Federal compliance across squads.
Participate in on-call rotations to ensure the health of the system and understand how people use our products.
Grafana Labs, the company behind the open observability cloud, is founded on the principles of open source, open standards, open ecosystems, and open culture. We are a 100% remote company with 1,600+ team members across 40+ countries, backed by leading investors including Lightspeed Venture Partners, Sequoia Capital, GIC, Coatue, J.P. Morgan, CapitalG, and Lead Edge Capital.
Latin America
Unlimited PTO
16w maternity
16w paternity
Lead customers in strategic application of Honeycomb and observability practices to meet technical and business goals.
Act as a trusted advisor on telemetry schema design, data modeling, and sampling strategies.
Coach and mentor engineering teams on observability, SRE concepts, and instrumentation best practices.
Honeycomb defines observability for developer tools, working with companies like HelloFresh, Slack, and Vanguard. They are a fully distributed company of over 200 employees, named to Forbes' America's Best Startups in 2022 and 2023, with a culture focused on impact, inclusion, and autonomy.
Co-own the architecture of cloud infrastructure on Azure and Kubernetes clusters for high throughput and availability.
Drive resilience strategy for global scaling, zero-downtime deployments, and disaster recovery.
Evolve the observability stack built on LGTM (Loki, Grafana, Tempo, Mimir) and lead incident response.
Flip is an AI-powered employee experience platform for frontline workers in retail, manufacturing, and logistics. It is a young, rapidly growing tech company with a remote-first culture and offices in Berlin and Stuttgart.
Build and lead a high-performance product engineering team focused on innovation, accountability, and reliability.
Develop scalable reliability, risk management, and operational governance capabilities for production systems.
Drive alignment across Platform Engineering, SRE, Infrastructure, and product teams to deliver long-term technical roadmap outcomes.
Affirm is reinventing credit to make it more honest and friendly, giving consumers the flexibility to buy now and pay later without hidden fees or compounding interest. It is a publicly traded, remote-first company with competitive benefits and a culture focused on innovation and people.
Own and operate 100+ multi-cloud streaming clusters and related database infrastructure in production.
Diagnose and eliminate cross-layer failure modes such as object storage latency, noisy neighbors, and query performance regressions.
Design safe upgrade and rollout strategies at scale, improving observability, automation, and operational ergonomics.
Grafana Labs is the company behind the open observability cloud, providing a fully managed observability platform built for scale. With over 35 million users and 7,000+ customers, we are a 100% remote company of 1,600+ team members across 40+ countries, backed by leading investors.