Jobs Similar to Observability Engineer | TangerineFeed

Observability Engineer

Confidential 3 hours ago

US

Design and operate enterprise-grade observability platforms across metrics, logs, traces, and events.
Build scalable monitoring stacks with Prometheus, Grafana, Loki, Tempo, OpenTelemetry, and Datadog.
Define SLOs, SLIs, error budgets, and alerting strategies to ensure system reliability.

Prometheus Grafana OpenTelemetry

20 jobs similar to Observability Engineer

Jobs ranked by similarity.

Senior Field Engineer

Grafana Labs 1 day ago

Germany 6w PTO

Anticipate and support the Solutions Engineering team by designing technical presentations, demos, and white papers.
Create and deliver training materials, product workshops, and webinars for internal teams and customers.
Partner with Product, Marketing, and Engineering to enable the field with deep technical expertise and strategic support.

Grafana Labs is the company behind the open-source observability platform, providing a fully managed cloud service for monitoring and analytics. With over 1,600 team members across 40+ countries, they foster a global collaborative culture rooted in open source, transparency, and autonomy.


Cloud Software Engineer - Observability Platform

ClickHouse 14 days ago

Canada Unlimited PTO

Design, build, and operate distributed systems powering observability across ClickHouse Cloud.
Own reliability, performance, and cost-efficiency of the telemetry pipeline and storage systems.
Take part in on-call rotation and drive root-cause resolution and long-term fixes.

ClickHouse is a real-time analytics and data warehousing company recognized on the 2025 Forbes Cloud 100 list. With over 3,000 customers and rapid growth, the company fosters an innovative and fast-paced culture.


Senior Solutions Architect

Grafana Labs 11 days ago

UK 6w PTO

Act as a trusted technical partner, guiding organizations through onboarding, implementation, and expansion with white-glove support and best practices.
Deliver high-impact training, jumpstart engagements, and provide tailored technical consulting to help customers succeed.
Identify recurring issues, monitor support needs, and advocate for product improvements in close collaboration with internal teams.

Grafana Labs is the company behind Grafana, the open observability platform. With over 1,600 team members across 40+ countries, we are a 100% remote company backed by leading investors and trusted by more than 35 million users and 7,000+ customers.


Staff Software Engineer - Platform, SysEng

Grafana Labs 11 days ago

United States 6w PTO

Build and operate the internal engineering platform that provides application engineers with the tools, systems, and Kubernetes clusters they need to deploy and run their workloads.
Focus on cloud infrastructure, capacity management, security, engineering productivity, monitoring, and US Federal compliance across squads.
Participate in on-call rotations to ensure the health of the system and understand how people use our products.

Grafana Labs, the company behind the open observability cloud, is founded on the principles of open source, open standards, open ecosystems, and open culture. We are a 100% remote company with 1,600+ team members across 40+ countries, backed by leading investors including Lightspeed Venture Partners, Sequoia Capital, GIC, Coatue, J.P. Morgan, CapitalG, and Lead Edge Capital.


Staff Backend Engineer - Grafana Enterprise

Grafana Labs 23 days ago

$174,986–$209,983/yr

US Canada 6w PTO

Earning the trust of our large-scale operator customers to further Grafana's "big tent" philosophy of data accessibility and to meet clear business objectives.
Designing and leading the development of backend services, distributed systems, and enterprise features at scale.
Driving continuous improvement of our engineering culture through words and actions.

Grafana Labs is a remote-first, open-source powerhouse with more than 20M users of Grafana, the open source visualization tool, around the globe. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, which can be run fully managed with Grafana Cloud or self-managed with the Grafana Enterprise Stack. The Grafana team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything they do.


Senior Backend Software Engineer (Observability)

Jobgether 16 hours ago

UK

Design, build, and scale backend services powering a large-scale observability platform, including telemetry ingestion, storage systems, query engines, and alerting pipelines.
Develop and optimize distributed systems that process logs, metrics, and traces at high volume with a strong focus on reliability and performance.
Collaborate with cross-functional engineering teams to improve system architecture, scalability, and developer experience.


Principal Product Manager Infrastructure, Observability

Elastic 26 days ago

Global

Defining and driving the vision and strategy for Infrastructure Observability.
Identifying gaps in the end-to-end experience, then defining and owning the roadmap to fill those gaps.
Working closely across teams and orgs, collaborating with Engineering, UX, Design, and other teams to deliver on your roadmap.

Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale — unleashing the potential of businesses and people. The Elastic Search AI Platform, used by more than 50% of the Fortune 500, brings together the precision of search and the intelligence of AI to enable everyone to accelerate the results that matter.


Senior Developer Success Engineer - West

Temporal 8 days ago

US Unlimited PTO

Provide frontline technical expertise to help developers deploy and scale Temporal in cloud-native environments.
Troubleshoot complex infrastructure issues, optimize performance, and develop automation solutions.
Collaborate with engineering and product teams to influence platform improvements and enhance developer experience.

Temporal provides an open source programming model that simplifies code and makes applications more reliable. The company is a growing team driven by values of curiosity, collaboration, and humility, focused on improving developer experience.


Service Mesh Engineer

Jobgether 3 hours ago

United States

Design, deploy, and operate service mesh platforms (Istio and Linkerd) across multi-cluster Kubernetes environments.
Implement mTLS, certificate lifecycle automation, and workload identity propagation for secure communication.
Build and enhance observability for service-to-service communication using tracing, metrics, and topology insights.

Jobgether uses AI-powered matching to connect candidates with roles. They focus on efficient hiring processes and data privacy.


Senior Backend Engineer - Alerting

Grafana Labs 22 days ago

$116,449–$139,531/yr

Europe 6w PTO

Take an active role in influencing our roadmap and your own career objectives.
Drive projects from initial ideation all the way to operations once it is in the hands of customers.
Design, build, operate, and maintain critical systems, owning the reliability, performance, and availability.

Grafana Labs is behind the open observability cloud, and is founded on the principles of open source, open standards, open ecosystems, and open culture. They are a 100% remote company with 1,600+ team members across 40+ countries.


Principal DevOps Engineer

Zartis 6 days ago

Europe

Build and operate secure agent runtimes with sandboxing, runtime isolation, and RBAC.
Design and maintain integration surfaces with MCP-style adapters and gateways across marketplace teams.
Implement observability and cost control including traces, telemetry, and cost-per-workflow.

Zartis is a global AI transformation and technology consulting partner that designs, builds, and scales technology solutions for ambitious organizations. With engineering hubs across EMEA and LATAM and long-term partnerships in financial services, healthcare, and energy, they foster an inclusive culture based on trust and innovation.


Resident Architect

Honeycomb 5 days ago

Latin America Unlimited PTO 16w maternity 16w paternity

Lead customers in strategic application of Honeycomb and observability practices to meet technical and business goals.
Act as a trusted advisor on telemetry schema design, data modeling, and sampling strategies.
Coach and mentor engineering teams on observability, SRE concepts, and instrumentation best practices.

Honeycomb defines observability for developer tools, working with companies like HelloFresh, Slack, and Vanguard. They are a fully distributed company of over 200 employees, named to Forbes' America's Best Startups in 2022 and 2023, with a culture focused on impact, inclusion, and autonomy.


Senior Software Engineer, Platform

Jobgether 15 hours ago

United States

Design and build core platform infrastructure for large-scale cloud-native data and analytics systems.
Own and improve CI/CD pipelines, testing frameworks, and deployment in a high-scale PaaS environment.
Contribute to reliability engineering, observability, and operational excellence across distributed systems.

Jobgether uses an AI-powered matching process to connect candidates with roles. The company is a growing platform focused on efficient job matching and data privacy compliance.


Senior Site Reliability Engineer

Attain Finance 7 hours ago

US Unlimited PTO

Build and operate the delivery platform across AWS, EKS, ArgoCD, Helm, and Terraform, fixing production problems and driving root-cause analysis.
Standardize CI/CD pipelines using GitHub Actions and Azure DevOps, implement progressive delivery with Argo Rollouts, and build observability with Grafana and Prometheus.
Support platform adoption, reduce toil and cost, unblock cross-team delivery, and write documentation to eliminate knowledge silos.

Attain Finance is a leading consumer credit lender with over 50 years of expertise providing credit solutions across the U.S. and Canada. The company employs a dynamic team that fosters innovation and collaboration, with a portfolio including brands like Cash Money, LendDirect, Heights Finance, and others.


Site Reliability Engineer (SRE)

Supabase 1 day ago

Global

Collaborate with service teams to define SLIs and SLOs based on customer experience and build error budget policies that influence engineering decisions.
Own the Operational Readiness Review process, conducting reviews for new services and major changes across observability, alerting, runbooks, capacity, and graceful degradation.
Act as a reliability expert for architecture reviews, failure mode analysis, dependency mapping, and resilience design.

Supabase provides the Postgres development platform with a complete backend solution including Database, Auth, Storage, Edge Functions, Realtime, and Vector Search. With 280+ team members across 55+ countries, they are an open-source-first company that values async work and has raised $500M.


Senior Site Reliability Engineer (Remote Build)

Remote 5 days ago

Global Unlimited PTO 16w maternity 16w paternity

Own the operational excellence and infrastructure strategy for Remote Build's platform, ensuring reliability, performance, and security.
Lead incident response, build observability systems, and drive continuous improvement in system reliability.
Embed security into infrastructure, optimize costs, and automate operational toil to scale efficiently.

Remote solves modern organizations' biggest challenge of navigating global employment compliantly. With a fully distributed team across 6 continents, the company fosters a future-focused culture with core values of innovation and async work.


Senior Site Reliability Engineer

Finom 24 days ago

Europe

Design and operate our Kubernetes ecosystem with a focus on high availability and zero-downtime operations.
Own and evolve our PaaS strategy, using GitOps and CI/CD to empower domain teams to deploy independently.
Define and implement our observability strategy across metrics, logs, and tracing.

Finom is a European tech startup headquartered in Amsterdam, revolutionizing financial services for entrepreneurs. They offer an all-in-one financial B2B solution integrating banking, accounting, financial management, and invoicing into a mobile-first platform, with about 346 million in funding.


Senior Site Reliability Engineer

Redcare Pharmacy 22 days ago

Germany

Build and maintain end-to-end observability with ELK, Prometheus, and Grafana.
Own and improve CI/CD pipelines (CircleCI, GitLab CI, GitHub Actions, ArgoCD).
Lead incident response and postmortems in a blameless culture.

Redcare Pharmacy is Europe’s No.1 e-pharmacy, powered by passionate teams and cutting-edge innovation. They strive to create a healthy, collaborative work environment where every employee feels valued and inspired to contribute to their vision “Until every human has their health”.


Senior Site Reliability Engineer

MZLA Technologies Corporation 12 days ago

US 5w PTO

Design and develop CI/CD systems for websites, services, and release workflows, and operate an EKS-based Kubernetes platform.
Diagnose and debug production incidents, drive root-cause analysis, and implement improvements to enhance system reliability.
Write and maintain infrastructure as code using Pulumi or Terraform/OpenTofu across multiple AWS accounts with security-conscious practices.

Thunderbird is one of the world’s most trusted open-source email applications, empowering more than 20 million people globally. Our small but growing distributed team includes 65+ people across seven countries, and we build privacy-respecting communication tools with a collaborative, inclusive, and user-first spirit.


Staff SRE, Ads

Reddit 1 day ago

Europe

Lead reliability initiatives across multiple Ads domains including ad serving, auctions, targeting, reporting, measurement, and billing.
Partner with engineering leadership to improve reliability, scalability, operational excellence, and engineering efficiency across the Ads organization.
Design and build platforms, tooling, and automation that improve reliability and developer productivity at scale.

Reddit is a community of communities, built on shared interests, passion, and trust, home to the most open and authentic conversations on the internet. With 100,000+ active communities and approximately 126 million daily active unique visitors, it is one of the internet's largest sources of information.
