Latest Remote Prometheus DevOps Jobs (5+)

Senior Site Reliability Engineer

Finom 17 days ago

Design and operate our Kubernetes ecosystem with a focus on high availability and zero-downtime operations.
Own and evolve our PaaS strategy, using GitOps and CI/CD to empower domain teams to deploy independently.
Define and implement our observability strategy across metrics, logs, and tracing.

Finom is a European tech startup headquartered in Amsterdam, revolutionizing financial services for entrepreneurs. They offer an all-in-one financial B2B solution that integrates banking, accounting, financial management, and invoicing into a mobile-first platform. The company has about 346 million in funding.

Staff Site Reliability Engineer, Database

Alpaca 25 days ago

North America

Triage difficult technical problems and implement solutions
Improve our observability stack (monitoring, logging, profiling)
Incident Management: Respond to and resolve incidents in a timely manner, conducting post-incident reviews to identify and implement improvements.

Alpaca is a self-clearing broker-dealer and brokerage infrastructure for stocks, ETFs, options, crypto, fixed income, 24/5 trading, and more. They are a dynamic team of 380+ globally distributed members.

Infrastructure Engineer (Observability)

Lightning AI 26 days ago

$180,000–$200,000/yr

US

Own and evolve a scalable observability platform spanning metrics, logs, traces, and events.
Design telemetry pipelines ingesting data from GPUs, CPUs, networking, containers, APIs, and BMC/Redfish.
Design and implement noise-resistant alerting systems to improve signal quality and reduce operational load.

Lightning AI builds an end-to-end platform for developing, training, and deploying AI systems, designed to take ideas from research to production with less friction. They combine developer-first software with cost-efficient, large-scale compute, serving solo researchers, startups, and large enterprises.

Staff Software Engineer

Found 27 days ago

$210,000–$278,000/yr

US · Unlimited PTO

Architect future iterations of core systems, addressing scaling requirements.
Design and implement developer tools to enhance deployment safety and reproducibility.
Drive excellence in monitoring and guide incident response for quick issue resolution.

Found provides tools for self-employed individuals, offering a business bank account that automates taxes and expense tracking. They aim to give self-employed people the security and peace of mind historically available only at large corporations and are looking for kind, resourceful, and passionate people.

Senior Site Reliability Engineer - Ireland

Arista Networks 28 days ago

Design, build, and deploy production systems with a focus on scalability, reliability, observability, and performance.
Develop and maintain comprehensive automation solutions to eliminate toil and streamline operational efficiency.
Proactively monitor production systems and implement automated incident response mechanisms to minimise downtime.

Arista Networks is an industry leader in data-driven, client-to-cloud networking for large data center, campus and routing environments. The company is well-established and profitable with over $8 billion in revenue and values diversity and inclusivity.
