Jobs Similar to Site Reliability Engineer

Senior DevOps & Platform Engineer

About Us 15 days ago

Maximize the velocity of our product engineering team.
Ensure platform scalability, reliability, and security.
Champion best practices and shape the engineering culture.

They are building a robust, scalable trading platform to serve high-traffic, latency-sensitive applications. They leverage state-of-the-art technologies to support real-time trading while providing unparalleled reliability and performance.

Site Reliability Engineer II

InvestorFlow 11 days ago

Global

Design and implement comprehensive monitoring strategies.
Take ownership of production incident response, lead incident handling, and drive remediation.
Continuously improve operational processes, reliability practices, and team readiness.

InvestorFlow delivers industry specialized CRM and digital portals to help alternative asset firms find opportunities, create and manage relationships, and turn relationship insights into action. They serve over 175 clients, including 25 of the top 50 alternative asset managers, managing more than $6 trillion in assets.

Senior Site Reliability Engineer

Fixify 9 days ago

Europe

Design and maintain scalable, fault-tolerant infrastructure that supports our SaaS platform and keeps pace with business growth.
Define, document, and maintain SLIs, SLOs, and SLAs in partnership with product engineering, translating business commitments into technical guardrails.
Lead incident response with steady judgment, facilitate blameless postmortems, and drive remediation efforts that prevent recurrence.

Fixify is on a mission to reimagine how IT teams support companies. They need a Senior Site Reliability Engineer who finds joy in building systems that fade into the background, empowering product engineers to ship with confidence and their customers to work without interruption.

Site Reliability Engineering (SRE) Intern

AWP Safety 6 days ago

$30–$34/hr

US

Help deploy and configure Dynatrace OneAgent and ActiveGates with automated tooling.
Define and instrument user‑centric metrics and objectives in Dynatrace.
Combine Davis® AI with Copilot/Claude to identify root causes and reduce MTTR.

AWP Safety's IT Internship Program is a hands‑on learning experience for early‑career professionals who want to build a future in IT Site Reliability Engineering. They operate at the intersection of Software Engineering and Systems Operations, using Dynatrace to diagnose performance bottlenecks and automate "toil" out of existence.

Infrastructure Engineer

Attune 6 days ago

$120,000–$140,000/yr

US Unlimited PTO

Architect and manage scalable cloud infrastructure within AWS.
Implement and maintain infrastructure using Terraform.
Develop automation scripts to improve operational efficiency.

Attune empowers insurance agents with its technology solutions. They foster a remote-first culture and value employee development.

Senior Site Reliability Engineer

Pismo 7 days ago

Global

Own the end-to-end lifecycle (design, provisioning, upgrades, and decommissioning) of core platform components.
Lead the design and implementation of infrastructure bootstrap orchestration, including: Automated cluster and environment provisioning.
Apply and promote SRE practices across the platform, including: Clear ownership and runbooks for platform components.

Pismo provides a comprehensive processing platform for banking, card issuing and financial market infrastructure and helps customers innovate and build the next generation of banking and payment solutions. Pismo’s 500+ employees are located in more than 10 countries around the world.

Senior AWS DevOps Engineer (Remote, Poland) Contract

Nearform 20 days ago

Europe

Developing infrastructure to support cloud-based applications.
Creating deployment architecture and continuous delivery pipelines.
Designing high-availability approaches, and implementing monitoring architecture.

Nearform is a digital and AI engineering consultancy with a reputation for experience-led modernization. They focus on creating transformative digital products for enterprise customers across the UK and Ireland. Nearformers form a close-knit community built on trust and camaraderie.

Site Reliability Engineer

OnePay 10 days ago

US Unlimited PTO

Design, build, and maintain scalable infrastructure and tooling that improves reliability, performance, and availability across OnePay’s platform.
Contribute to the evolution of our observability stack, platform libraries, cloud architecture, and CI/CD pipelines.
Develop automation and monitoring systems to detect, prevent, and remediate incidents before they impact customers.

OnePay is a consumer fintech company trusted by millions of Americans to make money better, providing an all-in-one financial services platform. Backed by Walmart and Ribbit Capital, OnePay provides banking, savings, credit cards, lending, investing, and crypto services and embedded financial services to frontline workers.

Sr Site Reliability Engineer

Dataiku 29 days ago

Europe Middle East Africa

Design, deploy, and maintain cloud infrastructure to support a Dataiku SaaS offering, mainly on AWS, Azure, and GCP.
Continuously improve the infrastructure, deployment, and configuration to deliver more reliable, resilient, scalable, and secure services.
Automate all technical operations as much as possible.

Dataiku is The Universal AI Platform™, giving organizations control over their AI talent, processes, and technologies to unleash the creation of analytics, models, and agents. They connect many data science technologies and integrate the best of data and AI tech.

DevOps Engineer

Deepslate 6 days ago

Europe

Design, build, and manage our cloud infrastructure using modern tools (Pulumi) to ensure all infrastructure changes are reproducible, secure, and easily auditable.
Orchestrate and optimize our Kubernetes clusters for complex, compute-heavy AI workloads, guaranteeing maximum efficiency and fault tolerance.
Implement a flawless monitoring setup using Datadog and OpenTelemetry to make the black box of our distributed systems transparent, hunting down latency spikes or bottlenecks before they impact users.

Deepslate is building Speech to Speech Voice AI models that sound and act indistinguishable from a human, with the belief that everyone should be able to use it. Backed by top-tier investors from the Tech and AI sectors, we are incredibly well-funded and moving fast.

Staff Site Reliability Engineer, DevOps

Pismo 13 days ago

Global

Lead the implementation and optimization of CI/CD pipelines.
Develop and maintain Infrastructure as Code (IaC) scripts to automate infrastructure provisioning and management.
Identify and implement automation opportunities to improve efficiency and reduce manual effort.

Pismo, founded in 2016, provides a comprehensive processing platform for banking, card issuing, and financial market infrastructure, helping customers innovate and build next-generation banking and payment solutions. Pismo joined Visa in 2024 and has over 500 employees in more than 10 countries.

Sr Site Reliability Engineer

Pismo 8 days ago

South America

Own the end‑to‑end lifecycle of core platform components, including cloud infrastructure primitives and Kubernetes clusters.
Design platform components to be resilient by default, applying SRE principles like fault isolation and capacity planning.
Drive Infrastructure‑as‑Code and GitOps‑first practices to ensure platform components are reproducible and auditable.

Pismo, founded in 2016, provides a comprehensive processing platform for banking, card issuing, and financial market infrastructure, helping customers innovate in banking and payments. With over 500 employees across 10+ countries, Pismo joined Visa in 2024, leveraging Visa’s solutions to advance financial technology.

Site Reliability Engineer

Jobgether 29 days ago

LATAM

Monitor production systems, dashboards, logs, and alerts to ensure high availability and performance across distributed environments.
Assist in incident detection, triage, escalation, and resolution, following structured on-call rotations with mentorship support.
Maintain, follow, and continuously improve runbooks, operational procedures, and incident response workflows.

Jobgether is a platform that helps job seekers find the right opportunities. They use an AI-powered matching process to ensure applications are reviewed quickly and fairly.

Sr. DevOps Engineer

Jobgether 13 days ago

Europe

Implement SLI/SLO frameworks with error budgets to drive reliability decisions.
Design release strategies including blue/green deployments and version tracking.
Lead incident response and develop automated runbooks to reduce MTTR.

Jobgether is a company that helps connect individuals with jobs through an AI-powered matching process. They ensure applications are reviewed quickly, objectively, and fairly against roles' core requirements.

Staff Software Engineer - Grafana Cloud k6

Grafana Labs 15 days ago

$174,986–$209,983/yr

US 6w PTO

Build and scale a strong culture of operational excellence by defining standards and coaching teams to own reliability and availability.
Drive mature DevOps/SRE practices, including incident response and PIRs, on-call readiness, runbooks, alerting, observability, and release/change management.
Guide teams in the design, development, evolution, and operation of large-scale, distributed cloud systems.

Grafana Labs is a remote-first, open-source powerhouse with more than 20M users of Grafana, the open source visualization tool, around the globe. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack.

Senior SRE DevOps Engineer

Jobgether 12 days ago

Canada

Designing and implementing SLI/SLO frameworks with error budgets to guide reliability and performance decisions.
Building and maintaining AWS-based production infrastructure using Infrastructure as Code (Terraform, CloudFormation), including ECS, EKS/Kubernetes, and microservices orchestration.
Developing internal tools, automation frameworks, and reliability services in TypeScript, Python, or similar languages to enhance operational efficiency.

Jobgether uses an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. They identify the top-fitting candidates, and this shortlist is then shared directly with the hiring company.

Sr. Site Reliability Engineer, Security

CentralReach 8 hours ago

$160,000–$180,000/yr

US

Responsible for availability, latency, performance, efficiency, monitoring/observability, emergency response, and capacity planning.
Analyze, troubleshoot, and resolve operational challenges contributing to defined SLOs.
Manage site stability, performance, reliability, and maintain uptime for production environments.

CentralReach provides autism and IDD care software for Applied Behavior Analysis (ABA), multidisciplinary therapy, and special education. They are trusted by more than 200,000 users and are backed by Roper Technologies, Inc. (Nasdaq: ROP). Their culture is centered around impact, inclusion, and flexibility.

Sr. Site Reliability Engineer

Globality 24 days ago

$120,000–$180,000/yr

US

Develop automation code to provision and operate infrastructure at scale.
Build resilient, scalable, secure, and observable services with cost optimization.
Proactively identify and address security concerns across systems and infrastructure.

Globality uses AI to transform enterprise spending into a more efficient and inclusive process. They aim to revolutionize enterprise procurement with AI and have a culture built on trust, collaboration, and innovation, fostering an environment where every individual feels valued and included.

Senior Infrastructure Engineer/SRE

Cresta 11 days ago

$205,000–$270,000/yr

US Unlimited PTO

Partner with engineers to build dev tools that empower developer workflows and deployment infrastructure.
Ensure reliability of multi-cloud Kubernetes clusters and pipelines.
Focus on automation so we can spend energy where it matters.

Cresta is on a mission to turn every customer conversation into a competitive advantage by unlocking the true potential of the contact center. Their platform combines the best of AI and human intelligence to help contact centers discover customer insights and behavioral best practices.

Senior DevOps Engineer

CodeRoad 8 days ago

US

Design and maintain scalable cloud environments using tools like Terraform, CloudFormation, or Ansible.
Build and optimize automated deployment pipelines to ensure rapid and reliable software delivery.
Implement robust monitoring, logging, and alerting frameworks to ensure 24/7 system health.

CodeRoad offers end-to-end software development services, helping businesses scale with infrastructure solutions. They provide staff augmentation, dedicated IT teams, and software engineering to empower businesses in a digital landscape.
