Jobs Similar to Site Reliability Engineer | TangerineFeed

Site Reliability Engineer

Redis 5 hours ago

$161,637–$175,000/yr

US

Handle technical escalations and engage in complex troubleshooting within a Follow-the-Sun (FTS) support model.
Develop automation frameworks and regression test suites (Python, Bash) to streamline deployment and testing processes.
Troubleshoot and manage incidents for production systems (cloud infrastructure, TCP/IP networking) and work with NoSQL databases (Redis).

Python Bash Grafana Prometheus Git

20 jobs similar to Site Reliability Engineer

Jobs ranked by similarity.

Site Reliability Engineer

Granicus 3 days ago

Global

Provide production support on a shift according to the team on-call roster.
Work on tickets raised by customers and internal engineering/implementation teams while not on call for production support.
Continuously monitor the health and performance of our services, systems, and infrastructure.

Granicus is driven by the excitement of building, implementing, and maintaining technology that is transforming the Govtech industry by bringing governments and their constituents together. They have served 5,500 federal, state, and local government agencies and more than 300 million citizen subscribers.

Site Reliability Engineer

Ivanti 22 days ago

US

Deploy, manage, and secure Ivanti’s production Software-as-a-Service (SaaS) environments in AWS and Azure
Automate common and repetitive tasks
Participate in on-call rotations for 24x7 coverage (follow-the-sun model) for incident response, issue triage, and problem resolution

Ivanti's mission is to elevate human potential within organizations by managing, protecting and automating technology for continuous innovation. They are committed to building a diverse team and fostering an inclusive environment where everyone belongs.

Site Reliability Engineer

Newton 13 days ago

Canada

Implementing improvements to the reliability, fault tolerance, scalability, and performance of our infrastructure
Managing incidents using your technical know-how to involve the appropriate teams and automate away manual practices
Improving observability across our systems (metrics, logs, tracing) to reduce time to detection and resolution

Newton is changing how Canadians trade crypto, with the goal of making financial freedom achievable for everyone by giving their customers the tools and knowledge needed to navigate the crypto world. They are a remote team spread across Canada that values pushing boundaries and getting things done.

Site Reliability Engineer, Production Reliability

Yelp 3 days ago

$135,000–$185,000/yr

Canada

Working with engineers across Yelp in supporting new features and services.
Integrating tools to monitor platform stability and performance.
Help scale our Kubernetes clusters and AWS-based infrastructure while maintaining our platform's SLOs.

Yelp's engineering culture values individual authenticity and encourages creative solutions. They focus on helping users, growing as engineers, and having fun in a collaborative environment.

Senior Site Reliability Engineer

Kraken 2 days ago

Americas

Manage and support infrastructure for Growth teams, including Nomad, Hashistack, databases, and any other underlying systems
Maintain and troubleshoot GitLab CI pipelines, ensuring reliable and fast build, test, and deployment cycles
Provide operational support across Onboarding, Acquire, and Engage teams, helping debug issues in staging and production environments

Kraken is a mission-focused company rooted in crypto values, aiming to accelerate the global adoption of crypto, so that everyone can achieve financial freedom and inclusion. As a fully remote company, they have Krakenites in 70+ countries who speak over 50 languages.

Senior Site Reliability Engineer - Remote

ClickHouse 28 days ago

$141,000–$230,000/yr

US

Collaborate with engineering teams to design and implement scalable, secure systems.
Establish and manage service level objectives (SLOs) and service level agreements (SLAs).
Enhance incident response processes and post-mortem analysis for outages.

ClickHouse, recognized on the 2025 Forbes Cloud 100 list, is one of the most innovative and fast-growing private cloud companies. With more than 3,000 customers and ARR that has grown over 250 percent year over year, ClickHouse leads the market in real-time analytics, data warehousing, observability, and AI workloads.

Site Reliability Engineer

Arista Networks 15 days ago

Europe

Design, build, and deploy production systems with a focus on scalability and security.
Develop and maintain comprehensive automation solutions to streamline operational efficiency.
Proactively monitor systems, establish alerting strategies, and implement automated incident response.

Arista Networks is a data-driven, client-to-cloud networking company for large data center, campus, and routing environments. They have over $8 billion in revenue and value diversity of thought and perspectives, fostering an inclusive environment for creativity and innovation.

DevOps Databricks Engineer

Breezy 29 days ago

Europe

Strong Cloud Engineering expertise, primarily Azure.
Proficiency with Databricks platform administration.
Experience working in Agile/Scrum or SAFe environments.

Site Reliability Engineer (f/m/n)

InPost Group 21 days ago

Europe

Write code, automate everything, design for reliability, and deeply understand the systems.
Build or extend Terraform modules and contribute to Platform Engineering around Observability.
Collaborate with developers to shape feature design so that reliability is built in, not added later.

InPost Group is an innovative European out-of-home delivery company, revolutionizing the way parcels are delivered to customers. With over 10,000 employees worldwide, InPost Group is one of the largest out-of-home delivery providers in Europe, committed to providing sustainable and efficient delivery solutions.

Site Reliability Engineer

Ooma 24 days ago

$110,000–$175,000/yr

US

Become a subject matter expert in applications supporting Ooma customers.
Collaborate with Development, QA and other SREs to evaluate, deploy, and debug applications.
Improve observability by implementing, refining, and adjusting application monitoring and thresholds.

Ooma empowers people to connect in smarter ways by creating powerful communication experiences through their cloud-based platform. They help small business owners stay connected, provide customized unified communications solutions, and offer smart home security solutions.

Staff Software Engineer - Grafana Cloud Observability, Kubernetes Monitoring

Jobgether 29 days ago

$174,986–$209,983/yr

US 6w PTO

Design, implement, and maintain scalable integrations for metrics, logs, and traces across cloud and Kubernetes environments.
Build middleware, libraries, and services to simplify development and observability workflows.
Lead technical direction and strategic planning for observability projects.

They are currently looking for a Staff Software Engineer - Grafana Cloud Observability, Kubernetes Monitoring in United States. This role offers a unique opportunity to shape and advance cloud observability solutions for large-scale systems, focusing on metrics, logs, and traces.

Service Reliability Engineer

Mambu 13 days ago

Lead and resolve technically deep Level 2 support cases from initial triage to full root cause analysis and final fix.
Diagnose issues across distributed, cloud-native systems, with emphasis on application and API behaviour.
Perform code-level debugging (Python, Go, or Java) to pinpoint application defects or misconfigurations.

Mambu is a leading SaaS cloud banking platform. They are on a mission to make banking better for a billion people and shape the future of financial services.

Staff Site Reliability Engineer

SmarterDx 27 days ago

$230,000–$250,000/yr

US Unlimited PTO 12w paternity

Define and evolve reliability standards for the SmarterDx platform.
Enhance observability systems (metrics, logs, traces, alerting) to provide actionable insights and reduce mean time to detect (MTTD) and resolve (MTTR).
Reduce operational toil through automation, self-healing systems, and improved deployment and rollback mechanisms.

SmarterDx, a Smarter Technologies company, builds clinical AI that is transforming how hospitals translate care into payment. Founded by physicians in 2020, their platform connects clinical context with revenue intelligence, helping health systems recover millions in missed revenue, improve quality scores, and appeal every denial.

Senior Cloud Application Support Engineer

Atmosera 21 days ago

Latin America

Execute expert-level real-time monitoring and incident dispositioning for critical client applications.
Correlate complex data across metrics, traces, and logs to perform deep-dive root cause analysis.
Lead the triage of complex alerting environments to filter noise and ensure that high-priority incidents are managed.

Atmosera empowers businesses to redefine what's possible with modern technology and human expertise. They enable organizations to accelerate innovation, enhance security, and optimize operational agility as a Microsoft Partner.

Database Reliability Engineer

Wavelo 2 days ago

$92,295–$102,541/yr

Canada

Design, implement, and operate highly available PostgreSQL clusters.
Optimize query performance and indexing strategies.
Build and maintain automation for deployment tasks.

Wavelo provides flexible software that modernizes how communication service providers (CSPs) do business, helping them drive more value, focus on customer experience, and scale their operations faster. As part of Tucows, Wavelo is backed by outstanding resources and talent, embracing a people-first philosophy rooted in respect, trust, and flexibility.

Sr. Site Reliability Engineer, Security

CentralReach 28 days ago

$160,000–$180,000/yr

US

Responsible for availability, latency, performance, efficiency, monitoring/observability, emergency response, capacity planning.
Analyze, troubleshoot, and resolve operational challenges contributing to defined SLOs.
Manage site stability, performance, reliability, and maintain uptime for production environments.

CentralReach provides autism and IDD care software for Applied Behavior Analysis (ABA), multidisciplinary therapy, and special education. They are trusted by more than 200,000 users and are backed by Roper Technologies, Inc. (Nasdaq: ROP). Their culture is centered around impact, inclusion, and flexibility.

Senior AI-Enabled DevOps Engineer

PointClickCare 7 days ago

$134,000–$149,000/yr

US

Design, implement, and operate cloud-native infrastructure for production workloads.

PointClickCare's mission is to help providers deliver exceptional care. They are a leading, founder-led, privately held health tech company that empowers its employees to push boundaries, innovate, and shape the future of healthcare. With the largest long-term and post-acute care dataset and a Marketplace of 400+ integrated partners, their platform serves over 30,000 provider organizations.

Site Reliability Engineer

Upsun 30 days ago

Europe Unlimited PTO

Enhance system monitoring with tools like Prometheus, Grafana, and ELK Stack, ensuring visibility and alignment with business objectives.
Transition manual processes to automated solutions using IaC tools (e.g., Terraform, Ansible) to streamline deployments and improve operational efficiency.
Improve pipeline architecture for fast, reliable releases, ensuring scalability and resilience to handle high volumes of changes.

Upsun (formerly Platform.sh) is a cloud application platform designed for hybrid teams, enabling developers, DevOps engineers, and platform teams to build, ship, and scale confidently without backend infrastructure hassles. Upsunners are a remote, global workforce committed to open source and an open, welcoming environment, valuing curiosity, knowledge, and innovative ideas.

Site Reliability Engineer (SRE)

Arthur Grand 27 days ago

Mexico

Strong SRE professional for a Remote position.
Deep expertise in reliability engineering and automation.
Experience with cloud platforms is a great opportunity.

Arthur Grand is an IT services firm specializing in Digital Transformation initiatives for Federal, Commercial, State & local customers. With a culture of delivery excellence and a commitment to bringing the best talent, they have earned an unparalleled reputation for delivering transformative results.

Network Operations Engineer

Polygon Labs 23 days ago

Global Unlimited PTO

Monitor the health and performance of Polygon Labs’ infrastructure, including blockchain networks, bridges, RPC services, staking systems, and user-facing products
Track third-party dependencies and identify degradation that may impact the broader ecosystem
Validate and triage alerts by distinguishing signal from noise, assessing severity, and determining impact

Polygon Labs is a global blockchain payments company building and operating infrastructure to move money instantly, reliably, and at internet scale, with the mission to move all money onchain. Its infrastructure has facilitated trillions of dollars in onchain value transfer and supported millions of transactions daily.
