We're looking for a talented Data Engineer, Cloud Platform to join our growing team! You'll lead transformative cloud projects, implementing everything-as-code and DevSecOps best practices to enhance efficiency and security. You'll collaborate with stakeholders, drive mission-critical initiatives, and deliver impactful solutions aligned with client goals and regulations.
Job listings
In this role, the Site Reliability Engineer (SRE) will be responsible for managing and resolving the most challenging issues for the ServiceNow SRE team, focusing on instance performance, reliability, and availability. This is a swing shift role (4 days a week) and the candidate must be located within the Republic of Ireland.
Play a key role in scaling and supporting H1's cloud infrastructure. Work closely with engineering and data teams to improve the reliability, visibility, and efficiency of our systems and deployment pipelines. This is a hands-on role focused on automation, enablement, and operational excellence in a fast-paced AWS-based environment.
This opportunity involves designing and driving robust, automated solutions that optimize CI/CD pipelines and cloud infrastructure utilizing tools like GitLab, AWS CloudFormation, SAM templates, CDK, and Terraform. This role helps provide teams with tools that enable consistent, high-quality software delivery through reliable and secure infrastructure management. You will lead the execution of infrastructure strategies.
As a Senior DevOps Engineer, you will continuously improve our development operations and support the reliability and availability of all our applications and services deployed to the cloud. Partner with various engineering teams to own and manage availability, latency, performance, reliability and scalability of all services to maintain SLAs that our customers expect from us. Provide strong technical leadership and people management to the team.
Design, deploy, and operate large-scale distributed systems across compute, storage, networking, and AI/ML environments. Lead projects from architecture to automation to intelligent monitoring, collaborating with both clients and teammates to build resilient, high-performing infrastructure. You'll operate and optimize Kubernetes clusters, Istio service mesh, and Linux-based systems, automating workflows using Go, Python, and Shell scripting.
Join Granicus as a Site Reliability Engineer! You will be pivotal in ensuring the reliability, scalability, and performance of our services, leading efforts to build and maintain a robust infrastructure, automate processes, and guide the team in implementing site reliability best practices. This role involves on-call production support, system monitoring, incident management, and collaboration with software engineers.
Join our dynamic IT team as a Mid-Level Site Reliability Engineer (SRE 2). Ensure the reliability, availability, and performance of our services. Troubleshoot incidents, automate processes, and collaborate with software engineers to enhance system performance. Implement security best practices to protect our systems and data.
Pythian is building a next-generation Site Reliability Engineering team, and we're looking for talented, motivated engineers who thrive in fast-paced, problem-solving environments. As an SRE, you'll design, deploy, and operate large-scale distributed systems across compute, storage, networking, and AI/ML environments. You'll lead projects from architecture to automation to intelligent monitoring, collaborating with both clients and teammates to build resilient, high-performing infrastructure.
As one of the first joiners to our Reliability Engineering Team at ClickHouse, you will be responsible for building and leading processes to ensure the reliability, availability, scalability, and performance of our cloud infrastructure that runs ClickHouse databases. You will collaborate with different teams and guide them to design and implement scalable, secure, highly available and fault-tolerant distributed systems.