Join our Infrastructure Engineering team as an Infrastructure Engineer focusing on Observability! In this role, you'll design and operate metrics, logs, traces, and alerting pipelines, providing actionable insights for internal teams and external customers, ensuring reliability and transparency at scale.
Job listings
Assist with small development, automation, and system integration tasks. You will learn and explore integration patterns and cloud services with guidance from experienced team members, as well as gain exposure to modern architectures, cloud tools, and DevOps workflows. The company is expanding its team and is looking for individuals who are eager to learn and grow their skills in modern integration and cloud technologies.
As a Senior Engineer on the Platform Engineering IaCF team at Twilio, you will play a pivotal role in reducing variability in infrastructure provisioning and ensuring a consistent, high-quality environment for our cloud-native infrastructure. This role offers an exciting opportunity to contribute to building cutting-edge developer platforms within a dynamic and rapidly growing organization.
Become deeply familiar with all the corners of a critical SaaS platform utilized by millions of customers daily. Work to navigate a significant replatforming initiative, seamlessly moving dozens of critical components between container orchestration systems with zero downtime or customer impact. Identify, understand, and automate away manual processes through clever code and smart architecture. Support a 24x7 online environment as part of a global on-call rotation.
As the Site Reliability Engineer, you will own and manage Nametag's production cloud infrastructure, our continuous delivery infrastructure, and observability efforts. Working across the stack, you'll collaborate with teams to ensure that our systems meet the highest standards of performance and security while supporting our mission to provide trusted identity solutions.
Swiftly is looking for a DevOps Engineer to scale multi-tenant applications and take ownership of their work. The ideal candidate will be versatile, intellectually curious, and able to find the best tools to solve problems. Responsibilities include designing and driving consensus for the operational infrastructure, writing and maintaining infrastructural services, and ensuring uptime and scalability.
In this role, the Site Reliability Engineer (SRE) will be responsible for managing and resolving the most challenging issues for the ServiceNow SRE team, focusing on instance performance, reliability, and availability. This is a swing shift role (4 days a week) and the candidate must be located within the Republic of Ireland.
Design, deploy, and operate large-scale distributed systems across compute, storage, networking, and AI/ML environments. Lead projects from architecture to automation to intelligent monitoring, collaborating with both clients and teammates to build resilient, high-performing infrastructure. You'll operate and optimize Kubernetes clusters, Istio service mesh, and Linux-based systems, automating workflows using Go, Python, and Shell scripting.
The Cloud Infrastructure Engineering team builds and manages the foundational blocks of the ClickHouse Cloud data plane end-to-end. As a software engineer on the team, you will be responsible for designing, deploying, and maintaining ClickHouse's infrastructure, architecting and building a robust, scalable, and highly available distributed infrastructure, and building a cutting-edge cloud-native platform on top of the public cloud.
Work on foundational infrastructure and tackle complex technical challenges with industry-leading experts. Your contributions will shape flagship products like Rollup-Boost while driving forward essential open-source innovations. Lead development of high-performance backend infrastructure for rollups, including block builders, sequencers, and decentralized execution environments powered by TEE technology.