The Site Reliability Engineering (SRE) team aims to maximize the engineering velocity of developer teams while keeping products reliable. You will be responsible for the maintenance of sandbox and staging environments and the automation pipeline to ensure continuous testing. Help to develop and spread the DevOps culture, create and maintain development sandbox environments, and automate and orchestrate workloads in cloud environments.
Job listings
Lead the architecture, implementation, and management of a fully mutable, distributed hybrid infrastructure spanning AWS and RunPod. Build a cloud- and provider-agnostic stack that can scale to support millions of users engaging with cutting-edge generative AI and real-time video services.
Take full ownership of infrastructure and DevOps processes for our Data Platform: streamline, automate, and elevate the backend systems that power our analytics and data-driven products. This is a hands-on and collaborative role where you'll have the opportunity to reshape core infrastructure, focusing on AWS, CI/CD systems, database ops, and performance monitoring. Directly support some of the most critical projects in the company.
We are looking for a talented and experienced Software Engineer to join our Data Platform team to play a crucial part in designing, building, and optimizing our platform to support a wide range of data-driven initiatives. You will architect scalable solutions, and implement data solutions and infrastructure for our Data Platform, leveraging data to drive business impact.
The Senior Software Developer will join the Enabling team to make informed suggestions about architectural, tooling, frameworks, and ecosystem choices that affect the tool stack, increasing the autonomy of stream-aligned teams by focusing on problems rather than solutions. Reporting to the Team Lead, Engineering, you will work with the Engineering Department to build the software that powers the Zensurance data collection, rating and pricing engines.
Join Filevine as a Database Reliability Engineer and play a key role in scaling, optimizing, and ensuring the performance of our high-volume SQL Server and PostgreSQL environments. You'll lead with technical excellence, collaborate across teams, and mentor others, all while working on complex, meaningful challenges that power a critical legal tech platform.
As a Senior Infrastructure Engineer, you'll play a critical role in designing, implementing, and maintaining the core infrastructure that powers our hit mobile games. You'll work closely with development teams to ensure our systems are highly available, performant, and secure, supporting millions of players worldwide. Responsibilities include designing scalable infrastructure on AWS, troubleshooting production issues, and driving continuous improvement.
The Compute and Networking team is the foundation of Attentive's engineering organization, focusing on building and operating compute, storage, networking, and deployment systems that power large-scale operations. As an Engineering Manager, you will lead a team responsible for Kubernetes orchestration, networking, and service mesh infrastructure. You'll define and execute a roadmap, partner with other teams, and drive automation and observability improvements.
As Senior Infrastructure Engineer at Notabene, you will play a key role in managing and maintaining our core infrastructure, including AWS and Kubernetes environments. Your expertise will ensure the reliability, scalability, and security of our platform, directly supporting the development teams and enabling seamless deployment and operation of applications. By providing critical infrastructure support and collaborating across teams, you will help drive the stability and growth of our technology foundation.
Be responsible for building and improving our observability platform and tooling, which is used by all Canva engineers. Provide technical leadership and expertise to drive pragmatic solutions and achieve impactful design decisions. Brainstorm, research and prototype to optimize our tracing and exceptions platforms, improve our operational effectiveness and increase reliability. Find ways to improve the use of traces and exceptions, providing better insights to our engineers.