Spearhead our transition to Platform Engineering, driving innovation, streamlining processes, and fostering a culture of self-service. Guide our team in the evolution from traditional DevOps to a platform-centric approach. Design and implement a robust, self-service platform that empowers developers to accelerate delivery. Work closely with development, infrastructure, and security teams to ensure seamless integration.
Job listings
As a Federal SRE, you will help deliver 24x7 support for our Government Cloud infrastructure. This is a first-shift position with a 10-hour, 4-day work week, Sunday through Wednesday, with shift hours from 7 a.m. to 6 p.m. PT. Our SREs are empowered to drive technical resolutions across the technology stack, from hardware through to application and all stops in between.
In this hands-on "player/coach" role, lead the team responsible for the operational reliability of the bare metal infrastructure, networking, and system configuration that power our product offerings. You will help shape a critical function in a growing company, evolving the Network Operations Center (NOC) into a modern, proactive SRE function that leverages automation, data science, and reliability engineering principles.
Play a key role in shaping the future of our global infrastructure. Overseeing ~10,000 on-prem servers worldwide, you'll tackle unique technical challenges, engineer scalable systems, and have a direct impact on the reliability and performance of our products. Key responsibilities: build reliable infrastructure, automate everything, ensure observability, solve complex issues, and collaborate and innovate.
As Canva scales, change continues to be part of our DNA. The role involves designing, building, and maintaining core Kubernetes infrastructure components; extending and supporting tooling to build and manage virtual machines; investigating and resolving system performance and reliability issues; creating and refining automation to improve reliability and reduce operational overhead; and being on call for the team's products to drive operational excellence.
The Junior DevOps/Automation Engineer will lead, develop, and support automation projects related to server deployments and operations in our global environment. Required skills include: Linux and Windows operating system support, Ansible, Terraform, Bash scripting, PowerShell, server patching, and server OS and security hardening. The engineer will work with the Global Cloud Engineering Self Service Platform Automation team, developing and implementing infrastructure automation both on premises and across public clouds.
As a Senior Site Reliability Engineer on our Cloud Infrastructure Team, you'll play a pivotal role in maintaining and scaling our ground segment infrastructure. You'll collaborate across development, operations, and IT to ensure the integration, delivery, and reliability of services that support our space operations on Earth and in orbit. This is an exciting opportunity to work on cutting-edge technology and help build modern automated space infrastructure.
You'll bridge infrastructure and application deployment: designing cloud environments, building CI/CD pipelines, and automating for uptime. You will architect and implement cutting-edge Azure cloud infrastructure, leveraging IaC and CI/CD pipelines, directly impacting critical application scalability and reliability.