Source Job

Canada Global

  • Lead, mentor, and foster a healthy, high-performing globally distributed engineering team.
  • Own the execution and delivery of highly critical, complex yearly roadmap items centered around large-scale foundational infrastructure upgrades, high availability, and platform resilience.
  • Own and drive the change management processes across engineering and product domains.

Kubernetes Terraform PostgreSQL Prometheus Grafana

20 jobs similar to DevOps Team Lead

Jobs ranked by similarity.

Europe

  • Lead the Infrastructure Engineering team, taking full ownership of cloud infrastructure, Kubernetes platforms, DevOps tooling, and CI/CD pipelines.
  • Drive reliability, scalability, and security across the production environment while maintaining a sharp focus on developer velocity and business impact.
  • Mentor and guide engineers across SRE, DevOps, and Database Reliability functions, fostering a culture of operational excellence and pragmatic problem-solving.

Finom is a European tech startup headquartered in Amsterdam, revolutionizing financial services for entrepreneurs with an all-in-one B2B platform. They have raised $346 million, are expanding across key EU markets, and foster innovation, prioritizing research and solutions that benefit users, employees, partners, and the business.

Canada EMEA Unlimited PTO

  • Evolve ArgoCD GitOps standards across environments.
  • Build reusable Terraform modules and practices for safe, repeatable cloud infrastructure provisioning and drift detection.
  • Lead the operation and evolution of production-grade Kubernetes clusters across cloud environments.

GitLab is the intelligent orchestration platform for DevSecOps. More than 50 million registered users and more than 50% of the Fortune 100 trust GitLab to ship better, more secure software faster.

Global

  • Lead, mentor, and grow the team of Platform Engineers.
  • Partner with Cybersecurity, Product, and Engineering teams.
  • Drive standards and frameworks for Infrastructure as Code.

Onebrief provides collaboration and AI-powered workflow software specifically for military staffs. The company aims to make staff faster, smarter, and more efficient. Founded in 2019, Onebrief's team spans veterans from all forces and global organizations, and technologists from leading-edge software companies.

Global

  • Own the end-to-end lifecycle (design, provisioning, upgrades, and decommissioning) of core platform components.
  • Lead the design and implementation of infrastructure bootstrap orchestration, including: Automated cluster and environment provisioning.
  • Apply and promote SRE practices across the platform, including: Clear ownership and runbooks for platform components.

Pismo provides a comprehensive processing platform for banking, card issuing and financial market infrastructure and helps customers innovate and build the next generation of banking and payment solutions. Pismo’s 500+ employees are located in more than 10 countries around the world.

US Canada

  • Own SLI/SLO/SLA definitions for the Akuity SaaS platform and drive continuous improvement.
  • Participate in an on-call rotation and act as incident commander for high-severity production events.
  • Partner with engineering teams to build reliability into new features before they ship to production.

Akuity helps enterprises ship software faster and more reliably with modern GitOps best practices. The Akuity Platform enables teams to manage the development and deployment across hundreds – if not thousands – of Kubernetes clusters from a single control plane.

India

  • Leading infrastructure strategy and driving DevOps best practices across the engineering organization
  • Helping engineers build reliable products by improving infrastructure and application monitoring, alerting, and tooling
  • Building tools and frameworks that help developers better understand and debug their systems and data

Aspire provides influencer marketing software and services for social commerce. They have helped brands build and manage relationships with millions of influencers and are trusted by over 800 top brands.

Global Unlimited PTO

  • Guide a globally distributed team in an all-remote setting.
  • Improve deployment patterns and observability for customers.
  • Participate in incident management to support GitLab.com availability.

GitLab is the intelligent orchestration platform for DevSecOps, enabling organizations to increase developer productivity, improve operational efficiency, reduce security and compliance risk, and accelerate digital transformation. With over 50 million registered users, they foster a high-performance culture driven by values and continuous knowledge exchange.

Global

  • Own the full product lifecycle: understand user problems, define technical specs, prototype (AI tools encouraged), and work closely with developers to ship high-quality releases.
  • Collaborate with the Product Lead and CTO to shape and scope features for each 8-week release cycle.
  • Make informed tradeoffs — balancing simplicity, technical feasibility, and long-term sustainability.

Kestra is the universal orchestration platform, open source and declarative, designed to orchestrate data pipelines, IT automation, business workflows, and AI/agentic systems. They are trusted by over 10,000 organizations worldwide with a fast-growing global community.

$230,000–$250,000/yr
US Unlimited PTO 12w paternity

  • Define and evolve reliability standards for the SmarterDx platform.
  • Enhance observability systems (metrics, logs, traces, alerting) to provide actionable insights and reduce mean time to detect (MTTD) and mean time to resolve (MTTR).
  • Reduce operational toil through automation, self-healing systems, and improved deployment and rollback mechanisms.

SmarterDx, a Smarter Technologies company, builds clinical AI that is transforming how hospitals translate care into payment. Founded by physicians in 2020, their platform connects clinical context with revenue intelligence, helping health systems recover millions in missed revenue, improve quality scores, and appeal every denial.

South America

  • Own the end‑to‑end lifecycle of core platform components, including cloud infrastructure primitives and Kubernetes clusters.
  • Design platform components to be resilient by default, applying SRE principles like fault isolation and capacity planning.
  • Drive Infrastructure‑as‑Code and GitOps‑first practices to ensure platform components are reproducible and auditable.

Pismo, founded in 2016, provides a comprehensive processing platform for banking, card issuing, and financial market infrastructure, helping customers innovate in banking and payments. With over 500 employees across 10+ countries, Pismo joined Visa in 2024, leveraging Visa’s solutions to advance financial technology.

US Canada 16w maternity

  • Build and deploy computing services and infrastructure in customer environments.
  • Clarify and surface requirements from ambiguous use cases defined by cross-functional stakeholders.
  • Improve reliability and scalability by resolving edge cases, studying failure modes, and writing tests.

Planet designs, builds, and operates the largest constellation of imaging satellites in history. They deliver an unprecedented dataset of empirical information via a revolutionary cloud-based platform to authoritative figures in commercial, environmental, and humanitarian sectors. Planet takes a people-centric approach to culture and community, and strives to iterate in a way that puts its team members first and prepares the company for growth.

$103,174–$117,720/yr
Canada

  • Lead efforts to scale and improve our infrastructure.
  • Develop and support internal team tooling.
  • Troubleshoot, debug and resolve issues as part of a shared on-call rotation.

Lillio, formerly HiMama, empowers early childhood educators through innovative tools. They are a Series B, private-equity backed company recognized as an industry leader and selected in 2025 by Time Magazine as one of the world's top EdTech companies.

$119,900–$193,200/yr
Global

  • Build, maintain, and release C8 distributions for our self-managed customers.
  • Improve reliability and usability of our deployment artifacts.
  • Collaborate cross-functionally with Development Teams, Product Management, Quality Assurance, Documentation, and Support.

Camunda is the leader in enterprise agentic automation, orchestrating complex business processes across agents, people, and systems. They are a fully remote, global company named a Great Place to Work, growing fast and looking for top talent to join their team.

North America

  • Provide strategic leadership for the CPE organization, spanning cloud infrastructure, platform services, and operational enablement.
  • Lead and develop CPE managers and senior technical leaders, setting clear expectations for execution, quality, and delivery.
  • Ensure platform reliability, scalability, and security through strong operational processes, observability, and incident management.

Kinaxis is a global leader in end-to-end supply chain management, enabling supply chain excellence for all industries. They have over 2000 employees around the world and are working towards solving some of the biggest challenges facing supply chains today.

Global

  • Ensure the availability, reliability, performance, and security of our SaaS platform.
  • Lead infrastructure automation efforts using Infrastructure as Code and Configuration Management tools.
  • Define and monitor SLAs/SLOs/SLIs, and drive service quality improvements.

Remote People builds the infrastructure to power borderless teams. Their technology enables businesses to hire anyone anywhere compliantly at the push of a button. They are committed to building a global, diverse team representing different and varied backgrounds, perspectives, and experiences.

Canada

  • Design and manage CI/CD and deployment pipelines.
  • Collaborate with product teams to implement cloud best practices.
  • Automate code changes, testing, and analysis using CI tools.

Jobgether is a platform that uses AI to match candidates with jobs. They ensure applications are reviewed quickly, objectively, and fairly against the role's core requirements.

LATAM Unlimited PTO

  • Tech lead two teams (DevEx and Cloud Infrastructure) totaling 6–8 engineers: set technical direction, review key designs/changes, and raise engineering standards across both domains.
  • Own the delivery toolchain end-to-end (Git, CI, deployments/releases): reduce flakiness, improve build/test times, make releases repeatable with clear rollback, and drive adoption of org-wide standards through tooling, docs, and supported migrations.
  • Improve the software development lifecycle (setup → build/test → PR → deploy → observe) and standardize environments so teams spend less time on tooling and more time shipping.

Traackr is a global SaaS technology company providing a data-driven influencer marketing platform that marketers use to optimize investments, streamline campaigns, and scale programs. They are a remote-first company with offices in San Francisco, New York, Boston, Paris, and London and operate on a culture of mutual respect.

Europe

  • Own the container-based application lifecycle, bi-weekly releases, and CI/CD pipelines for GMS.
  • Manage deployments on customer-isolated Kubernetes clusters running stateful applications.
  • Ensure high availability and performance by meeting contractual SLAs through proactive monitoring and alert response.

Planet designs, builds, and operates the largest constellation of imaging satellites in history, delivering data via a cloud-based platform. They are both a space company and data company with a people-centric approach, striving to put their team members first.

US

  • Design and maintain scalable cloud environments using tools like Terraform, CloudFormation, or Ansible.
  • Build and optimize automated deployment pipelines to ensure rapid and reliable software delivery.
  • Implement robust monitoring, logging, and alerting frameworks to ensure 24/7 system health.

CodeRoad offers end-to-end software development services, helping businesses scale with infrastructure solutions. They provide staff augmentation, dedicated IT teams, and software engineering to empower businesses in a digital landscape.

Global Unlimited PTO

  • Guide the technical vision for GitLab’s cloud-native, self-managed deployments and upgrade workflows.
  • Design and maintain Kubernetes Operators, Helm charts, and upgrade orchestration tooling for self-managed GitLab deployments.
  • Drive observability, testing, performance, and resilience practices for self-managed deployments.

GitLab is the intelligent orchestration platform for DevSecOps, enabling organizations to increase developer productivity, improve operational efficiency, reduce security and compliance risk, and accelerate digital transformation. With over 50 million registered users, GitLab emphasizes AI integration and a high-performance culture driven by values and continuous knowledge exchange.