Source Job

$110,000–$216,000/yr
US

  • Contribute to planning and implementation to enhance stability and reliability.
  • Apply software engineering and SRE principles to create tools and automation.
  • Continuously improve, evolve and innovate the observability of Commerce’s platform architecture.

Golang Ruby Terraform SQL

20 jobs similar to Lead Infrastructure Engineer

Jobs ranked by similarity.

US

  • Work as part of a small, cross-functional XP team installing Imogen into client cloud environments.
  • Pair program with other engineers and collaborate closely with product managers and designers.
  • Work with client infrastructure, security, and network teams to deploy Imogen to their cloud infrastructure.

Mechanical Orchard builds Imogen, a modernization platform for rewriting business applications. Their Delivery team operates in complex environments and values strong fundamentals, AI, and collaboration.

$172,614–$172,614/yr
US

  • Design infrastructure, networking, and software platform architecture.
  • Build and maintain automation of Continuous Integration and Continuous Deployment pipelines.
  • Troubleshoot infrastructure, internal applications, networking, and security issues.

Loadsmart is a technology company focused on the logistics and supply chain industry. They leverage data and technology to automate and optimize freight transportation, connecting shippers and carriers to streamline the shipping process. They are a mid-sized company passionate about transforming the future of freight.

Global

  • Build and own the foundational infrastructure that our products run upon.
  • Work directly on our products' Golang code base to implement SRE-related objectives.
  • Take a data-driven approach to quantifying system performance and reliability.

LiveKit provides the network infrastructure for multimodal AI interfaces, enabling seamless audio and visual interactions. Founded in 2021, LiveKit supports over 3 billion calls annually, with 100,000+ developers and industry giants like OpenAI, Spotify, and Meta.

$165,000–$195,000/yr
US

  • Support and operate Legion’s AWS-based cloud platform and Kubernetes (EKS) environments.
  • Build and maintain infrastructure-as-code using Terraform.
  • Improve CI/CD pipelines to increase deployment safety and velocity.

Legion Technologies delivers the industry’s most innovative workforce management platform. The AI-driven Legion WFM platform maximizes labor efficiency and employee engagement. They are a remote, mission-driven team that embraces a collaborative, fast-paced, and entrepreneurial culture.

$230,000–$250,000/yr
US Unlimited PTO 12w paternity

  • Define and evolve reliability standards for the SmarterDx platform.
  • Enhance observability systems (metrics, logs, traces, alerting) to provide actionable insights and reduce mean time to detect (MTTD) and resolve (MTTR).
  • Reduce operational toil through automation, self-healing systems, and improved deployment and rollback mechanisms.

SmarterDx, a Smarter Technologies company, builds clinical AI that is transforming how hospitals translate care into payment. Founded by physicians in 2020, their platform connects clinical context with revenue intelligence, helping health systems recover millions in missed revenue, improve quality scores, and appeal every denial.

US

  • Design and implement core backend systems and integrations that power the product.
  • Design and build our data platform (orchestration, pipelines, developer workflows).
  • Create scalable systems for data ingestion, transformation, and access.

Avantos is building the AI-native operating system for financial services, transforming fragmented data into a single, intelligent system. They power workflows, automation, and decision-making as a product-led, fast-moving team in AI, fintech, and infrastructure.

LATAM Unlimited PTO

  • Tech lead two teams (DevEx and Cloud Infrastructure) totaling 6–8 engineers: set technical direction, review key designs/changes, and raise engineering standards across both domains.
  • Own the delivery toolchain end-to-end (Git, CI, deployments/releases): reduce flakiness, improve build/test times, make releases repeatable with clear rollback, and drive adoption of org-wide standards through tooling, docs, and supported migrations.
  • Improve the software development lifecycle (setup → build/test → PR → deploy → observe) and standardize environments so teams spend less time on tooling and more time shipping.

Traackr is a global SaaS technology company providing a data-driven influencer marketing platform that marketers use to optimize investments, streamline campaigns, and scale programs. They are a remote-first company with offices in San Francisco, New York, Boston, Paris, and London and operate on a culture of mutual respect.

$141,000–$230,000/yr
US

  • Collaborate with engineering teams to design and implement scalable, secure systems.
  • Establish and manage service level objectives (SLOs) and service level agreements (SLAs).
  • Enhance incident response processes and post-mortem analysis for outages.

ClickHouse, recognized on the 2025 Forbes Cloud 100 list, is one of the most innovative and fast-growing private cloud companies. With more than 3,000 customers and ARR that has grown over 250 percent year over year, ClickHouse leads the market in real-time analytics, data warehousing, observability, and AI workloads.

Canada

  • Design, implement, and maintain scalable platform services and APIs.
  • Build and improve APIs and plugin frameworks enabling partner integrations.
  • Improve platform reliability, observability, and application performance.

HappyCo builds mobile and cloud solutions to enable real-time property data. Their flagship product suite ‘Happy Property’ has more than 5 million units on its platform and they strive to build better communities with an inclusive, supportive culture where people can grow their careers.

Global Unlimited PTO

  • Help self-managed customers run GitLab reliably.
  • Make GitLab easier to deploy and more secure by default.
  • Improve installation, upgrade, and day-to-day operations.

GitLab is the intelligent orchestration platform for DevSecOps. They enable organizations to increase developer productivity, improve operational efficiency, reduce security and compliance risk, and accelerate digital transformation. More than 50 million registered users trust them.

$170,000–$240,000/yr
US 4w PTO

  • Own our fundamental cloud services and tooling.
  • Own our application platform.
  • Own our developer experience.

Propel builds technology that strengthens the social safety net. They are a passionate team of ~100 Propellers who envision a future where every American has the tools and resources they need to thrive, offering a remote-first working environment with headquarters in Brooklyn.

  • Designing, building, and operating Kubernetes infrastructure across multiple cloud providers.
  • Building and maintaining automation for cluster lifecycle management, node provisioning, and provider onboarding.
  • Developing platform tooling and abstractions that enable other Canva engineers to deploy and scale workloads.

Canva is a design platform redefining how the world experiences design. They have campuses in Sydney and Melbourne, along with co-working spaces in Brisbane, Perth and Adelaide, offering a flexible and inclusive work environment.

North America Europe Middle East Asia Unlimited PTO

  • Architect, implement, and maintain sophisticated developer tooling, frameworks, and automation to streamline and enhance software development processes.
  • Lead improvements and optimizations of CI/CD pipelines to ensure fast, reliable, and secure software deployments.
  • Proactively identify, troubleshoot, and resolve bottlenecks within development workflows, continuously improving developer productivity and satisfaction.

Galaxy is a global leader in digital assets and data center infrastructure, delivering solutions that accelerate progress in finance and artificial intelligence. Their team blends deep crypto expertise with institutional experience and a shared commitment to shaping the future of Web3 and AI.

US

  • Serve as a primary architect of our CI/CD vision, ensuring delivery speed and compliance posture accelerate together as Aledade scales.
  • Lead the evolution of a "Universal Pipeline" by building automation and guardrails to ensure every deployment is HIPAA-compliant by default.
  • Foster a high-velocity engineering culture where security, compliance, and audit evidence are seamless side-effects of a delivery lifecycle.

Aledade partners with independent practices, health centers, and clinics to build and lead Accountable Care Organizations (ACOs) anchored in primary care.

US Canada 16w maternity

  • Build and deploy computing services and infrastructure in customer environments.
  • Clarify and surface requirements from ambiguous use cases defined by cross-functional stakeholders.
  • Improve reliability and scalability by resolving edge cases, studying failure modes, and writing tests.

Planet designs, builds, and operates the largest constellation of imaging satellites in history. They deliver an unprecedented dataset of empirical information via a revolutionary cloud-based platform to authoritative figures in commercial, environmental, and humanitarian sectors. Planet has a people-centric approach toward culture and community, and it strives to iterate in a way that puts its team members first and prepares the company for growth.

$133,110–$148,042/yr
US

  • Collaborate with stakeholders to drive best practices for monitoring and CI/CD pipelines.
  • Troubleshoot deployment issues in our CI pipeline.
  • Identify areas for automation and embrace the codification of all things.

Weedmaps is a global leader in the cannabis industry. They are dedicated to transparency, education, and community, serving cannabis to consumers and businesses in the U.S. and worldwide.

Europe

  • Write code, automate everything, design for reliability, and deeply understand the systems.
  • Build or extend Terraform modules and contribute to Platform Engineering around Observability.
  • Collaborate with developers to shape feature design so that reliability is built in, not added later.

InPost Group is an innovative European out-of-home delivery company, revolutionizing the way parcels are delivered to customers. With over 10,000 employees worldwide, InPost Group is one of the largest out-of-home delivery providers in Europe, committed to providing sustainable and efficient delivery solutions.

$198,025–$287,952/yr

  • Building tools and applications to extend Calendly’s infrastructure platform.
  • Evaluating and deploying cloud native open source tools.
  • Exercising expertise in cloud infrastructure concepts and patterns.

Calendly's product powers connections for millions through impactful innovation. They are in the midst of exciting growth and desire people that want to learn, grow, and do their best work.

Global

  • Ensure the availability, reliability, performance, and security of our SaaS platform.
  • Lead infrastructure automation efforts using Infrastructure as Code and Configuration Management tools.
  • Define and monitor SLAs/SLOs/SLIs, and drive service quality improvements.

Remote People builds the infrastructure to power borderless teams. Their technology enables businesses to hire anyone anywhere compliantly at the push of a button. They are committed to building a global, diverse team representing different and varied backgrounds, perspectives, and experiences.

US EMEA

  • Design and implement the complex distributed infrastructure that powers our core AI engine and distributed analysis systems.
  • Tune and optimize cloud services across compute, storage, networking, and observability to drive performance and reliability.
  • Develop our core services, written in TypeScript, Kotlin, and Go, to support our unique deployment and infrastructure requirements.

XBOW is building the future of offensive security. They create the platform that puts security ahead in the arms race, using AI to autonomously discover, validate, and exploit vulnerabilities. Founded by Oege de Moor, the company is backed by Sequoia, Altimeter, and other leading investors.