Source Job

US

  • Own and operate end-to-end infrastructure for backend services, frontend systems and databases.
  • Build and maintain reliable deployment workflows including CI/CD pipelines and rollback procedures.
  • Improve system-wide observability through metrics, logging, alerting, and monitoring to ensure uptime.

Terraform Ansible Linux Grafana

20 jobs similar to Infrastructure Engineer

Jobs ranked by similarity.

$113,850–$126,500/yr
Europe 5w PTO

  • Design, build, and maintain infrastructure using Infrastructure as Code tools such as Terraform.
  • Improve system reliability, scalability, resilience, and performance across the Mast platform.
  • Build systems and tooling that automate infrastructure management and operational workflows wherever possible.

Mast is on a mission to make complex lending simple by building modern, cloud-native lending technology purpose-built for specialist lenders. It is a high-performance team of engineers and lending experts that values radical honesty, transparency, and speed.

$120,000–$170,000/yr
Global Unlimited PTO

  • Own and evolve Quansight's cloud infrastructure across AWS, Azure, and GCP.
  • Build, deploy, and maintain internal dashboards and reporting for operations and project management.
  • Lead infrastructure engagements for clients from scoping and architecture through delivery, upskilling client teams.

Quansight is rooted in the Python and PyData ecosystems. They provide services ranging from open-source software development to training and consulting, believing in a culture of do-ers, learners, and collaborators.

$210,000–$278,000/yr
US Unlimited PTO

  • Architect future iterations of core systems, addressing scaling requirements.
  • Design and implement developer tools to enhance deployment safety and reproducibility.
  • Drive excellence in monitoring and guide incident response for quick issue resolution.

Found provides tools for self-employed individuals, offering a business bank account that automates taxes and expense tracking. They aim to give self-employed people the security and peace of mind historically available only at large corporations and are looking for kind, resourceful, and passionate people.

$115,200–$172,800/yr
US 8w paternity

  • Build internal tooling to help other engineers and the rest of the company understand and operate our system.
  • Design and implement security best practices for our team and infrastructure.
  • Reduce toil through automation, including building and maintaining CI/CD infrastructure.

Openly is rebuilding insurance from the ground up by re-envisioning and enhancing every aspect of the customer experience. They are a rapidly growing team of exceptional, curious, empathetic people with a wide range of skill sets, spanning many departments.

Global

  • Build and maintain our host provisioning stack to bring new bare metal online quickly and confidently.
  • Evolve our homegrown orchestration engine to manage clusters, containers, and VMs.
  • Build out internal observability and alerting so we catch fleet problems before customers feel them.

Railway's core mission is to make software engineers higher leverage. They provide powerful tools so engineers can spend less time setting up and more time doing. The team is small, with high ownership, and they are passionate about being exceptional.

Global Unlimited PTO

  • Improve the reliability, performance, and scalability of our production platform.
  • Operate reliable infrastructure, improve observability, and drive incident response.
  • Use data-driven reliability practices such as SLIs, SLOs, SLAs, and DORA metrics.

VRChat is a game-changing platform that provides an endless collection of social VR experiences. They empower their community to bring their imaginations to life and help shape the metaverse. Their team includes people from Netflix, Twitter, Meta, and Microsoft.

$145,000–$250,000/yr
US Unlimited PTO

  • Construct infrastructure as code, developing and enforcing best practice across configurations while preventing drift between Terraform configurations and infrastructure deployments.

SentiLink provides innovative identity and risk solutions, empowering institutions and individuals to transact with confidence. They are building the future of identity verification in the United States, replacing a clunky, ineffective, and expensive status quo with solutions that are 10x faster, smarter, and more accurate.

$205,000–$235,000/yr
US

  • Provide technical leadership for infrastructure, reliability, and observability.
  • Own the observability stack using Datadog and CloudWatch.
  • Design and evolve AWS infrastructure for reliability, security, scalability, and cost efficiency.

Topstep offers an engaging working environment that ranges from fully remote to hybrid. They foster a culture of collaboration by keeping cameras on during meetings and maintaining a robust Slack environment for communication.

$29,000–$36,000/yr
India

  • Design, build, and maintain scalable, reliable systems on GCP.
  • Develop automation for infrastructure provisioning using Terraform, Ansible, or Deployment Manager.
  • Manage incident response, conduct postmortems, and implement improvements to reduce recurrence.

SupplyHouse.com is an industry-leading e-commerce company specializing in HVAC, plumbing, heating, and electrical supplies since 2004. They value every individual team member and cultivate a community where people come first with Generosity, Respect, Innovation, Teamwork, and GRIT.

$4,313–$5,391/mo
Europe

  • Own 5 AWS accounts across the organisation.
  • Architect and maintain infrastructure as code with Terraform.
  • Set up monitoring, alerting, and incident response.

We're a UK fintech building high-throughput digital infrastructure for the mortgage and property space. We recently acquired Trussle and are taking our platform to the next level. The company values innovation and building high-quality products.

Europe 6w PTO

  • Design, build, and operate reconciliation systems to track desired stack state, detect and repair drift across stack templates, grafana.com state, Hosted Grafana, and actual customer stack configuration.
  • Collaborate across SSS, grafana.com, and deployment configurations to ensure stack lifecycle workflows remain reliable, observable, and resilient.
  • Improve operational efficiency by reducing deployment complexity and contributing to the Stack Config Reconciliation project.

Grafana Labs is a remote-first, open-source powerhouse with over 20M users of Grafana. They help more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, featuring scalable metrics (Grafana Mimir), logs (Grafana Loki), and traces (Grafana Tempo).

Ireland

  • Design, build, and deploy production systems with a focus on scalability, reliability, observability, and performance.
  • Develop and maintain comprehensive automation solutions to eliminate toil and streamline operational efficiency.
  • Proactively monitor production systems and implement automated incident response mechanisms to minimise downtime.

Arista Networks is an industry leader in data-driven, client-to-cloud networking for large data center, campus and routing environments. The company is well-established and profitable with over $8 billion in revenue and values diversity and inclusivity.

$160,000–$200,000/yr
US

  • Drive the stability and reliability of Epic's GCP infrastructure.
  • Manage and harden our Docker and GKE container platform.
  • Maintain and improve CI/CD pipelines.

Epic is the leading digital reading platform for kids ages 12 and under, used by millions of children, families, and educators around the world. As Epic continues to grow, we are reimagining what reading can be through thoughtful technology, data, and global collaboration to make learning more engaging, accessible, and impactful.

Unlimited PTO 16w maternity 16w paternity

  • Scale and mature Vesta’s infrastructure to support the entire mortgage market reliably, securely, and efficiently.
  • Build the foundational systems that power engineering velocity and platform reliability.
  • Focus on cloud architecture, deployment systems, observability, incident response, and internal developer tooling.

Vesta is building the next-generation system of record to power the multi-trillion mortgage market. They value humility, empathy, self-awareness, and an orientation towards action, and have raised $45M from top-tier investors.

SRE

Fal
$180,000–$250,000/yr
US

  • Own and operate our Kubernetes infrastructure.
  • Build and maintain CI/CD pipelines and deployment infrastructure.
  • Leverage AI to automate analysis and resolution of production issues.

Fal is the generative media ecosystem powering the next generation of AI products. They build the infrastructure, tools, and model access that teams need to move from idea to production.

Canada 4w PTO

  • Design and build scalable infrastructure to support rapid growth in data volume, service usage, and engineering velocity.
  • Implement and maintain core security infrastructure and controls, including service-to-service authentication, secrets management, and application security primitives.
  • Partner closely with Security Engineering to implement infrastructure that supports best-in-class security and compliance practices.

Vanta helps businesses earn and prove trust by providing a platform that continuously monitors and verifies security. They empower companies to practice better security and prove it with ease. Vanta has a kind and talented team with offices in SF, NYC, London, Dublin, Tel Aviv, and Sydney.

Unlimited PTO

  • Assess and improve visibility by identifying gaps in dashboards, metrics, and logs.
  • Refine alerts and dashboards for critical services to catch issues earlier.
  • Automate routine checks and monitoring tasks to free up engineers.

PlayOn is where high school sports come to life through platforms like GoFan, NFHS Network, and MaxPreps. As a growth-stage company backed by KKR, we build the technology that powers high school athletics from ticketing and streaming to fundraising and merchandise.

US 5w PTO

  • Build and maintain the platform that runs all Close systems.
  • Automate database lifecycles and eliminate static credentials.
  • Improve our multi-region disaster recovery system and reduce downtime.

Close is a bootstrapped, profitable, and fully remote company with a team of thoughtful individuals. They focus on building a CRM that prioritizes better communication for small, scaling businesses and have about 100 employees.

$148,750–$201,250/yr
US Unlimited PTO

  • Designing and operating always-on product environments for customer demos, internal use, and stakeholder access.
  • Building feature branch / preview environments to support UX and rapid feedback loops.
  • Integrating core system components across Fleet Management, Edge Management, OS, and related services.

Defense Unicorns delivers mission value by streamlining software delivery. They are composed of innovators, software engineers, and veterans with decades of experience delivering technology programs across the federal market.

$180,000–$200,000/yr
US

  • Own and evolve a scalable observability platform spanning metrics, logs, traces, and events.
  • Design telemetry pipelines ingesting data from GPUs, CPUs, networking, containers, APIs, and BMC/Redfish.
  • Design and implement noise-resistant alerting systems to improve signal quality and reduce operational load.

Lightning AI builds an end-to-end platform for developing, training, and deploying AI systems, designed to take ideas from research to production with less friction. They combine developer-first software with cost-efficient, large-scale compute, serving solo researchers, startups, and large enterprises.