Source Job

US Canada Ireland UK Mexico Argentina

  • Partner with engineering leadership, EMs, and Product Managers to define and deliver AI products.
  • Architect scalable, high-performance systems that support a growing number of AI-powered products.
  • Drive technical strategy and make architectural decisions that compound - enabling the team to ship more AI experiences faster.

AWS GCP Kubernetes Terraform Pulumi

20 jobs similar to Senior Staff Engineer - Infrastructure and Architecture

Jobs ranked by similarity.

Global Unlimited PTO

  • Lead infrastructure initiatives across the engineering organization.
  • Design the technical quality bar and architectural standards.
  • Build platforms and AI-enabled systems for multiple teams.

Fieldguide is automating and streamlining the work of assurance and audit practitioners specifically within cybersecurity, privacy, and financial audit, building software for the people who enable trust between businesses. They are based in San Francisco, CA, but built as a remote-first company with an inclusive, driven, humble and supportive team.

Australia New Zealand

  • Building world-class AI infrastructure to support a 100+ person research team.
  • Designing and scaling multi-cloud systems that support high-performance model training and inference.
  • Improving monitoring, alerting and system observability for AI workloads.

Canva is redefining how the world experiences design. It has campuses in Sydney and Melbourne, and co-working spaces in other major cities, trusting employees to choose the balance that empowers them and their team to achieve their goals.

  • Own uptime, observability, incident response, and root cause analysis.
  • Own the AWS architecture.
  • Make ML pipelines reliable.

Ferra is building AI infrastructure for structural steel estimation. They process large-scale construction drawing PDFs, run computer vision + LLM pipelines, and generate structured steel graphs, takeoffs, and export-ready models. The team is small and technical, which means high ownership, fast decisions, and work that has a direct impact on the core product.

US

  • Design and evolve the backend architecture that powers our AI-driven acquisition systems.
  • Own cloud architecture in GCP/AWS/Digital Ocean (multi-environment, production-grade systems).
  • Collaborate with Data and Product teams to productionize intelligent workflows.

Home Solutions is building an AI-powered customer acquisition platform that combines voice agents, intelligent routing, scheduling systems, and partner integrations into a unified operating system for growth. The company targets the rapidly digitizing home services vertical and matches homeowners with the right service provider.

US

  • Define and evolve the technical vision for AI and agentic systems across products.
  • Design orchestration, data, and serving patterns that handle global scale with reliability.
  • Collaborate with AI Research to turn prototypes into extensible, governed production frameworks.

KnowBe4 is a cybersecurity company that puts security first, empowering over 70,000 organizations worldwide to strengthen their security culture. They value radical transparency, extreme ownership, and continuous professional development in a welcoming workplace that encourages all employees to be themselves.

$165,000–$200,000/yr
US Unlimited PTO

  • Contribute to building and operating the infrastructure that supports the HackerOne platform.
  • Improve the reliability, security, and scalability of our systems.
  • Design and operate highly available cloud systems and apply best practices for reliability, observability, and security.

HackerOne is a global leader in Continuous Threat Exposure Management (CTEM). The HackerOne Platform unites agentic AI solutions with the ingenuity of the world’s largest community of security researchers to continuously discover, validate, prioritize, and remediate exposures across code, cloud, and AI systems. The platform is trusted by the world’s top organizations.

US Unlimited PTO 12w maternity 12w paternity

  • Design, implement, and maintain cloud-based infrastructure using AWS, Azure, or GCP.
  • Build, optimize, and manage continuous integration and continuous deployment (CI/CD) pipelines.
  • Integrate AI-powered tooling into engineering workflows to accelerate delivery and improve code quality.

Givebutter is a nonprofit fundraising and CRM platform. They empower millions to raise more, pay less, and give better by offering tools like fundraisers, donation forms, donor management, emails, and text blasts all in one place.

$120,000–$140,000/yr
US Unlimited PTO

  • Architect and manage scalable cloud infrastructure within AWS.
  • Implement and maintain infrastructure using Terraform.
  • Develop automation scripts to improve operational efficiency.

Attune empowers insurance agents with their technology solutions. They foster a remote-first culture and value employee development.

Global

  • Define and execute a technical vision for Onebrief’s infrastructure.
  • Design and evolve a deployment strategy focused on AWS and on-prem.
  • Build security and compliance directly into the infrastructure lifecycle.

Onebrief provides collaboration and AI-powered workflow software designed specifically for military staffs, making them faster, smarter, and more efficient. They have raised $320M+ from top-tier investors and are valued at $2.15B, with a team spanning veterans and technologists.

  • Design, develop, and implement platform solutions that enhance the reliability, security, and scalability of the Database Platform infrastructure.
  • Provide technical leadership in AWS cloud infrastructure, networking, CI/CD, and security for cloud infrastructure solutions.
  • Mentor and coach team members, fostering a culture of knowledge sharing, technical excellence, and continuous improvement.

SYSTABUILD is building a shared cloud and platform foundation for a group of leading software companies in the construction, CAD, and ERP domains. They are looking for a Lead Cloud Infrastructure Engineer to take a key role in designing, operating, and evolving their central cloud infrastructure and platform services.

Global Unlimited PTO

  • Build and maintain Infrastructure as Code to power our production systems, Python tools to automate toil, and monitoring systems to detect problems early.
  • Independently execute on large DevOps projects such as major migrations, product rollouts, and infrastructure enhancements.
  • Participate in the infrastructure on-call rotation & incident response process, including triaging alerts, coordinating responders, and contributing to blame-free RCAs. Leverage senior-level expertise to drive rapid resolutions.

Super.com aims to maximize the lives of both customers and employees, providing opportunities to unlock potential through learning and impact. They are a fast-paced, high-growth tech company that values career progression and supports employees through various programs.

Europe

  • Own the reliability, scalability, and performance of Peec AI’s core systems and infrastructure
  • Design, build, and maintain the tooling, automation, and monitoring that keep our services fast, secure, and highly available
  • Partner closely with product and engineering teams to ensure new features are reliable, observable, and easy to operate from day one

Peec AI is one of Europe’s fastest-growing Series A startups. They provide exciting and challenging work in the AI space.

$146,200–$212,000/yr
US Unlimited PTO

  • Collaborate with service engineering teams to design, implement, and maintain scalable and resilient infrastructure solutions.
  • Implement SRE principles to improve system reliability and reduce downtime.
  • Improve developer workflows by creating self-service tools, optimizing CI/CD pipelines, and enhancing deployment processes.

Flex is a growth-stage FinTech company creating the best rent payment experience. They empower renters with flexibility over their most significant recurring expense and are growing quickly with a focus on building an inclusive culture.

US Canada

  • Maintain tooling, libraries, and infrastructure leveraged by core service teams
  • Develop and maintain infrastructure services that enable engineers to manage, deploy, and scale systems
  • Act as a technical leader, guiding core service teams to design robust and reliable software

StackAdapt is a technology company that empowers marketers to reach, engage, and convert audiences with precision. They are an AI-powered platform connecting brand and performance marketing, recognized for their diverse workplace and high-performing campaigns.

$150,000–$215,000/yr
Unlimited PTO

  • Drive the design, implementation, and evolution of core platform infrastructure and shared services.
  • Lead mission-critical projects to build and scale foundational platform capabilities.
  • Work closely with engineers across the organization to adopt shared infrastructure and evolve existing and new products.

Vannevar is a defense technology company building AI to deter adversaries. They are a small agile team combining world-class engineers with veteran strategists, experiencing rapid growth and mission impact.

Europe

  • Design and maintain scalable, fault-tolerant infrastructure that supports our SaaS platform and keeps pace with business growth.
  • Define, document, and maintain SLIs, SLOs, and SLAs in partnership with product engineering, translating business commitments into technical guardrails.
  • Lead incident response with steady judgment, facilitate blameless postmortems, and drive remediation efforts that prevent recurrence.

Fixify is on a mission to reimagine how IT teams support companies. They need a Senior Site Reliability Engineer who finds joy in building systems that fade into the background, empowering product engineers to ship with confidence and their customers to work without interruption.

$202,617/yr
US Canada

  • Co-create the technical architecture, design patterns, and best practices for the AI Application Modernization Factory.
  • Act as a hands-on Principal Engineer, making significant code contributions, performing complex code reviews, and serving as the highest technical escalation point for engineering challenges.
  • Drive the technical strategy for modernizing legacy applications to a cloud-native, microservices, or serverless architecture across major cloud providers (AWS, Azure, GCP).

Banyan Software provides a permanent home for successful enterprise software companies, their employees, and customers. They acquire, build, and grow enterprise software businesses with dominant positions in niche vertical markets and were named the #1 fastest-growing private software company in the US on the Inc. 5000.

Europe

  • Design and evolve the architecture of a cross-product platform that serves as the foundation for AI-driven software development.
  • Define architectural principles, standards, and guidelines for platform services and shared foundations.
  • Design integration patterns and interfaces between platform services, developer tools, and external systems.

JetBrains is building an AI-native platform for software development that connects developer workflows, team-level collaboration, and organizational control into a single coherent system. This platform will serve as the execution and governance layer for AI-driven development.

Latin America

  • Evaluate libraries, tools, and services for suitability.
  • Design, implement, and build scalable architecture leveraging AWS services with an emphasis on reliability, security, and cost efficiency.
  • Learn, coach, and share knowledge and skills with other team members.

Caylent is a cloud native services company that helps organizations bring the best out of their people and technology using Amazon Web Services (AWS). They are a fully remote global company with employees in Canada, the United States, and Latin America that fosters a community of technological curiosity.

  • Maximize the velocity of our product engineering team.
  • Ensure platform scalability, reliability, and security.
  • Champion best practices and shape the engineering culture.

They are building a robust, scalable trading platform to serve high-traffic, latency-sensitive applications. They leverage state-of-the-art technologies to support real-time trading while providing unparalleled reliability and performance.