Lead a high-impact CloudOps and infrastructure engineering team powering large-scale, real-time advertising systems under extreme performance and reliability constraints.
Own planning and delivery processes including sprint planning, backlog prioritization, execution tracking, and team retrospectives.
Drive initiatives to improve system reliability, observability, deployment safety, incident response, and production readiness.
AWS, DevOps, Distributed Systems, CI/CD, Infrastructure as Code
Own and improve infrastructure, deployment systems, and operational foundation for reliability and security.
Build safer deployment paths, strengthen observability, and lead infrastructure migrations.
Partner with engineers on scaling, error handling, and backend changes to support AI-enabled workflows.
Clever is a venture-backed real estate technology company that builds a leading online education platform and has earned a 4.9 TrustPilot rating. The company has helped consumers save over $210 million in real estate fees and fosters a culture of innovation and transparency.
Own and evolve AWS infrastructure using Terraform, managing EKS clusters, databases, and core services.
Maintain CI/CD reliability and developer tooling across the full engineering org.
Lead incident response, drive post-incident reviews, and improve monitoring and alerting standards.
Babylist is the leading platform for expecting and new families, helping parents feel confident, connected, and cared for at every step. As a modern, AI-forward tech company with over 10 million yearly shoppers, Babylist has expanded into a full ecosystem and generated $750M in revenue in 2025, reshaping the $235B kids and baby market.
Own and operate production cloud environments, ensuring high availability, reliability, and performance across distributed systems.
Design, build, and maintain scalable infrastructure using automation-first principles and Infrastructure as Code practices.
Drive automation initiatives and continuous improvement across infrastructure, deployment, and operational workflows.
Jobgether is an AI-powered job matching platform that connects candidates with hiring companies. They have an inclusive, employee-driven culture with a strong focus on collaboration and innovation.
Deploy, manage, and maintain AWS infrastructure across development, staging, and production environments.
Build and maintain scalable, reusable and secure Infrastructure as Code (IaC) using Terraform Enterprise.
Develop, implement and manage CI/CD pipelines for automated application and infrastructure deployments.
Miratech helps visionaries change the world. It is a global IT services and consulting company that brings together enterprise and start-up innovation. It retains nearly 1,000 full-time professionals, and its annual growth rate exceeds 25%.
Lead and grow high-performing platform engineering teams.
Set technical direction and drive multi-quarter platform initiatives.
Design and evolve internal platforms for product teams.
Vanta's mission is to help businesses earn and prove trust by making security continuous and verifiable. They empower companies to practice better security, automating security monitoring for compliance standards. Vanta has a kind and talented team.
Automate operational tasks using scripting languages.
Implement configuration management using tools like CloudFormation, Terraform, and Ansible.
Peraton is a next-generation national security company that drives missions of consequence spanning the globe. They deliver trusted, highly differentiated solutions and technologies to protect our nation and allies, operating at the critical nexus between traditional and nontraditional threats.
Implement and manage AI-powered tools, copilots, and workflow automations from POC to production, owning the full technical lifecycle.
Design, deploy, and maintain cloud infrastructure on AWS and Azure, including IAM, VPCs, security groups, multi-account strategies, and cost optimization.
Own reliability, observability, and security controls across all AI and cloud services, including incident response, debugging complex multi-service environments, and driving continuous improvement.
Dragos is dedicated to arming customers with best-in-class technology, threat intelligence, and services to protect their systems. They're a remote-first culture with operations in North America, Europe, the Middle East, and APAC, looking for mission-oriented teammates who embody their core values of authenticity, transparency, and trust.
Manage a team of engineers, conducting 1:1s, performance reviews, hiring, and career development in a distributed, remote-friendly environment.
Own the technical roadmap for shared cloud infrastructure across Azure and AWS, balancing reliability work against longer-term platform improvements.
Set and enforce standards for infrastructure-as-code (Terraform, Helm, Kubernetes), documentation, and operational readiness.
Delinea is a pioneer in securing human and machine identities through intelligent, centralized authorization, empowering organizations to seamlessly govern their interactions across the modern enterprise. They value diversity, innovation, and a culture of respect and fairness, with a global team supported by strategic investment from TPG.
Own and maintain the reliability, performance, and availability of large-scale production systems.
Design, build, and improve CI/CD pipelines using Azure DevOps, GitHub Actions, Jenkins, and Octopus Deploy.
Drive cloud cost optimization, scalability, and auto-scaling initiatives across hosted environments.
Encoura empowers students and institutions to create meaningful connections so everyone can make the most informed decisions to achieve their goals. Since 1972, the company has evolved its products and services to better represent the link between students and higher education institutions and to create the highest probability of student success.
Design and evolve AWS architecture for scalability, reliability, and cost efficiency.
Standardize infrastructure patterns and implement monitoring, alerting, and CI/CD pipelines.
Diagnose and resolve production incidents, lead root-cause analysis, and communicate project updates.
Connection is a technology solutions provider offering managed services, staffing, and IT solutions. This role is part of their Technical Staffing division, focusing on contract-to-hire placements.
Lead and grow high-performing platform engineering teams that deliver reliable, scalable infrastructure and operational excellence for Vanta’s products and customers.
Set technical direction and drive multi-quarter platform initiatives spanning infrastructure reliability, security, scalability, and developer experience across shared systems and services.
Partner closely with product engineering, security, and engineering leadership to identify organizational needs and deliver scalable platform solutions.
Vanta helps businesses earn and prove trust by empowering companies to practice better security and prove it with ease. They have a kind and talented team, and while some have prior security experience, many have been successful without it.
Design, build, and operate core cloud infrastructure across compute, storage, databases, and networking layers.
Own and improve the reliability, scalability, and security of Valon’s production systems as we scale to support major enterprise deployments.
Evaluate, adopt, and operationalize new infrastructure technologies (e.g., Vitess, ClickHouse, Redis) to meet evolving product and scale requirements.
Valon is building the AI-native operating system for regulated finance, starting with mortgage servicing. They're a Series C company backed by a16z, transforming industries that others have written off as too complex to innovate.
Designing and managing cloud-based infrastructure on AWS.
Creating and maintaining deployment architectures and continuous delivery pipelines.
Automating infrastructure provisioning and management using Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.
Nearform is an independent team of data & AI experts, engineers, and designers who build intelligent digital solutions and capability at pace. Our team of 500 experts in 20+ countries is trusted by leading enterprises.
Design and build scalable infrastructure to support rapid growth in data volume, service usage, and engineering velocity.
Implement and maintain core security infrastructure and controls, including service-to-service authentication, secrets management, and application security primitives.
Partner closely with Security Engineering to implement infrastructure that supports best-in-class security and compliance practices.
Vanta helps businesses earn and prove trust by providing a platform that continuously monitors and verifies security. They empower companies to practice better security and prove it with ease. Vanta has a kind and talented team with offices in SF, NYC, London, Dublin, Tel Aviv, and Sydney.
Act as a first responder for system incidents and outages, ensuring high availability and performance.
Own and evolve monitoring, alerting, and log management systems while optimizing database infrastructure.
Collaborate with engineering teams to build scalable, resilient systems and contribute to SRE tooling and automation.
Circle is building the world's leading all-in-one platform for online communities. We're a fully remote company of around 200 team members from 30+ countries, with a culture that values autonomy, async collaboration, and high expectations.
Design and develop CI/CD systems for websites, services, and release workflows, and operate an EKS-based Kubernetes platform.
Diagnose and debug production incidents, drive root-cause analysis, and implement improvements to enhance system reliability.
Write and maintain infrastructure as code using Pulumi or Terraform/OpenTofu across multiple AWS accounts with security-conscious practices.
Thunderbird is one of the world’s most trusted open-source email applications, empowering more than 20 million people globally. Our small but growing distributed team includes 65+ people across seven countries, and we build privacy-respecting communication tools with a collaborative, inclusive, and user-first spirit.
Provide day-to-day support, administration, and monitoring of clients’ AWS cloud infrastructure.
Assist in designing and developing automation solutions for monitoring, scaling, and managing cloud workloads.
Troubleshoot issues related to compute, storage, networking, IAM, and deployments in AWS.
AHEAD builds platforms for digital business by weaving together advances in cloud infrastructure, automation, analytics, and software delivery. They help enterprises deliver on the promise of digital transformation and prioritize creating a culture of belonging, where all perspectives and voices are represented, valued, respected, and heard.
Design, build, and maintain CI/CD pipelines and Infrastructure as Code using tools like CloudFormation, Ansible, and Terraform.
Monitor and respond to infrastructure and application health, troubleshoot operational issues, and provide on-call support.
Maintain operational documentation, communicate proactively with teams, and ensure service delivery meets client expectations.
NICE Ltd. provides software used by 25,000+ global businesses, including 85 of the Fortune 100, to deliver customer experiences, fight financial crime, and ensure public safety. With over 8,500 employees across 30+ countries, NICE is recognized as a market leader in AI, cloud, and digital innovation.
Own and evolve the cloud platform including compute layer, EKS fleet, serverless infrastructure, networking, and cloud operations across AWS and GCP.
Design and maintain infrastructure-as-code foundation and networking layer for reliability, security, and scalability.
Build AI-powered automation for cloud infrastructure management, including policy-as-code, drift detection, and LLM-assisted runbook generation.
Webflow builds the world's leading AI-native Digital Experience Platform, empowering teams to design, launch, and optimize for the web without barriers. As a remote-first company with over 2 million users across 190 countries, it fosters a culture of trust, transparency, and creativity.
Lead the Site Reliability Operations team, overseeing observability, monitoring, incident response, and operational excellence for key enterprise services.
Partner with product, engineering, and infrastructure teams to embed CI/CD and release best practices, automating build/test/deploy and release monitoring.
Own problem management, driving root cause analysis and corrective actions to improve system resilience and reduce incident impact.
Mercury Insurance helps people reduce risk and overcome unexpected events, serving customers for over 60 years. They are a midsize employer recognized as one of America's Best Midsize Employers for 2026, with a collaborative culture focused on growth and inclusion.