Own the SRE roadmap end-to-end, setting priorities independently and driving execution to make the team's impact visible across the organization.
Drive compliance, security, and infrastructure topics for your business unit by identifying risks early and owning the resolution before they escalate.
Lead a 4–6-person generalist SRE team through 1:1s, performance cycles, and meaningful career development while contributing technical credibility to architectural discussions.
Design and implement infrastructure and tools that empower our product teams to rapidly and securely iterate, emphasizing reliability and automation.
Influence the strategic direction of our infrastructure and operational practices, ensuring that we are well-positioned to scale and support our growing organization.
Take a proactive role in the resolution of production issues, ensuring that we are well-prepared to handle incidents and that we learn from them in a blameless manner.
SSV Labs is the core team behind the SSV Network - pioneering decentralized infrastructure for Ethereum staking. They are building tools, protocols, and standards to make staking more secure, scalable, and trustless.
Lead, mentor, and foster a healthy, high-performing globally distributed engineering team.
Own the execution and delivery of highly critical, complex yearly roadmap items centered around large-scale foundational infrastructure upgrades, high availability, and platform resilience.
Own and drive the change management processes across engineering and product domains.
Alpaca is a US-headquartered self-clearing broker-dealer and brokerage infrastructure for stocks, ETFs, options, crypto, fixed income, 24/5 trading, and more. Their global team of 230+ members is a diverse group of experienced engineers, traders, and brokerage professionals fostering a vibrant community.
Manage and support hybrid-cloud infrastructure for the Payward Services business unit, including Nomad, Kubernetes, and databases.
Build automation tooling, maintain CI/CD pipelines, and consult on monitoring and alerting best practices to ensure service reliability.
Provide operational support, participate in incident response, and debug complex distributed system issues across production and staging environments.
Kraken is a mission-focused company building premium crypto products for traders and institutions, dedicated to accelerating global crypto adoption for financial freedom. It is a fully remote company with a global team of industry pioneers spread across 70+ countries, operating with a strong crypto ethos and commitment to security and education.
Manage and support infrastructure for Growth teams, including Nomad, Hashistack, databases, and any other underlying systems
Maintain and troubleshoot GitLab CI pipelines, ensuring reliable and fast build, test, and deployment cycles
Provide operational support across Onboarding, Acquire, and Engage teams, helping debug issues in staging and production environments
Kraken is a mission-focused company rooted in crypto values, aiming to accelerate the global adoption of crypto, so that everyone can achieve financial freedom and inclusion. As a fully remote company, they have Krakenites in 70+ countries who speak over 50 languages.
Design, build, and maintain scalable, highly available, and fault-tolerant infrastructure.
Implement and improve monitoring, alerting, and incident response systems to ensure optimal system performance and minimize downtime.
Drive continuous improvement in infrastructure automation, deployment, and orchestration.
Mistral AI is dedicated to democratizing AI through high-performance, optimized, open-source models, products, and solutions designed to integrate seamlessly into daily working life. They are a dynamic, collaborative team dedicated to innovation and passionate about AI and its potential to transform society.
Deliver a scalable internal infrastructure platform on public cloud environments.
Establish and evolve Kubernetes-based platform capabilities to support high-availability, production-grade workloads at scale.
Build a secure and reliable foundation that supports CI/CD pipelines and minimizes operational risk across engineering teams
Chainlink is the industry-standard oracle platform bringing the capital markets onchain and powering the majority of decentralized finance (DeFi). Since inventing decentralized oracle networks, Chainlink has enabled tens of trillions in transaction value and now secures the vast majority of DeFi.
Lead a platform engineering team delivering managed Kubernetes and cloud infrastructure across multiple providers and deployment models.
Own the platform delivery roadmap, coordinating with Cloud Organization, Security, and Professional Services to manage dependencies.
Drive foundational infrastructure programs in private networking and cloud governance to establish Ditto's deployment baseline.
Ditto redefines data movement at the edge by providing a peer-to-peer sync engine for building resilient, real-time applications in any network condition. This venture-backed, globally distributed startup is trusted by major enterprises across aviation, retail, and defense, and is committed to building a diverse and inclusive team.
Lead and evolve cloud infrastructure and operational practices across a globally distributed SaaS platform.
Ensure the reliability, scalability, and efficiency of systems running across multiple AWS regions.
Partner with Engineering, Security, and Product to strengthen operational excellence and drive continuous improvement.
Firstup's mission is to improve the employee experience at every moment that matters, large and small. As the communication pipeline for the world's workforce, they serve 40 of the Fortune 100 companies, reaching and connecting more than 17 million employees daily.
Lead efforts to improve system reliability, scalability, and performance across critical services
Define and implement SLIs/SLOs and error budgets, and use them to guide engineering priorities
Design and develop observability systems (metrics, logging, tracing, alerting) that produce actionable alerts and data with minimal noise
UJET is an AI-powered contact center innovation company, delivering a cloud platform that redefines the customer experience. They are built on a cloud-native architecture and partner with businesses to deliver exceptional interactions and accelerated growth in the AI-driven world.
Leading a team focused on designing, building, and evolving cloud-native, containerized infrastructure.
Driving complex technical initiatives and ensuring the availability, security, scalability, and reliability of our data ecosystem.
Guiding and developing engineering talent, setting priorities, driving execution, and partnering across teams.
Pismo, founded in 2016, provides a comprehensive processing platform for banking, card issuing and financial market infrastructure. Pismo has 500+ employees located in more than 10 countries around the world and was acquired by Visa in 2024.
Build Self-Service Infrastructure: Design and scale highly available Infrastructure as Code (IaC) modules using Terraform. Empower development teams to provision resources autonomously and securely.
Champion Platform Reliability: Partner closely with engineering teams to define, measure, and operationalize SRE metrics. Balance feature velocity with system stability.
Elevate Developer Experience (DevEx): Architect frictionless, GitOps-driven CI/CD pipelines utilizing GitHub Actions and ArgoCD. Facilitate automated, secure, and progressive deployments.
KTO Group drives excitement in iGaming through innovation, focusing on transparency and player satisfaction. Founded in 2018, KTO blends sports betting with online casino entertainment on a proprietary platform, and is a rising leader in LATAM, ranked among Brazil’s top 10 iGaming brands.
Evolve ArgoCD GitOps standards across environments
Build reusable Terraform modules and practices for safe, repeatable cloud infrastructure provisioning and drift detection
Lead the operation and evolution of production-grade Kubernetes clusters across cloud environments
GitLab is the intelligent orchestration platform for DevSecOps. More than 50 million registered users and more than 50% of the Fortune 100 trust GitLab to ship better, more secure software faster.
Operating and evolving 100+ multi-cloud streaming clusters and related database infrastructure.
Diagnosing and eliminating cross-layer failure modes.
Designing safe upgrade and rollout strategies at scale.
Grafana Labs is a remote-first, open-source powerhouse with over 20M users of Grafana, its open source visualization tool. Grafana Labs helps more than 3,000 companies manage their observability strategies with the Grafana LGTM Stack, and its team thrives in an innovation-driven environment.
Architect and scale AWS infrastructure, including container orchestration and observability platform development.
Lead infrastructure builds for compliance (SOC 2, HIPAA) and harden container workloads across environments.
Own the shared infrastructure stack, CI/CD pipelines, and reliability practices including SLOs and incident response.
Truv is transforming the financial data industry with a secure, real-time API platform for payroll account access, streamlining income verification and direct deposit switching. It is a well-funded, innovative startup backed by top investors, with a leadership team from companies like Apple, Carta, and Venmo.
Design self-healing infrastructure and automated root-cause analysis workflows.
Drive the strategic roadmap for our GCP and Kubernetes-based cloud capabilities.
Transform CI/CD, deployment, and build tooling into a cohesive, self-service product.
Signifyd helps merchants confidently grow their businesses by building trusted relationships with their customers. They have thousands of leading merchants across more than 100 countries and securely process billions of transactions each year.
Develop and maintain observability solutions using platforms like Datadog, Prometheus and Grafana
Take a leading role in incident management, including coordinating response efforts, troubleshooting issues, and identifying follow-up actions
Partner with product engineering teams to architect reliable systems, recover from incidents, and learn from mistakes
Ditto is redefining how data moves at the edge, aiming to make resilient, real-time applications seamless for developers, regardless of network conditions. It's a globally distributed and fast-growing startup with over $145 million in funding that is committed to building a diverse and inclusive team.
Building tools and applications to extend Calendly’s infrastructure platform
Evaluating and deploying cloud native open source tools
Exercising expertise in cloud infrastructure concepts and patterns
Calendly's product powers connections for millions through impactful innovation. They are in the midst of exciting growth and are looking for people who want to learn, grow, and do their best work.
Building tools and applications to extend Calendly’s infrastructure platform
Evaluating and deploying cloud native open source tools
Exercising expertise in cloud infrastructure concepts and patterns
Calendly powers connections for its customers through impactful innovation. They have millions of users and are in the midst of exciting product growth.
Lead the implementation and optimization of CI/CD pipelines.
Develop and maintain Infrastructure as Code (IaC) scripts to automate infrastructure provisioning and management.
Identify and implement automation opportunities to improve efficiency and reduce manual effort.
Pismo provides a comprehensive processing platform for banking, card issuing and financial market infrastructure and helps customers innovate and build the next generation of banking and payment solutions. Pismo joined Visa in 2024 and has 500+ employees located in more than 10 countries around the world.
Define and drive the roadmap for deployment, configuration, infrastructure, and operational tooling across cloud and on-premise environments.
Work closely with engineering, design, customer-facing teams, and customers to identify and resolve deployment friction.
Improve how enterprise customers install, configure, upgrade, secure, and operate Rasa in production.
Rasa is a leader in generative conversational AI, enabling enterprises to build and deliver next-level AI assistants. The company was founded in 2016 and is remote-first with a global presence.