Lead efforts to improve system reliability, scalability, and performance across critical services
Define and implement SLIs/SLOs and error budgets, and use them to guide engineering priorities
Design and develop observability systems (metrics, logging, tracing, alerting) that produce actionable alerts and data with minimal noise
UJET is an AI-powered contact center innovation company, delivering a cloud platform that redefines the customer experience. They are built on a cloud-native architecture and partner with businesses to deliver exceptional interactions and accelerated growth in the AI-driven world.
Arista Networks is a data-driven, client-to-cloud networking company for large data center, campus, and routing environments. They have over $8 billion in revenue, and they value diversity of thought and perspectives, fostering an inclusive environment for creativity and innovation.
Own the architecture, development, and operation of scalable, secure, and fault-tolerant cloud services.
Drive technical design and architectural decisions for distributed systems, influencing patterns, standards, and long-term platform evolution.
Lead complex initiatives end-to-end, from design through deployment and ongoing optimization.
ExtraHop is a company focused on reinventing Network Detection and Response (NDR) to offer enterprises unparalleled visibility, context, and control against emerging threats. They integrate NDR with Network Performance Management (NPM), Intrusion Detection Systems (IDS), and forensics, providing a single, comprehensive solution.
Work with other Engineering teams to design sustainable infrastructure and microservice solutions.
Automate tools and infrastructure to reduce manual work.
Monitor applications and participate in an on-call rotation as required.
Bloomreach is building the world’s premier agentic platform for personalization, revolutionizing how businesses connect with their customers by building and deploying AI agents to personalize the entire customer journey. They power personalization for more than 1,400 global brands.
Design, build, and maintain scalable, highly available, and fault-tolerant infrastructure.
Implement and improve monitoring, alerting, and incident response systems to ensure optimal system performance and minimize downtime.
Drive continuous improvement in infrastructure automation, deployment, and orchestration.
Mistral AI is dedicated to democratizing AI through high-performance, optimized, open-source models, products, and solutions designed to integrate seamlessly into daily working life. They are a dynamic, collaborative team dedicated to innovation and passionate about AI and its potential to transform society.
Own end-to-end availability and performance of critical services, including building automation to prevent recurring issues
Administer Linux and Windows systems across web, application, and database servers
Develop and automate solutions using various programming languages
Coupa provides a total spend management platform for businesses. They utilize AI and a global network of buyers and suppliers. The company values collaboration, teamwork, transparency, and a commitment to excellence.
Lead the design and implementation of scalable, secure, and resilient cloud infrastructure across AWS and Azure.
Drive the architectural vision and strategy, ensuring alignment with long-term business goals.
Take the lead on automating and accelerating SDLC processes by identifying bottlenecks.
Candidly flips the script on planning, borrowing, repaying, and saving for college and is a category leader with an AI-driven student debt and savings optimization platform. They partner with hundreds of top employers and have a fully remote, international team of 70+ including alumni from Google, UBS, and Twitter.
Ensure the availability, reliability, performance, and security of our SaaS platform
Lead infrastructure automation efforts using Infrastructure as Code and Configuration Management tools
Define and monitor SLAs/SLOs/SLIs, and drive service quality improvements
Remote People builds the infrastructure to power borderless teams. Their technology enables businesses to hire anyone anywhere compliantly at the push of a button. They are committed to building a global, diverse team representing different and varied backgrounds, perspectives, and experiences.
Design and implement infrastructure and tools that empower our product teams to rapidly and securely iterate, emphasizing reliability and automation.
Influence the strategic direction of our infrastructure and operational practices, ensuring that we are well-positioned to scale and support our growing organization.
Take a proactive role in the resolution of production issues, ensuring that we are well-prepared to handle incidents and that we learn from them in a blameless manner.
SSV Labs is the core team behind the SSV Network - pioneering decentralized infrastructure for Ethereum staking. They are building tools, protocols, and standards to make staking more secure, scalable, and trustless.
Maintain, optimize, and enhance on-premises and cloud computing environments.
Execute technical aspects of implementation projects, ensuring seamless software integration and customization.
Automate Infrastructure-as-Code (IaC) to manage virtual machines and deploy containers, services, and other infrastructure.
Striveworks helps organizations harness AI to solve national security and business challenges, acting as a command center for data and models. Founded by data scientists and engineers, they aim to simplify the deployment and optimization of AI systems, ensuring reliability and scalability.
Support the availability and durability of critical services across production environments.
Develop automation for common operational tasks, reducing manual intervention and toil.
Partner with engineering, product, and operations teams to support resilient system design and operations.
Backblaze is the object storage leader in the open cloud movement, fueling customer success with cloud storage built purposefully to unlock budgets and unleash innovators. Founded in 2007, they scaled the business with less than $3 million in outside funding until 2021, and generate over $100 million in revenue managing over three billion gigabytes of data storage for 500K+ customers in 175+ countries.
Design self-healing infrastructure and automated root-cause analysis workflows.
Drive the strategic roadmap for our GCP and Kubernetes-based cloud capabilities.
Transform CI/CD, deployment, and build tooling into a cohesive, self-service product.
Signifyd helps merchants confidently grow their businesses by building trusted relationships with their customers. They have thousands of leading merchants across more than 100 countries and securely process billions of transactions each year.
Implementing improvements to the reliability, fault tolerance, scalability, and performance of our infrastructure
Managing incidents using your technical know-how to involve the appropriate teams and automate away manual practices
Improving observability across our systems (metrics, logs, tracing) to reduce time to detection and resolution
Newton is changing how Canadians trade crypto, with the goal of making financial freedom achievable for everyone by giving customers the tools and knowledge needed to navigate the crypto world. They are a remote team spread across Canada that values pushing boundaries and getting things done.
Build the foundational, reusable services that every other JumpCloud product relies on to function securely and efficiently.
Deepen your expertise in Go, AWS, and Kubernetes while gaining broad architectural exposure by adapting to different teams and tech challenges.
Perfect for a versatile engineer who loves solving core infrastructure problems and building common frameworks, and who thrives in a dynamic, flexible environment.
JumpCloud delivers a unified open directory platform that makes it easy to securely manage identities, devices, and access across your organization. With JumpCloud, IT teams and MSPs enable users to work securely from anywhere and manage their Windows, Apple, Linux, and Android devices from a single platform.
Contribute to planning and implementation to enhance stability and reliability.
Apply software engineering and SRE principles to create tools and automation.
Continuously improve, evolve and innovate the observability of Commerce’s platform architecture.
Commerce empowers businesses to innovate, grow, and thrive with its open, AI-driven commerce ecosystem. Recognized as one of the Best Places to Work in Austin, San Francisco, and Australia, they are a team of bold builders, sharp thinkers, and technical trailblazers.
Help define and drive the technical direction of our Cloud Infrastructure team within Platform Engineering.
Work across Valon’s production systems—compute, databases, storage, and networking—shaping the infrastructure foundations that every product and team depends on.
Set the technical direction for how we meet those challenges.
Valon is building the AI-native operating system for regulated finance, starting with mortgage servicing. They are a Series C company backed by a16z, transforming industries that others have written off as too complex to innovate.
Help to discover and triage vulnerabilities from various sources.
Design, configure, deploy, and maintain secure configurations across JUMO’s cloud and endpoint estate.
Work with engineering teams to complete threat modeling exercises.
JUMO is dedicated to financial inclusion and operates with a remote-first approach. They foster innovation and value online face time for collaboration.
Design, build, and operate core cloud infrastructure across compute, storage, databases, and networking layers.
Own and improve the reliability, scalability, and security of Valon’s production systems as we scale to support major enterprise deployments.
Evaluate, adopt, and operationalize new infrastructure technologies (e.g., Vitess, ClickHouse, Redis) to meet evolving product and scale requirements.
Valon is building the AI-native operating system for regulated finance, starting with mortgage servicing. They are a Series C company backed by a16z, transforming industries that others have written off as too complex to innovate.
Building tools and applications to extend Calendly’s infrastructure platform
Evaluating and deploying cloud-native open-source tools
Exercising expertise in cloud infrastructure concepts and patterns
Calendly's product powers connections for millions through impactful innovation. They are in the midst of exciting growth and are looking for people who want to learn, grow, and do their best work.
Act as a primary or escalation responder in a 24x7 on‑call rotation
Automate repetitive operational tasks to reduce manual toil
Support and troubleshoot Linux‑based systems and cloud platforms (AWS, Azure, GCP)
NiCE Ltd.'s software products are used by 25,000+ global businesses, including 85 of the Fortune 100 corporations, to deliver extraordinary customer experiences, fight financial crime, and ensure public safety. With over 8,500 employees across 30+ countries, NiCE is consistently recognized as the market leader in its domains and as an innovation powerhouse that excels in AI, cloud, and digital.