Designing and operating always-on product environments for customer demos, internal use, and stakeholder access.
Building feature branch / preview environments to support UX and rapid feedback loops.
Integrating core system components across Fleet Management, Edge Management, OS, and related services.
Defense Unicorns delivers mission value by streamlining software delivery. They are composed of innovators, software engineers, and veterans with decades of experience delivering technology programs across the federal market.
Own and evolve CI/CD pipelines using GitHub Actions and OIDC-based authentication for microservices and agentic workloads.
Automate infrastructure provisioning using Infrastructure as Code tools such as Terraform and CloudFormation.
Operate and scale our Kubernetes platform, including autoscaling, ingress, and multi-tenant isolation for enterprise customers.
Zingtree is a next-generation intelligent process automation platform reimagining customer experience operations for enterprise support leaders. It is a small team with high ownership, emphasizing automation, collaboration, and transparency.
Own and evolve the cloud substrate including compute, EKS fleet, networking, and cloud operations across AWS and GCP.
Design and maintain the networking fabric connecting Webflow's services, ensuring reliability, security, and scalability.
Build and enforce guardrails around IAM and permissions to keep infrastructure secure and auditable while driving FinOps and cost optimization.
Webflow is building the world's leading AI-native Digital Experience Platform. As a remote-first company built on trust and creativity, it empowers over 2 million users globally to design, launch, and optimize for the web without barriers.
Design, develop, and maintain core infrastructure supporting large-scale optimization engines and planning workflows to improve scalability and performance.
Analyze and optimize performance bottlenecks in optimization pipelines, focusing on compute, memory usage, and data flow for complex planning problems.
Contribute to evolving platform architecture, designing systems for large datasets and parallel execution while ensuring enterprise-grade reliability and maintainability.
Kinaxis is a global leader in modern supply chain orchestration, providing an AI-powered platform for end-to-end supply chain transparency and faster decision-making. The company has over 2,000 employees globally, is a multi-time Top Employer award winner, and fosters a culture of innovation with a serious focus on technology, customers, and a collaborative, not-too-serious internal environment.
Design, develop, test, and deploy secure production systems across cloud-native and edge appliance deployments.
Embed with cross-functional teams to advise on infrastructure and security best practices that perform in production.
Own end-to-end infrastructure outcomes for critical programs and harden the artifact pipeline for consistent builds across all deployments.
Onebrief develops collaboration and AI-powered workflow software specifically designed for military staffs to enhance their efficiency and effectiveness. The company is fully remote, employs a team of veterans and technologists, and is valued at $2.15B with significant funding from top-tier investors. It fosters a culture of ownership, excellence, and serious teamwork.
Maintain and optimize AWS EC2 and EKS clusters to ensure high availability and performance.
Lead troubleshooting of production outages, providing timely resolution and root cause analysis.
Implement and improve CI/CD pipelines using tools like Jenkins and GitHub Actions to streamline deployment processes.
CI&T are tech transformation specialists uniting human expertise with AI to create scalable tech solutions. With over 8,000 CI&Ters globally, they have built partnerships with more than 1,000 clients over 30 years, and Artificial Intelligence is deeply embedded in their work reality.
Own and operate GPU and accelerator clusters for AI training, inference, and experimentation, ensuring reliability and cost-efficiency.
Build and optimize scheduling, orchestration, and serving systems using frameworks like vLLM and Triton to improve latency, throughput, and memory efficiency.
Partner with ML engineers to remove workflow bottlenecks and build observability for GPU utilization, capacity, and incident response.
Kraken is a crypto exchange platform building premium financial products for traders and institutions, accelerating global crypto adoption. It is a mission-driven, fully remote company with a world-class team of crypto experts spread across more than 70 countries.
Support the Platform Infrastructure by managing container environments on EKS, implementing GitOps workflows, and maintaining CI/CD pipelines.
Build for Reliability by defining SLIs/SLOs, leading incident response, and contributing to disaster recovery planning.
Drive Observability by designing and maintaining monitoring and logging stacks with Datadog, Sentry, and CloudWatch.
Turquoise Health is a Series C price transparency platform for finance leaders across healthcare, building the infrastructure for a more open, efficient healthcare marketplace. The company is a remote-first, US-based team serving over 300 enterprise organizations, and it values transparency, empathy, inclusivity, creativity, and ownership.
Own and scale cloud infrastructure on AWS, managing Kubernetes clusters and container orchestration end-to-end.
Build and maintain CI/CD pipelines, implement monitoring and observability stacks, and improve system reliability and security.
Automate infrastructure with IaC tools, debug complex distributed systems, and participate in design reviews to raise the infrastructure bar.
Bespoke Labs is an AI research and data company building the datasets, benchmarks, and evaluation infrastructure that power frontier AI models. It is a small, fast-moving team backed by leading investors, trusted by top AI labs, and publishes research at leading conferences.
Explore the infrastructure architecture, assess risks, and build and maintain the platform roadmap to prioritize critical work.
Lead cost management and optimization efforts across infrastructure, including reducing high proxy costs, while maintaining performance and security.
Ensure infrastructure reliability, scalability, and security through robust monitoring, alerting, and disaster recovery strategies, and lead the consolidation of two infrastructure stacks into one unified platform.
saas.group is a portfolio company that acquires and scales B2B SaaS businesses, accelerating their growth and potential. With a fully remote team of nearly 400 employees across 50+ countries, it fosters a collaborative, innovative, and dynamic culture focused on building the world's largest platform of independent SaaS companies.