Own reliability, latency, and performance for AI platform services and data infrastructure on AWS.
Design and maintain CI/CD pipelines, infrastructure-as-code, and observability frameworks across the stack.
Partner with AI and data engineers to ensure secure, cost-optimized, and scalable deployment of platform components.
HHAeXchange is the leading technology platform for home and community-based care, providing an end-to-end homecare solution for people who are aging or have disabilities. Founded in 2008, the company is passionate about transforming healthcare by connecting patients, providers, managed care organizations, and states.
Design and maintain scalable ML infrastructure including data pipelines, training workflows, and model deployment systems.
Own end-to-end ML lifecycle operations, ensuring reliable delivery of models into production at scale.
Implement monitoring, telemetry, and feedback loops for ML models running across large-scale device fleets.
Our partner company develops ML systems for connected hardware products used by customers worldwide. They operate in a fast-paced, product-driven environment with a collaborative and technically ambitious culture focused on real-world ML impact.
Own the ML serving API and deploy models to production with CI/CD and infrastructure as code.
Build monitoring, alerting, and reliability for NBA models and LLM agents.
Drive architectural decisions and mentor engineers on MLOps patterns.
Clutch is a vertical SaaS company backed by Andreessen Horowitz, revolutionizing how credit unions engage with members via fintech lending software. The company is small and ambitious, with a lean data team of five that values pragmatism and fast shipping.
Build and operate the ML lifecycle platform, including tooling for experiment tracking, model registry, and versioned pipelines.
Own CI/CD and deployment for ML workloads, building automated pipelines from notebook to production.
Make models observable and reliable in production with monitoring for latency, drift, data quality, and cost signals.
dv01 provides a data analytics platform for the structured finance market, offering transparency into investment performance and risk for lenders and Wall Street investors. With over 400 clients and coverage of over 100 million loans, dv01 is a data-first company with a diverse and innovative culture.
Own and scale AI compute and deployment platforms including Kubernetes and GitOps pipelines.
Build inference infrastructure and observability stacks for LLM-powered workflows.
Drive security, compliance, and governance at the systems level in a regulated healthcare environment.
Hims & Hers is a leading health and wellness platform focused on making healthcare accessible and personal. As a publicly traded company on the NYSE (HIMS), it offers flexible/remote work and a culture centered on innovation and employee well-being.
Collaborate with data scientists and engineers to build scalable ML pipelines, troubleshoot infrastructure issues from Linux to Kubernetes, and optimize model performance.
Drive high engineering standards, design on-premises MLOps solutions, and maintain tools for deployment and monitoring.
Refine CI/CD workflows, incorporate ML model training and evaluation into testing, and ensure seamless handover between research and production.
Learneo is a platform of builder-driven businesses, including Course Hero, CliffsNotes, LitCharts, Quillbot, Symbolab, and Scribbr, focused on supercharging productivity and learning. The company supports high-growth businesses with centralized corporate operations and has a virtual-first culture with employees across multiple countries.
Build and maintain infrastructure platforms for over 200 backend services running on Kubernetes clusters with 40,000+ cores.
Lead and mentor other engineers, own complex infrastructure failures, and participate in a shared on-call rotation.
Drive cloud cost efficiency, estimate schedules, and use AI tools as first-class collaborators in daily workflows.
Life360's mission is to keep people close to the ones they love through location sharing, safe driver reports, and crash detection. The company serves approximately 97.8 million monthly active users across more than 180 countries and has more than 500 remote-first employees.
Design and operate core AI platform components for training, deploying, and serving ML models at scale.
Own model serving and inference workflows end-to-end, optimizing for reliability, latency, throughput, and cost.
Collaborate with product, infrastructure, and security teams to build scalable platform capabilities for AI-powered features.
Mozilla Corporation is the non-profit-backed technology company behind Firefox and Pocket, with over 225 million monthly users. A wholly-owned subsidiary of the Mozilla Foundation, the company is mission-driven, employee-owned, and focused on privacy and open standards.
Take ownership of incident management and operational excellence across cloud infrastructure.
Automate high-risk manual processes and drive reliability gains through engineering.
Own a platform domain such as Temporal, observability, or Kubernetes operations.
Synthesia is the world’s leading AI video platform for business, used by over 90% of the Fortune 100. Founded in 2017, the company is headquartered in London with offices across Europe and the US, and has over $530 million in funding from premier investors like Accel and Nvidia's VC arm.
Define and document enterprise AI use cases, business value drivers, and target delivery models aligned with organizational goals.
Develop and maintain current-state and target-state AI architecture across the enterprise, including platforms, data flows, integration patterns, security controls, and governance.
Lead build-versus-buy evaluations for AI platforms and services, establish reusable architecture patterns, and guide proof-of-concept strategies.
ISC2 is the world's leading nonprofit member organization for cybersecurity professionals, dedicated to a safe and secure cyber world. With a globally recognized portfolio of certifications and a charitable arm, the organization fosters an inclusive culture built on integrity, advocacy, commitment, inclusion, and excellence.
Design and implement scalable infrastructure solutions using AWS services and Infrastructure as Code.
Build robust CI/CD pipelines with comprehensive testing and automated deployment capabilities.
Develop monitoring and alerting systems using CloudWatch, Splunk, and modern observability tools.
U.S. FinTech built and operates the largest and most advanced mortgage securitization platform in the world, supporting Fannie Mae and Freddie Mac. The company supports 70% of the market with a cloud-based platform and a team that combines financial expertise with technological innovation.
Build and operate production-grade model serving infrastructure using vLLM, TGI, or Triton frameworks.
Design and implement auto-scaling, multi-model architectures, and intelligent request routing for ML inference.
Optimize GPU utilization, memory efficiency, and observability to ensure low-latency, cost-effective systems.
They are a distributed cloud infrastructure startup building AI-native cloud services with GPU-powered compute. The company is well-funded, fast-scaling, and operates in a remote-first environment with a focus on sustainability and decentralization.
Design, build, and maintain CI/CD pipelines and Infrastructure as Code using tools like CloudFormation, Ansible, and Terraform.
Monitor and respond to infrastructure and application health, troubleshoot operational issues, and provide on-call support.
Maintain operational documentation, communicate proactively with teams, and ensure service delivery meets client expectations.
NICE Ltd. provides software used by 25,000+ global businesses, including 85 of the Fortune 100, to deliver customer experiences, fight financial crime, and ensure public safety. With over 8,500 employees across 30+ countries, NICE is recognized as a market leader in AI, cloud, and digital innovation.
Lead design and operation of internal developer platforms and self-service infrastructure.
Build and optimize CI/CD pipelines, deployment workflows, and automation across GitHub Actions, Jenkins, and ArgoCD.
Apply SRE principles to improve developer-facing systems and software delivery performance.
Versant is a media company owning iconic brands in news, sports, and entertainment, including USA Network, Fandango, and Rotten Tomatoes. It is an independent, publicly traded company with a collaborative, inclusive culture and a remote-first work environment.
Build and operate the real-time inference service for the risk decision engine with low latency and high availability.
Own model deployment infrastructure including CI/CD, shadow mode, and staged rollouts.
Build model observability and partner with Risk Data Science to operate models in production.
Mercury is a fintech company that provides banking services for startups via partner banks. The company is committed to creating a safe environment and values diversity, with a growing team focused on innovation.
Own and evolve observability strategy including monitoring, alerting, dashboards, logging, and distributed tracing.
Define and manage SLIs, SLOs, and reliability metrics, improving MTTD and MTTR through automation.
Build and maintain reliable cloud infrastructure on AWS and Kubernetes while mentoring engineers on SRE best practices.
Filevine is a Legal AI company delivering Legal Operating Intelligence for legal work. Fueled by a team of exceptional collaborators and innovators, Filevine’s rapid growth has earned AI awards and recognition from Deloitte and Inc. as one of the most innovative and fastest-growing technology companies in the country.
Collaborate with data scientists and software engineers to build scalable data pipelines and ML deployment systems.
Troubleshoot issues across the ML infrastructure stack, from Linux and Docker to Kubernetes and model serving.
Drive high engineering standards through code reviews, testing, and CI/CD enhancements.
Quillbot helps students and professionals strengthen their writing with AI-powered tools. We serve over 56 million users globally and foster a collaborative, virtual-first culture.
Design and develop CI/CD systems for websites, services, and release workflows, and operate an EKS-based Kubernetes platform.
Diagnose and debug production incidents, drive root-cause analysis, and implement improvements to enhance system reliability.
Write and maintain infrastructure as code using Pulumi or Terraform/OpenTofu across multiple AWS accounts with security-conscious practices.
Thunderbird is one of the world’s most trusted open-source email applications, empowering more than 20 million people globally. Our small but growing distributed team includes 65+ people across seven countries, and we build privacy-respecting communication tools with a collaborative, inclusive, and user-first spirit.
Assess current pipelines and data architecture to produce a prioritized plan for change.
Design durable data and ML systems grounded in customer needs with documented tradeoffs.
Harden pipelines, upgrade data architecture, and raise standards for observability and reliability.
FutureFit AI's core mission is to help more people get to better jobs faster and cheaper, with a focus on those facing barriers to opportunity. Their team of 30-50 across the US and Canada fosters a high-trust, high-intensity culture with a will to win.