Own performance optimization and reliability of large-scale GPU clusters and InfiniBand networking for HPC workloads.
Diagnose and resolve complex system-level issues across GPU, network, and compute layers, and integrate new hardware components.
Develop automation for monitoring, fault detection, and proactive remediation in distributed compute environments.
Our partner is building a next-generation AI cloud infrastructure environment, focusing on large-scale high-performance computing systems. They foster a highly technical engineering culture with experts across systems, networking, and virtualization, offering career development and continuous learning opportunities.
Operate and maintain large-scale Linux environments (bare metal, clusters, cloud) and monitor system health to ensure high availability.
Help scale clusters to hundreds or thousands of nodes, improving performance, reliability, and resource utilization.
Automate operational tasks using Python, Bash, Ansible, or Terraform and contribute to system design and architecture decisions.
Mistral AI builds high-performance, open, and efficient AI systems to power next-generation applications. We are a collaborative, low-ego, and highly technical team operating across Europe, the US, and beyond, scaling rapidly to support thousands of nodes.
Design and implement cloud infrastructure on GCP using infrastructure as code.
Manage cloud networking, compute resources, and CI/CD pipelines for reliable deployments.
Implement security, observability, and compliance controls in a regulated research environment.
RefinedScience advances care by integrating clinical and biological data with expert knowledge to improve clinical trial outcomes. It is a small to midsize company that values acting with purpose, curiosity, ownership, relationships, and agility.
Lead end-to-end technical delivery for client-facing scientific and AI-driven projects.
Design, build, and deploy scalable software systems that extend or wrap scientific models.
Act as a technical liaison between researchers, product teams, and engineering stakeholders.
This role sits at the intersection of advanced AI systems, scientific computing, and real-world drug discovery applications, where engineering directly enables breakthroughs in life sciences. The position requires strong autonomy and the ability to operate in highly ambiguous, research-driven environments where experimentation and execution go hand in hand.
Design, implement, and support critical high-transaction-volume systems with direct customer, revenue, or compliance impact.
Analyze and resolve system problems, identify root causes, and execute remediation plans.
Mentor less experienced engineers and contribute to compliance activities such as reporting and verification.
Mercury Insurance helps people reduce risk and overcome unexpected events. It is a midsize company with a culture that values team growth, diversity, and inclusion.
Develop and maintain scalable automation and integrations across cloud platforms and services.
Design, implement, and operate CI/CD pipelines using Jenkins, Dagger, Terraform, and Docker.
Build, operate, and troubleshoot workloads on Kubernetes, using Kustomize and Helm.
People Inc. is America’s largest digital and print publisher. Our brands harness the best intent-driven content, the fastest sites, and the fewest ads to help nearly 200 million people every month make decisions.