Source Job

US

  • Lead the design and implementation of high-performance data movement pipelines using NVIDIA NIXL across GPU, CPU, and storage tiers.
  • Architect and drive integration of DDN Infinia with GPU-accelerated inference platforms for large-scale, real-time AI workloads.
  • Own end-to-end optimization of I/O paths between GPU memory and storage using technologies such as NVIDIA GPUDirect Storage, RDMA, and NVMe-over-Fabrics.

Python C++

20 jobs similar to Senior Staff Engineer

Jobs ranked by similarity.

US

  • Design and implement scalable distributed systems that handle heavy CPU, disk, and network workloads.
  • Analyze system behavior to identify bottlenecks across compute, storage, and network layers.
  • Build instrumentation, metrics, and telemetry to measure system performance.

RapidFort is a Series A cybersecurity company backed by $42M from leading investors, building the next generation of container and software supply-chain security. Our platform helps enterprises and U.S. government agencies eliminate vulnerabilities in container images, secure Kubernetes environments, and protect cloud-native infrastructure at runtime.

US

  • Drive embedded quality engineering for our S3-compliant high-performance file system.
  • Collaborate with development teams and individuals worldwide to deeply understand DDN products.
  • Design and own the quality strategy covering correctness, performance, and reliability across the S3 stack.

DataDirect Networks (DDN) is a global market leader renowned for powering many of the world's most demanding AI data centers. DDN's cutting-edge data intelligence platform is designed to accelerate AI workloads, enabling organizations to extract maximum value from their data.

Global

  • Design and evolve multi-provider, multi-region GPU compute clusters optimized for large-scale training.
  • Serve as the primary technical point of contact for customers running large-scale training workloads.
  • Build production-grade automation for cluster provisioning, GPU health checks, job scheduling, self-healing, and firmware/driver lifecycle management.

Andromeda Cluster gives early-stage startups access to scaled AI infrastructure. They work with leading AI labs, data centers, and cloud providers to deliver compute when and where it’s needed most and are expanding to find the brightest in AI infrastructure, research and engineering.

US

  • Design and develop multi-threaded asynchronous replication systems with parallel streaming capabilities.
  • Build object-level delta replication with checkpointing and resume functionality.
  • Implement secure data transfer mechanisms using TLS 1.3 with mutual authentication.

DataDirect Networks (DDN) is a global market leader renowned for powering many of the world's most demanding AI data centers, in industries ranging from life sciences and healthcare to financial services, autonomous cars, Government, academia, research and manufacturing. DDN's cutting-edge data intelligence platform is designed to accelerate AI workloads, enabling organizations to extract maximum value from their data.

US

  • Lead, coach, and develop a team of software engineers focusing on Lustre and data management capabilities.
  • Manage technical execution for software releases, including planning and quality assurance.
  • Drive architecture and design reviews for filesystem features and operational tooling.

We are currently looking for a Storage Engineering Manager - REMOTE. The ideal candidate will foster an environment of innovation and efficiency while managing the team's performance and development.

$141,000–$242,000/yr
Canada US Unlimited PTO

  • Act as a versatile engineering powerhouse, quickly ramping up on new domains and fluidly transitioning between backend pipelines, API design, and vehicle-side serving.
  • Be part of a multidisciplinary team of Engineers and Research Scientists using an AI-first approach to create high-definition (HD) maps at scale.
  • Architect and build the engineering foundations of our long-term mapping infrastructure.

Waabi, founded by AI visionary Raquel Urtasun, is the leader in Physical AI. With a world-class team, they're unlocking the next era of autonomous transportation with technology that's powering commercial autonomous trucks and robotaxis.

US

  • Design, build, and maintain data products that support R&D, analytics, Lab, and scientific workflows.
  • Build and maintain data pipelines for large and complex datasets ensuring high data quality.
  • Partner with scientists and engineers to translate research needs into reusable data assets.

Natera is a global leader in cell-free DNA (cfDNA) testing, dedicated to oncology, women’s health, and organ health. They aim to make personalized genetic testing and diagnostics part of the standard of care to protect health and enable earlier and more targeted interventions that lead to longer, healthier lives.

$98,000–$150,000/yr
US

  • Lead a team of Data Center Engineers and all related activities in support of QA and R&D operations.
  • Oversee data center operations and physical infrastructure, driving uptime, operational excellence, and team performance.
  • Manage capacity planning and resource allocation for power, cooling, and rack space to support current and future hardware deployments.

DDN Storage is a global market leader renowned for powering many of the world's most demanding AI data centers, in industries ranging from life sciences and healthcare to financial services, autonomous cars, Government, academia, research and manufacturing. They are committed to innovation, customer-centricity, and a team of passionate professionals.

Global

  • Architect the algorithms, execution strategies, and infrastructure that power solver competition, quote aggregation, and intent-ready execution workflows.
  • Lead deep code and design reviews, make high-impact architectural decisions, and uphold the highest engineering standards.
  • Work with product, research, and ecosystem partners to align Solver capabilities with the NEAR Intents roadmap.

Defuse Labs develops NEAR Intents to enable seamless cross-chain interactions in an automated world — connecting AI, services, and financial applications. With expertise in AI, cryptography, and decentralized finance, our team is redefining how intelligent agents interact across blockchain networks.

$205,000–$220,000/yr
US

  • Partner with Sales and Field Engineering to design and architect complex, enterprise-grade solutions tailored to customer needs.
  • Lead the implementation of custom solutions within customer environments across multi-cloud and hybrid architectures.
  • Optimize solutions for performance, scalability, and reliability in production environments.

Striim is a unified data integration and streaming platform that connects clouds, data, and applications. We believe and expect all of our employees to operate as one with unlimited potential and dignity.

Global

  • Own Technical Excellence: Define architecture, design patterns, and engineering standards for the feature store platform, ensuring code quality, reliability, and performance.
  • Lead V2 Implementation: Architect and execute the next generation of our feature store, building for scale, low-latency serving, and enterprise-grade reliability.
  • Guide Product Roadmap: Partner with Product and leadership to shape the technical roadmap, translating customer requirements and market trends into actionable engineering priorities.

Redis builds the product that runs the fast apps our world runs on. Our feature store enables organizations to manage, serve, and monitor features at scale, delivering a foundation for mission-critical machine learning workloads. We value curiosity, diversity of thought, and innovation, and we're committed to a diverse and inclusive work environment where all employees’ differences are celebrated and supported.

  • Responsible for deploying, configuring, managing and tuning the department’s storage systems at physical and virtual layers.
  • Deploy, configure and manage Storage Area Network Switches as required.
  • Shape the future direction of the department’s storage and data platforms.

The Department of Industry, Science and Resources provides ICT support and underpinning infrastructure services to the department and the Ministers’ Offices, ensuring all staff have access to stable and secure technology. The ICT Networks & Infrastructure Platforms section falls within the ICT Operations Branch.

US

  • Significantly impact the architecture and performance of a next-generation infrastructure.
  • Optimize distributed systems for real-time processing at scale.
  • Enhance the robustness and efficiency of core infrastructure components.

Jobgether is a platform that uses an AI-powered matching process to ensure applications are reviewed quickly, objectively, and fairly against the role's core requirements. The system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company.

$195,000–$215,000/yr
US 12w maternity

  • Lead engineering teams to ship high-quality features and maintain the Grantmaker platform, owning timelines and quality.
  • Build and inspire a high-performing distributed engineering team, setting clear expectations and fostering a motivating environment.
  • Drive architectural consistency, cross-team collaboration, and operational excellence, shaping architecture and strategy.

Fluxx is a mission-driven business that operates a cloud platform, enabling the end-to-end grantmaking process for funders and doers. They are committed to building a team of outstanding individuals with diverse backgrounds and perspectives and do not specify employee numbers.

US

  • Build a high performing team by hiring and nurturing engineering talent.
  • Drive technical solutioning and building roadmaps.
  • Work closely with engineering leaders to drive engineering excellence in our processes and systems.

Aledade empowers primary care physicians with technology to keep their patients healthy, preventing unnecessary hospitalizations. They are a technology company that helps primary care doctors deliver better care at a lower cost.

$160,000–$185,000/yr
US

  • Lead strategic partnerships with GPU and AI ecosystem leaders.
  • Develop and execute joint go-to-market strategies.
  • Track semiconductor and AI industry trends.

Cologix is a leading North American network-neutral interconnection and hyperscale edge data center company. Their platform gives customers access to digital edge and Scalelogix hyperscale edge data centers. Backed by one of the largest North American infrastructure funds, Cologix's leadership team values their people, environment and clients.

US 4w PTO

  • Guide complex initiatives from initial requirements gathering and robust system design to deployment.
  • Build scalable data infrastructure and shape the developer experience.
  • Provide technical guidance to both engineering and research staff.

Voleon is a technology company that applies state-of-the-art AI and machine learning techniques to real-world problems in finance. They have become a multibillion-dollar asset manager, and they have ambitious goals for the future.

$225,000–$255,000/yr
US 4w PTO

  • Design and implement distributed scheduling and workflow systems.
  • Build scalable, reliable platform services and storage abstractions.
  • Improve system reliability, observability, and operational performance.

Voleon is a technology company that applies state-of-the-art AI and machine learning techniques to real-world problems in finance. They have become a multibillion-dollar asset manager, and they have ambitious goals for the future.

$250,000–$275,000/yr
US Unlimited PTO 2w maternity 2w paternity

  • Lead and support multiple engineering teams, including very senior engineers, ensuring consistent delivery, strong technical quality, and healthy team operations.
  • Define and guide technical architecture and long-term system strategy across the teams within your scope and in collaboration with EMEA engineering leadership.
  • Own and manage technical roadmaps jointly with the VP of Product, VP of Engineering and Director of Engineering in EMEA.

CoderPad is on a mission to fix the technical interview process. Since launching in 2013, it has served over 3,800 customers, hosted more than 4 million technical interviews in 90+ programming languages, and become a leading interview platform.

Europe

  • Own and evolve ClickHouse's Python connector and SDK ecosystem.
  • Build and maintain integrations with orchestration platforms and transformation tools.
  • Drive the AI/LLM integration strategy.

ClickHouse is a real-time analytics company that leads the market in real-time analytics, data warehousing, observability, and AI workloads. They have over 3,000 customers with accelerating momentum, validated by a $400M Series D financing round.