Senior Storage Infrastructure Engineer

Lightning AI

Remote regions

US

Salary range

$180,000–$200,000/yr

Storage Systems & Infrastructure:

  • Operate and scale distributed storage systems such as VAST and S3-compatible object storage.
  • Improve performance and reliability for large-scale AI/ML training and inference workloads.
  • Troubleshoot complex storage and data path issues across hardware and software layers.

Automation & Tooling:

  • Build and maintain Python-based automation for provisioning and monitoring storage.
  • Develop tools to reduce manual operational overhead and improve lifecycle management.
  • Enhance workflows for deployment, maintenance, and scaling of storage clusters.

Systems & Operations:

  • Manage Linux-based systems in production bare-metal environments.
  • Partner with data center teams on hardware bring-up, upgrades, and issue resolution.
  • Support capacity planning and use monitoring data to guide performance tuning.

Cross-Functional Collaboration:

  • Work with Infrastructure and Platform teams to integrate storage into the broader platform.
  • Contribute to design discussions for new infrastructure deployments and scaling strategies.
  • Help define best practices for storage in high-performance computing environments.

Lightning AI

Lightning AI builds an end-to-end platform for developing, training, and deploying AI systems, designed to take ideas from research to production. The company operates globally and values speed, focus, balance, craftsmanship, and minimalism. It is backed by top-tier investors.
