The Opportunity:

  • Modern AI workloads are compute-intensive, and that is where the energy efficiency of our hardware provides an advantage.
  • We are seeking a Research Engineer to push the boundaries of AI model capability, quality, and efficiency.

Key Responsibilities:

  • Algorithmic Acceleration: Research and implement techniques like quantization, sparsity, distillation, speculative decoding, and caching.
  • Hardware Co-Design: Partner with hardware and compiler teams to ensure algorithmic improvements translate to silicon gains.
  • Evaluation: Build profiling tools and benchmarking frameworks to measure model quality and efficiency metrics.

Qualifications:

  • 5+ years of experience in ML research, applied ML, or ML systems.
  • Strong fundamentals in Python and PyTorch.
  • Hands-on experience with transformers, diffusion models, and fine-tuning large models.

Nice to Have:

  • Experience with efficient inference techniques such as KV cache optimization and MoE routing.
  • Background in hardware-aware ML optimization or quantization.

EnCharge AI

EnCharge AI is building the next-generation AI platform using a novel in-memory-computing architecture. The team consists of experienced AI researchers, silicon and systems engineers, and architects, backed by leading investors.
