Modern AI workloads are among the most compute-intensive in the industry, and they are where our hardware's energy efficiency provides the greatest advantage.
We are seeking a Research Engineer to push the boundaries of AI model capability, quality, and efficiency.

Key Responsibilities:

Algorithmic Acceleration: Research and implement techniques like quantization, sparsity, distillation, speculative decoding, and caching.
Hardware Co-Design: Partner with hardware and compiler teams to ensure algorithmic improvements translate to silicon gains.
Evaluation: Build profiling tools and benchmarking frameworks to measure model quality and efficiency metrics.

Qualifications:

5+ years of experience in ML research, applied ML, or ML systems.
Strong fundamentals in Python and PyTorch.
Hands-on experience with transformers, diffusion models, and fine-tuning large models.

Nice to Have:

Experience with efficient inference techniques such as KV cache optimization and MoE routing.
Background in hardware-aware ML optimization or quantization.

EnCharge AI

EnCharge AI is building the next-generation AI platform using a novel in-memory computing architecture. The team consists of experienced AI researchers, silicon and systems engineers, and architects, backed by leading investors.
