Job Description
Seeking a highly skilled Machine Learning Engineer to join our advanced model development team. This role focuses on pre-training, continued training, and post-training of models, with a particular emphasis on draft model optimization for speculative decoding and quantization-aware training (QAT). The ideal candidate has deep experience with training methodologies, open-weight models, and performance tuning for inference.
Lead pre-training and post-training efforts for draft models tailored to speculative decoding architectures.
Conduct continued training and post-training of open-weight models for non-draft (standard) inference scenarios.
Implement and optimize quantization-aware training pipelines to enable low-precision inference with minimal accuracy loss.
Collaborate with model architecture, inference, and systems teams to evaluate model readiness across training and deployment stages.
Develop tooling and evaluation metrics for training effectiveness, draft model fidelity, and speculative hit-rate optimization.
Contribute to experimental designs for novel training regimes and speculative decoding strategies.
About Groq
Groq delivers fast, efficient AI inference, and its LPU-based system powers GroqCloud™, giving businesses and developers the speed and scale they need.