Lead research efforts on generative video and audio models (e.g., text-to-speech, speech-to-speech, audio-to-expression, and other speech and multimodal AI topics). Work with the Applied ML team to help productionize our research, staying current with the latest advancements and innovating rapidly through prototyping. The ideal candidate works well in startup environments, is comfortable setting their own priorities, and is willing to take calculated risks.
They should have experience training deep learning models and building streaming text-to-speech or speech-to-speech models, along with a strong foundation in audio modeling.