Job Description

This role is responsible for all aspects of data collection to support our model training operations. We build high-quality datasets at petabyte scale and low cost through tight integration of infrastructure, engineering, and research work. You will be scrappy in finding new sources of audio data and bringing them into our ingestion pipeline. You will operate and extend the cloud infrastructure for that pipeline, which currently runs on GCP and is managed with Terraform. You will also collaborate closely with our Scientists to shift the cost/throughput/quality frontier, delivering richer data at larger scale and lower cost to power our next-generation models.

About Speechify

Speechify’s text-to-speech products turn PDFs, books, Google Docs, news articles, and websites into audio, helping 50 million people read faster and remember more.
