Job Description

You will lead the development and management of high-quality data labeling pipelines that support our large language models. This role involves building an internal labeling team, working closely with vendors, and designing scalable processes for data annotation. While the position does not include customer-facing responsibilities, your work will be critical to the success of our AI models, ensuring that they are trained on top-tier labeled data gathered through crowdsourcing and other data collection techniques.

The goal of this role is to build and optimize scalable data labeling pipelines that power the success of our machine learning models. You will design, develop, and implement data labeling pipelines that integrate into model training workflows, and you will manage and expand the internal data labeling team to meet the company's growing needs. You will collaborate with external vendors to source and manage crowdsourced data labeling efforts, ensuring timely, high-quality delivery. You will monitor and improve labeling processes by conducting experiments, and you will set up metrics and QA processes to evaluate the quality of labeled data. You will work cross-functionally with researchers and engineers to align labeling pipelines with model training needs, and you will identify new tools and technologies to streamline labeling processes.

About Poolside

poolside exists to build a world where AI will be the engine behind economically valuable work and scientific progress.
