Responsibilities:

  • Own cross-cloud data movement and delivery, including executing and monitoring large-scale transfers across AWS S3, Google Cloud Storage, Azure Blob, Snowflake, and customer environments using CLI tools.
  • Build structured data assembly and lightweight transformation workflows, using Python and SQL to join datasets, clean data, add derived columns, and validate CSV, Parquet, and database tables.
  • Operate internal pipelines with production discipline, leveraging Protege's Dagster-based platform to orchestrate data processing, maintain separation between workflows, and build scripts for filtering, manifest generation, validation, and recovery.
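The "structured data assembly" bullet above can be sketched in plain Python. This is a minimal, hypothetical example of the kind of lightweight transformation described: join two CSV datasets on a key, add a derived column, and validate the result. All schema names (`order_id`, `customer_id`, `unit_price`, etc.) are illustrative assumptions, not Protege's actual data model.

```python
import csv
import io

def join_and_derive(orders_csv: str, customers_csv: str) -> list[dict]:
    """Join orders to customers on customer_id, add a derived 'total'
    column, and validate that every order resolves to a known customer.
    Hypothetical schema for illustration only."""
    customers = {
        row["customer_id"]: row
        for row in csv.DictReader(io.StringIO(customers_csv))
    }
    joined = []
    for row in csv.DictReader(io.StringIO(orders_csv)):
        cust = customers.get(row["customer_id"])
        if cust is None:
            # validation step: fail loudly on a broken join key
            raise ValueError(f"order {row['order_id']} has unknown customer_id")
        joined.append({
            **row,
            "region": cust["region"],
            # derived column: unit price times quantity
            "total": float(row["unit_price"]) * int(row["quantity"]),
        })
    return joined

orders = "order_id,customer_id,unit_price,quantity\n1,c1,2.50,4\n2,c2,10.00,1\n"
customers = "customer_id,region\nc1,US\nc2,EU\n"
rows = join_and_derive(orders, customers)
```

In production this logic would typically live behind an orchestrator such as Dagster and read Parquet or database tables rather than inline strings, but the join-derive-validate shape is the same.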

Success Timeline:

  • In 30 days, build working knowledge of delivery patterns, environments, permissions models, and core tooling by shadowing live deliveries and becoming operational on standard workflows.
  • In 60 days, own scoped production work, running cross-cloud deliveries and light transformations with limited support, and improve runbooks, validation steps, and status communication.
  • In 90 days, independently own complex delivery workflows across multiple systems, reducing rework and surfacing platform improvements that cut operational risk and manual toil.

Team Culture:

  • Protege moves fast thoughtfully, with a bias toward action and continuous learning, operating as a lean, high-trust team where clarity and autonomy drive work.
  • The team takes its work seriously but not itself, solving hard problems with humility and celebrating wins. People are kind, direct, and inclusive, with frequent feedback to support growth.
  • Everyone is a hands-on builder focused on creating momentum, surrounded by people who care about impact, challenge thinking, and are excited about future developments.

Protege

Protege builds a secure platform for the efficient and privacy-centric exchange of AI training data, addressing a major challenge in AI development. It is a lean, high-trust team of builders backed by top investors, focused on velocity, impact, and shaping the future of data and AI.
