Core Data Pipelines:

Design, implement, and maintain distributed ingestion pipelines for structured and unstructured data.
Build scalable ETL/ELT workflows to transform, validate, and enrich datasets for AI/ML model training and analytics.

Distributed Systems & Storage:

Architect pipelines across cloud object storage, data lakes, and metadata catalogs.
Optimize large-scale processing with distributed frameworks.
Implement partitioning, sharding, caching strategies, and observability for reliable pipelines.

Pretraining Data Processing:

Support preprocessing of unstructured assets for training pipelines, including format conversion, normalization, augmentation, and metadata extraction.
Implement validation and quality checks to ensure datasets meet ML training requirements.

Meshy

Meshy is a leading 3D generative AI company that is transforming content creation by generating 3D models from text and images. Its global team is distributed across North America, Asia, and Oceania, and it is backed by venture capital firms such as Sequoia and GGV, with $52 million in funding.
