Job Description
We are seeking an experienced Data Engineer with deep expertise in data transformation at scale, particularly in integrating and processing data from third-party public APIs. This role is critical to enhancing and maintaining data pipelines that feed into Natural Language Processing (NLP) models. The Data Engineer will design, build, and optimize scalable ETL/ELT data pipelines using Apache Spark, Apache Kafka, and orchestration tools such as Prefect or Airflow. The Engineer will also integrate external data sources and public APIs with internal data systems and work with large-scale datasets to support NLP model training and inference. In addition, the Engineer will analyze existing pipelines and recommend enhancements for performance, reliability, and scalability, and will collaborate with cross-functional teams, including data scientists and ML engineers, to own the end-to-end engineering process, from planning and technical design to implementation. The Engineer is also expected to report progress and outcomes regularly to client stakeholders.
About Gigster
Gigster's clients rely on its Network for two main areas: Software Development and Cloud Services.