Job Description
Design, build, and maintain ETL/ELT pipelines in Databricks to ingest, clean, and transform data from diverse product sources. Construct gold-layer tables in the Lakehouse architecture that serve both machine learning model training and real-time APIs. Monitor data quality, lineage, and reliability using Databricks best practices.
Collaborate with AI/ML teams to ensure data is modeled and structured to support natural language prompts and semantic retrieval, using first- and third-party data sources, vector search, and Unity Catalog metadata.
Work with backend engineers to design and implement serverless APIs (e.g., via AWS Lambda with TypeScript) that expose gold tables to frontend applications. Ensure APIs are performant, scalable, and designed with data security and compliance in mind.
About CME
We are a multinational technology consulting firm. We help companies and corporations scale their operations, achieve technology innovation, elevate their brands, and transform their business models.