Job Description
Design, develop, and maintain data pipelines that ingest, transform, and load data from various sources into data lakes, data warehouses, and data marts using Python, Spark, and cloud services to serve analytics needs and machine learning model requirements.
Contribute to the detailed design and architecture of the data platform, ensuring consistency, efficiency, and reusability of data components and processes.
Perform data cleansing and validation using SQL and PySpark to remove or correct erroneous, incomplete, or inconsistent data.
Apply data transformations and implement business logic using SQL, Python, and Spark to enhance, enrich, or standardize data.
Handle large-scale and complex data sets using distributed systems and parallel computing to improve pipeline performance.
Monitor, troubleshoot, and optimize the performance and cost of data pipelines.
Implement security and compliance standards and policies across data pipelines and data sets using techniques such as data masking, encryption, and tokenization (see the sketch following this list).
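As a rough illustration of the cleansing, validation, and masking work described above, a minimal PySpark sketch might look like the following. The column names, sample rows, and masking rule are hypothetical examples, not iCIMS code.

```python
# Illustrative sketch only: cleanse and validate raw records, then mask an
# email column before the data lands in a shared analytics layer.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("cleansing-sketch").getOrCreate()

# Hypothetical raw records ingested from a source system.
raw = spark.createDataFrame(
    [
        ("A-1", "  alice@example.com ", 120.0),
        ("A-2", None, 45.0),
        ("A-3", "bob@example.com", -15.0),
    ],
    ["order_id", "email", "amount"],
)

cleaned = (
    raw
    # Cleansing: trim stray whitespace from the email column.
    .withColumn("email", F.trim(F.col("email")))
    # Validation: drop incomplete rows and reject negative amounts.
    .filter(F.col("email").isNotNull() & (F.col("amount") >= 0))
    # Masking: hide the local part of the email address.
    .withColumn("email_masked", F.regexp_replace("email", r"^[^@]+", "***"))
    .drop("email")
)

cleaned.show(truncate=False)  # only valid rows survive, with masked emails
```

In a production pipeline, logic like this would typically run as a scheduled job against data in a lake or warehouse, and masking, encryption, or tokenization would usually rely on platform or governance services rather than an inline regular expression.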
About iCIMS
iCIMS is a talent acquisition software company.