Responsibilities:
- Develop data pipelines to onboard data from APIs, databases, and flat files into the Azure/Databricks environment.
- De-duplicate large-scale datasets using deterministic and probabilistic matching techniques.
- Create data tagging frameworks for metadata classification to support governance and lineage tracking.
Qualifications:
- Requires 13+ years of overall IT experience, including 5+ years in data ingestion, de-duplication, and data tagging.
- Must have 2+ years of experience with Databricks or another Spark-based platform, and be fluent in a scripting language such as Python.
- Must be a US Citizen or Permanent Resident and able to obtain a Position of Public Trust clearance.
Additional Information:
- This is a remote, contract-to-hire position, with the work originating from Raleigh, NC.
- Experience with Azure Cloud operations, DevOps tools, and technologies such as SAS or Tableau is a plus.
Tier One Technologies
Tier One Technologies supports US Government clients with technology solutions and services. The company focuses on secure, large-scale data projects and seeks experienced professionals for specialized contract-to-hire roles.