Job Description
Key responsibilities:
- Architect batch and stream pipelines using Airflow, Kafka, and dbt for diverse structured and unstructured market data, and provide reusable SDKs in Python and Go for internal data producers.
- Implement and tune S3-based, column-oriented, and time-series data storage for petabyte-scale analytics, owning partitioning, compression, TTL, versioning, and cost optimization.
- Develop internal libraries for schema management, data contracts, validation, and lineage.
- Partner with Data Science, Quant Research, Backend, and DevOps teams to translate requirements into platform capabilities and evangelize best practices.
About
They are a proprietary algorithmic trading firm of 200+ professionals with a strong emphasis on technology, operating as a fully remote organization.