Architect batch and stream pipelines (Airflow, Kafka, dbt) for diverse structured and unstructured market data. Provide reusable SDKs in Python and Go for internal data producers. Implement and tune S3-based, column-oriented, and time-series data storage for petabyte-scale analytics; own partitioning, compression, TTL, versioning, and cost optimisation. Develop internal libraries for schema management, data contracts, validation, and lineage.
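The "data contracts and validation" responsibility above might look like the following minimal sketch: a contract is a list of typed field specifications, and a validator returns the violations for a single record. The `Field` class, the `validate` helper, and the trade-record fields are illustrative assumptions, not any specific company's API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Field:
    """One field in a data contract: a name, an expected type, and nullability."""
    name: str
    dtype: type
    nullable: bool = False

def validate(record: dict, contract: list[Field]) -> list[str]:
    """Return a list of contract violations for one record (empty list = valid)."""
    errors = []
    for f in contract:
        if f.name not in record:
            errors.append(f"missing field: {f.name}")
            continue
        value = record[f.name]
        if value is None:
            if not f.nullable:
                errors.append(f"null not allowed: {f.name}")
        elif not isinstance(value, f.dtype):
            errors.append(
                f"{f.name}: expected {f.dtype.__name__}, got {type(value).__name__}"
            )
    return errors

# Hypothetical market-data contract for a trade record.
contract = [Field("trade_id", str), Field("price", float), Field("ts", int)]

print(validate({"trade_id": "t1", "price": 101.5, "ts": 1700000000}, contract))  # []
print(validate({"trade_id": "t1", "price": "101.5"}, contract))  # two violations
```

In a production pipeline a check like this would typically run at the producer boundary (inside the SDK), so that violations are rejected or quarantined before records reach downstream storage.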
Job listings
Lead the design and evolution of Reddit’s Data Pipeline to support scale and growth. Drive high-impact projects aligned with Reddit’s engineering and business goals. Build a high-performing team, mentor engineers, and establish best practices for data production and governance. Ensure pipeline reliability and collaborate with teams like Ads and ML.
Help lead StackAdapt's growing backend engineering team. Architect scalable, low-latency backend systems and data pipelines, writing code as needed to support the team. Lead and mentor talented engineers on the backend distributed-systems team, making a positive impact on the team's productivity and growth.
You'll be part of a big group of makers, breakers, doers and disruptors who solve real problems and meet real customer needs. We are seeking Software & Data Engineers who are passionate about marrying data with emerging technologies. You'll have the opportunity to be at the forefront of building a modern enterprise data platform that powers over $60 billion in annual transaction volume across more than 100 million active payment cards.
Design, build, and maintain scalable data processing systems and analytics platforms. Lead our efforts to create robust, in-house data infrastructure solutions to support our growing data needs and business intelligence requirements. Design and implement efficient data storage and processing solutions for large-scale datasets; architect a new data processing framework to replace existing third-party solutions.
Work with a team of engineers to build first-party and third-party data integrations with external data sources. Build and architect scalable, low-latency backend systems and big-data pipelines. Provide technical guidance on designing scalable solutions and best practices. Make a positive impact on the team's productivity and growth.