Develop engineering expertise within the Dataiku Platform to help maintain and develop system integrations, platform automations, and platform configurations.
Build and maintain Python and SQL data replication and data pipelines on large, often complex data sets.
Identify opportunities for improvement and optimization for greater scalability and delivery velocity.
Build pipelines to load data from various systems into Dataiku via S3 or Snowflake.
Increase the robustness of existing production pipelines, identify bottlenecks, and set up robust monitoring, testing processes, and documentation templates.
Build custom applications and integrations to automate manual tasks related to customer operations, helping Product Operations, Support, and SRE in their day-to-day activities.
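Loading data into a platform like Dataiku via S3 or Snowflake typically means staging records in a format both can ingest. A minimal stdlib sketch, assuming gzipped newline-delimited JSON as the staging format (the function name and format choice are illustrative, not Dataiku's API):

```python
import gzip
import json
from typing import Iterable

def to_ndjson_gz(records: Iterable[dict]) -> bytes:
    """Serialize records as gzipped newline-delimited JSON, a format
    S3-based ingestion and Snowflake's COPY INTO both accept."""
    lines = "\n".join(json.dumps(r, sort_keys=True) for r in records)
    return gzip.compress(lines.encode("utf-8"))

# The resulting bytes could then be staged to S3 (e.g. with boto3's
# put_object) for the downstream platform to pick up.
```

The actual upload step is left out so the sketch stays dependency-free; in practice it would be a single `put_object` call against the staging bucket.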
Dataiku is the Platform for AI Success, the enterprise orchestration layer for building, deploying, and governing AI. The world’s leading companies rely on Dataiku to operationalize AI and run it as a true business performance engine delivering measurable value.
Design, build, and maintain scalable data pipelines for clients across industries.
Architect and optimize cloud data warehouse solutions, adapting to each client's stack.
Collaborate with analysts and data scientists to ensure data is clean, reliable, and well-modeled.
NuView Analytics helps companies accelerate the time to insights from their data through data analytics, diligence, and fractional data science. They are a growth-stage company looking to drive additional value from the data they are sitting on, and they value humility, intellectual rigor, and stewardship.
Primarily responsible for analyzing data integrity challenges and performing root cause analysis.
Craft client code that is efficient, performant, testable, scalable, and secure.
Actively participate in agile software development, including daily stand-ups and sprint planning.
3Pillar is a company where senior software engineers can collaborate with industry leaders and spearhead transformative projects that redefine urban living, establish new media channels, or drive innovation in healthcare. They are a global team that values well-being and offers flexible work environments.
Design and implement scalable data architectures to support business needs.
Build and optimize data pipelines, ensuring data accessibility and security.
Develop and maintain data models, databases, and data lakes, with robust data governance.
Terawatt Infrastructure delivers large scale, turnkey charging solutions for companies rapidly deploying AV and EV fleets. With a growing portfolio of sites across the US, Terawatt is building the permanent transportation and logistics infrastructure of tomorrow through capital, real estate, development, and site operations solutions.
You will join a team of talented engineers working closely with Data Scientists to build and scale our next-generation Ad EnGage data pipeline.
You will work with large-scale datasets (hundreds of TBs to petabyte-scale systems) using a modern data stack centered on AWS, Airflow, dbt, and Snowflake.
You’ll contribute to building reliable, high-quality data pipelines and improving the performance, scalability, and observability of our data platform.
EDO is the TV outcomes company. Their leading measurement platform connects convergent TV airings to the ad-driven consumer behaviors most predictive of future sales. They are headquartered in New York City and Los Angeles with an office space in San Francisco and recognize the benefits of hybrid working.
Build and manage business data pipelines and transform Firefox telemetry data into structured datasets.
Partner with data scientists, product, and marketing teams to turn datasets into models and metrics.
Ensure data accuracy and performance using observability tools and resolve data issues.
Mozilla Corporation is a technology company backed by a non-profit that has shaped the internet, creating brands like Firefox. With millions of users globally, they focus on areas including AI and social media while remaining focused on making the internet better for people.
Design, build, and maintain scalable batch and real-time data pipelines that power analytics, experimentation, and machine learning.
Partner cross-functionally with analytics, product, engineering, and operations to deliver high-quality data solutions that drive measurable business impact.
Champion data quality, reliability, and observability by implementing best practices in testing, monitoring, lineage, and incident response.
Gopuff is reimagining how people purchase everyday essentials, from snacks to household goods to alcohol, all delivered in minutes. They are assembling a team of thinkers, dreamers and risk-takers who know the value of peace of mind in an unpredictable world.
Partner closely with business stakeholders to understand their challenges and design end-to-end architecture.
Design, develop, and own robust, efficient, and scalable data models in Snowflake and Iceberg using dbt and advanced SQL.
Build and manage reliable data pipelines and CI/CD workflows using tools like Airflow, Python, and Terraform.
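The core idea behind orchestration tools like Airflow is a DAG of tasks executed in dependency order. A minimal stdlib sketch of that ordering, using a hypothetical extract-transform-load pipeline (the task names are illustrative, not any real DAG):

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: task -> set of upstream dependencies,
# mirroring how an Airflow DAG wires extract >> transform >> load.
deps = {
    "extract_orders": set(),
    "extract_users": set(),
    "transform_dbt": {"extract_orders", "extract_users"},
    "load_snowflake": {"transform_dbt"},
}

def run_order(dependencies):
    """Return a valid execution order for the pipeline's tasks."""
    return list(TopologicalSorter(dependencies).static_order())
```

An orchestrator adds scheduling, retries, and parallelism on top, but the dependency resolution it performs is exactly this topological sort.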
Motive empowers people who run physical operations with tools to make their work safer, more productive, and more profitable. Motive serves nearly 100,000 customers and provides complete visibility and control across a wide range of industries.
Collaborate with stakeholders to build robust services using data pipeline and ETL tools and the Snowflake data warehouse.
Translate advanced business data and analytics problems into technical approaches that yield actionable recommendations.
Communicate results and educate others through visualizations, reports, and presentations.
CNG Holdings, Inc. serves consumers by providing financial solutions which fill a need and deliver value. They strive to make a difference in their customers’ lives and the communities they serve.
Own organizational-wide data architecture, defining standards and designs.
Design and develop data pipelines, integrations, and platform features.
Partner with product managers to define new data features and capabilities.
They offer a connected equipment platform for managing mixed assets. The company values quality, continuous learning, and collaboration within a dynamic team environment.
Design, build, and maintain scalable data infrastructure to support analytics and reporting across the organization.
Develop and operate ETL pipelines to ingest, transform, and deliver large-scale datasets.
Partner closely with Data Analysts and cross-functional stakeholders to provide reliable datasets and guide them in using data effectively.
Truelogic is a leading provider of nearshore staff augmentation services headquartered in New York. With over two decades of experience, they deliver top-tier technology solutions to companies of all sizes. Their team of 600+ highly skilled tech professionals, based in Latin America, drives digital disruption by partnering with U.S. companies in their projects.
Lead the architecture and evolution of scalable, distributed data pipelines, ensuring high availability and performance at scale.
Build and maintain distributed web scraping systems using tools such as Playwright, Selenium, and BeautifulSoup.
Integrate AI and LLMs into engineering workflows for code generation, automation, and optimization.
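The extraction step in a scraping pipeline boils down to parsing HTML and pulling out structured fields. A minimal sketch using only the standard library's `html.parser` as a stand-in for BeautifulSoup (the class and function names are illustrative):

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect href targets from anchor tags -- a stdlib stand-in for
    the kind of extraction BeautifulSoup performs in a scraping job."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html: str) -> list:
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links
```

Browser-automation tools like Playwright or Selenium would supply the rendered HTML for pages that require JavaScript; the parsing stage afterwards looks the same.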
MercatorAI is building scalable data infrastructure to power high-quality, data-driven decision making at scale. As an early-stage company, the team is focused on creating robust, future-ready systems that can handle complex data ingestion, transformation, and delivery across a growing national footprint.
Lead, manage, and mentor a group of data engineers.
Own the design and development of data pipelines and systems.
Partner cross-functionally with Data Science and Product managers.
TrueML is a mission-driven financial software company that aims to create better customer experiences for distressed borrowers. The TrueML team includes inspired data scientists, financial services industry experts and customer experience fanatics building technology to serve people.
Architect, design, implement, and operate end-to-end data engineering solutions using Agile methodology.
Develop and manage robust data integrations with external vendors and organizations (including complex API integrations).
Collaborate closely with Data Analysts, Data Scientists, DBAs, and cross-functional teams to understand requirements and deliver high-impact data solutions.
SmartAsset is an online destination for consumer-focused financial information and advice, whose mission is helping people make smart financial decisions, reaching an estimated 59 million people each month. A successful $110 million Series D funding round in 2021 valued the company at over $1 billion.
Create innovative solutions for handling petabytes of data with billions of rows and joins.
Create real-time and offline feature generation pipelines, keeping our data infrastructure reliable and fast!
Develop and productionize data pipelines for our ML models in both bare-metal and cloud environments.
Kayzen is a mobile demand-side platform (DSP) dedicated to democratizing programmatic advertising. They enable leading apps, agencies, media buyers, and brands to run programmatic customer acquisition, retargeting, and brand performance campaigns through their self-serve and managed service options.
Lead and grow a team of data engineers, providing mentorship and technical guidance.
Own execution of customer integrations across multiple product lines, ensuring on-time delivery.
Improve data quality and pipeline reliability by investing in better alerting and resilience.
Afresh is the leading AI company in fresh food, partnering with grocers to order billions of dollars of fresh food. They are on a mission to eliminate food waste and make fresh food accessible to all, and have saved 200M lbs of food waste in 2025 alone.
Enable self-service analytics for all team members by designing clean, intuitive data models and metrics through dbt, empowering employees to make informed, data-driven decisions.
Develop and refine custom data pipelines that ingest data from operational systems to our analytics platform, handling both streaming and batch data using third-party tooling and home-grown solutions.
Maintain and optimize the data platform infrastructure, focusing on data quality, ELT efficiency, and platform hygiene.
Auto Integrate makes leased vehicle maintenance frictionless for millions of customers in the USA and Canada. The business is managed by a small, global team within Fleetio, combining the resources of a scaled SaaS company with the agility of a niche market leader.
Organize and structure data systems at both macro and micro levels, designing and implementing data architectures that support business goals.
Optimize data pipelines for performance, reliability, and scalability.
Design, build, and maintain scalable ETL/ELT pipelines with Airflow to process large-scale, complex datasets.
Demonstrate the ability to deliver data products useful for machine learning and AI research and development (data models, metadata, and semantics).
Owkin is an AI company on a mission to solve the complexity of biology. It is building the first Biology Super Intelligence (BASI) by combining powerful biological large language models, multimodal patient data, and agentic software.
Design, build, and maintain databases that power Hologram's operations.
Build and maintain ETL pipelines that move and transform data reliably.
Audit existing pipelines and data models, identify complexity, and refactor bad decisions.
Hologram is building the future of IoT connectivity, delivering internet access to millions of connected devices worldwide. They process over 5 billion transactions per month across their global infrastructure and values a fun, upbeat, and remote-first team united by their mission.
Design, build, and maintain production data pipelines for multi-phase algorithmic workflows using Python with Prefect, Airflow, Jenkins, or another orchestration framework.
Build and optimize advanced SQL transformations in Snowflake, including window functions, CTEs, stored procedures, UDFs, and semi-structured data processing.
Build and maintain dbt models for data transformation, identity resolution, and slowly changing dimension (SCD Type 2) tracking across 80+ models and multiple pipeline stages.
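Slowly changing dimension (SCD Type 2) tracking means closing the current version of a row and opening a new one whenever an attribute changes, so history is preserved. A minimal plain-Python sketch of the versioning logic a dbt snapshot automates (the function and field names are illustrative, not dbt's schema):

```python
def apply_scd2(history, key, new_attrs, effective_date):
    """Apply a Slowly Changing Dimension Type 2 update.
    `history` is a list of dicts with key, attrs, valid_from, valid_to;
    the open (current) row for a key has valid_to == None."""
    current = next((r for r in history
                    if r["key"] == key and r["valid_to"] is None), None)
    if current and current["attrs"] == new_attrs:
        return history  # no change: keep the open row as-is
    if current:
        current["valid_to"] = effective_date  # close the old version
    history.append({"key": key, "attrs": new_attrs,
                    "valid_from": effective_date, "valid_to": None})
    return history
```

In SQL this becomes an update-then-insert merge keyed on the natural key plus an open-row predicate; dbt snapshots generate exactly that pattern.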
Kalibri helps to redefine and rebuild the hotel industry. They are looking for passionate, energetic, and hardworking people with an entrepreneurial spirit, who dream big and challenge the status quo; their team is working on cutting-edge solutions for the industry.