Source Job

US 4w PTO

  • Leverage test-driven development to deliver backend systems and user interfaces for healthcare data integration.
  • Design, implement, and maintain data models, ETL processes, and APIs for performance and scalability.
  • Contribute to automated testing suites and optimize data operations for integrity and security.

Python, Apache Spark, Data Modeling, ETL, AWS

20 jobs similar to Engineer II, Data

Jobs ranked by similarity.

Mexico

  • Contribute to the design and implementation of scalable data solutions.
  • Build and optimize batch and streaming ingestion pipelines.
  • Ensure data quality, reliability, and performance across pipelines and datasets.

Blend is an AI services provider that co-creates impact for clients through data science, AI, technology, and people. They aim to fuel bold visions by aligning human expertise with artificial intelligence, fostering innovation, and unlocking value for their clients.

$90,000–$120,000/yr
US 4w PTO

  • Design, build, and maintain scalable data pipelines using Python, Spark, and Airflow.
  • Collaborate cross-functionally with AI/ML and Product teams to implement new features.
  • Proactively identify and resolve bottlenecks in our complex ETL processes.

Sayari provides judgment infrastructure for trustworthy AI in economic security and commercial risk. They resolve primary-source records forming the ground truth of global commerce, and are headquartered in Washington, D.C., with offices in London, Singapore, Tokyo, and Tel Aviv.

$110,000–$125,000/yr
US Unlimited PTO 12w paternity

  • Design, develop, and maintain robust, scalable ETL/ELT data pipelines using Python, SQL, and data processing frameworks.
  • Implement data quality checks, monitoring, and alerting across all data pipelines to ensure data integrity and reliability.
  • Work closely with data analysts, data scientists, and business intelligence engineers to understand their data requirements and deliver reliable, high-quality data access.

InStride Health delivers specialty anxiety and OCD care. They focus on expanding access to insurance-based care, increasing engagement, and improving treatment outcomes by combining clinical care and innovative technology. They are a mission-driven company.

Canada

  • Be the Analytics Engineering lead within the Sales and Marketing organization.
  • Be the data steward for Sales and Marketing: architect and improve the collection of underlying data.
  • Develop and maintain robust data pipelines and workflows for data ingestion, processing, and transformation.

Reddit is a community of communities, built on shared interests, passion, and trust, and is home to the most open and authentic conversations on the internet. With 100,000+ active communities and millions of daily active unique visitors, Reddit is one of the internet’s largest sources of information.

$65,705–$87,606/yr
Canada

  • Design, build, and maintain scalable data infrastructure using modern cloud technologies.
  • Develop robust batch and streaming data pipelines to ingest, process, and serve data.
  • Contribute to the implementation of a modern data lakehouse architecture.

Jobgether uses an AI-powered matching process to ensure applications are reviewed quickly, objectively, and fairly. The system identifies the top-fitting candidates and shares this shortlist with the hiring company.

$190,000–$280,500/yr
US Canada

  • Architect and evolve scalable data ingestion and egress frameworks and pipelines that are well tested and offer strong data quality monitoring.
  • Architect and evolve our CI/CD processes, enhancing the testing environment and observability.
  • Enhance our Claude Code / LLM development support capabilities by creating tools, skills, and agents that give our LLMs more context and help us continually improve their ability to debug, create code, and maintain systems.

Life360’s mission is to keep people close to the ones they love. They have a mobile app, tracking devices, and a pet GPS tracker. Life360 has more than 500 (and growing!) remote-first employees and delivers peace of mind and enhances everyday family life.

United States

  • Lead the design and evolution of scalable financial data systems supporting commissions, incentives, and payments.
  • Build and maintain robust data pipelines using Python, SQL, Spark, and Terraform for accuracy and performance.
  • Define technical strategy and roadmap for financial data operations in collaboration with stakeholders.

Our partner is a fast-growing technology company building financial data infrastructure for insurance operations. They have a remote-friendly work environment and emphasize engineering excellence and cross-functional collaboration.

United States

  • Build and improve scalable, fault-tolerant, self-serve data infrastructure technologies to support ML and analytics workflows.
  • Own the Data Movement Platform for batch and stream data processing, and invest in building new infrastructure for Spark, Flink, and Airflow.
  • Collaborate with teammates on on-call responsibilities and monitoring/alerting to improve reliability, scalability, latency, and efficiency.

Reddit is a community of communities built on shared interests, passion, and trust, hosting the most open and authentic conversations on the internet. With over 100,000 active communities and approximately 126 million daily active unique visitors, Reddit is one of the internet's largest sources of information.

$130,000–$160,000/yr
US

  • Build, maintain, and run CI/CD pipelines and infrastructure-as-code for the Smile Digital Health platform.
  • Provision, configure, and operate cloud-based Spark clusters and distributed data processing environments.
  • Design and maintain scalable, secure infrastructure templates and deployment automation across cloud environments.

Smile Digital Health makes it easy for healthcare stakeholders to collect and exchange data with our leading FHIR-based data liberation platform. At its heart, the Smile platform enables people and organizations to better manage healthcare data; the company was #19 on Deloitte's Technology Fast 50 Ranking for 2024!

  • Design, build, and maintain scalable data pipelines using AWS Glue (PySpark) or equivalent orchestration and transformation tools.
  • Engineer and optimise the ClickHouse warehouse for sub-second query performance across all back-offices.
  • Implement data contracts between back-office and the platform.

Block Labs is a premier technology studio operating at the bleeding edge of Web3, Artificial Intelligence, and iGaming. We are a collective of senior engineers, product strategists, and builders who refuse to compromise on architecture.

$4,200–$5,200/mo
Global

  • Design, develop, and maintain scalable ETL/ELT data pipelines using Python.
  • Process and integrate data from multiple formats and sources (JSON, CSV, XML).
  • Build and manage data transformations and orchestration workflows using dbt and orchestration tools such as Airflow, Prefect, or Dagster.

$123,696–$254,667/yr
US

  • Design and implement robust data infrastructure in AWS, using Spark with Scala.
  • Evolve our core data pipelines to efficiently scale for our massive growth.
  • Store data in optimal engines and formats, matching your designs to our performance needs and cost factors.

tvScientific is the first and only CTV advertising platform purpose-built for performance marketers. Our solution combines media buying, optimization, measurement, and attribution in one, efficient platform. Our platform is built by industry leaders with a long history in programmatic advertising, digital media, and ad verification.

Global

  • Design and implement modern data platforms and scalable data pipelines to enable better data-driven decisions.
  • Develop and maintain ETL/ELT pipelines using SQL, Spark/PySpark, and Microsoft Fabric or Databricks.
  • Work closely with data architects, BI developers, and customer stakeholders in an Agile environment.

Tieto, through MentorMate, creates durable technical solutions that deliver digital transformation at scale by blending strategic insights and thoughtful design with brilliant engineering. The company provides its people with the opportunity to work on impactful, global projects for recognizable brands.

Canada

  • Design, build, and operate high-scale data ingestion and replication systems from production data stores into the data lakehouse.
  • Build and maintain reliable, scalable data platform infrastructure capable of handling petabytes of data across analytics, AI, and operational use cases.
  • Develop internal libraries, APIs, frameworks, and tooling in languages such as Go and Python to help teams move and access data safely.

Samsara is the pioneer of the Connected Operations Cloud, enabling organizations that depend on physical operations to harness IoT data for actionable insights. As a publicly traded company, Samsara fosters a growth-oriented culture and serves industries that represent over 40% of global GDP.

$105,000–$125,000/yr
US

  • Design, develop, and maintain ETL processes.
  • Collaborate with stakeholders to gather requirements.
  • Develop and optimize data pipelines for efficiency.

Mind Computing supports the Department of Veterans Affairs with data solutions. They are an equal opportunity employer and value diversity.

$160,000–$190,000/yr
US Canada Unlimited PTO

  • Own and maintain data pipeline architectures, ensuring reliability and monitoring.
  • Manage and evolve data modeling environments for analysts and engineers.
  • Implement observability for data systems, detecting issues early and continuously monitoring data quality.

Voltus unlocks the full value of distributed energy resources for customers and the grid. They are a fast-growing climate-tech company with a bright, gritty, and good team that values innovation, impact, and integrity.

US

  • Lead the design and evolution of the data platform architecture, establishing patterns and standards the team builds on.
  • Build and operate production-grade data pipelines that ingest and transform high-variance, real-world clinical data reliably and at scale.
  • Contribute to quarterly data product releases, working closely with product, clinical, and customer success teams to meet commitments.

Verantos is the market leader in high-accuracy real-world evidence (RWE) generation. The Verantos RWE platform integrates heterogeneous real-world data sources and generates evidence with the accuracy necessary for regulatory and reimbursement use, serving some of the largest biopharma companies globally.

$190,000–$210,000/yr
US Unlimited PTO

  • Lead, coach, and develop a team of analytics engineers and/or data engineers.
  • Ensure on-time delivery of client data integrations by owning enterprise data model standards and maintaining consistent, governed data definitions.
  • Oversee client data pipelines using modern tooling (dbt, Airflow, Snowflake, AWS, Python) to ensure reliable operation and uptime.

SmarterDx builds clinical AI that is transforming how hospitals translate care into payment. Founded by physicians in 2020, their platform connects clinical context with revenue intelligence, helping health systems recover millions in missed revenue, improve quality scores, and appeal every denial.

US

  • Lead workspace architecture, Unity Catalog governance, and cluster policy design for client tenant organizations.
  • Perform tenant discovery, requirements gathering, source profiling, and security classification for new data intake requests.
  • Develop end-to-end technical designs for tenant onboarding, including Data Sharing Agreements and SLA documentation.

M9 Solutions provides IT services and solutions to the Federal Government, mobilizing skilled people and technologies for improved performance and sustainable change. With 15+ years of proven delivery and growth, the company has been recognized as an Inc. 5000 Fastest-Growing Private Company multiple times and values diverse perspectives.

$86,400–$138,600/yr
US

  • Design, develop, and maintain scalable data pipelines and infrastructure.
  • Build and optimize data warehouses, databases, and data models.
  • Implement and maintain data governance and security practices.

Jobgether is a company that uses an AI-powered matching process to ensure applications are reviewed quickly, objectively, and fairly. They connect candidates with companies; their culture is collaborative and inclusive, focused on innovation and growth.