Data Engineer – Data Lake Migration

Jobs via Dice
Dallas, US
On-site

Job Description

Dice is the leading career destination for tech experts at every stage of their careers. Our client, Value Technology Inc, is seeking the following. Apply via Dice today!

Job Title: Data Engineer – Data Lake Migration

Location: Dallas, TX (Onsite)

Experience: 8-10 years

Role Summary

Value Technology is seeking a Data Engineer to join a high-impact datastore migration initiative focused on migrating data from on-premises Data Lakes to an AWS-based Lakehouse architecture. This role involves end-to-end data pipeline migration, transformation of legacy consumption patterns, and ensuring data quality and integrity across modern data platforms.

Key Responsibilities

Data Migration & Pipeline Engineering

  • Refactor and migrate data pipelines, extraction logic, and job scheduling from legacy systems to modern Lakehouse architecture.
  • Execute large-scale data transfers ensuring accuracy, completeness, and consistency.
  • Work with file formats such as JSON, Avro, and Parquet for efficient data processing.

Consumption Pattern Migration

  • Convert and optimize legacy SQL and Apache Spark-based workloads for modern platforms.
  • Migrate and adapt datasets to Snowflake and Apache Iceberg environments.
  • Analyze existing data usage patterns to design optimized data delivery solutions.

Data Quality & Reconciliation

  • Perform data validation and reconciliation to ensure migrated data matches production standards.
  • Build and utilize reconciliation frameworks to validate data correctness and completeness.

Stakeholder Collaboration

  • Act as a technical liaison between engineering teams and business stakeholders.
  • Drive data hand-off and sign-off processes, ensuring alignment with business requirements.
  • Provide regular updates and participate in stakeholder discussions.

Platform & Integration

  • Collaborate with internal data platform teams to adopt new tools, workflows, and frameworks.
  • Work with distributed systems and data storage frameworks such as Hadoop (HDFS/Hive).

Required Technical Skills

Programming & Data Processing

  • Strong proficiency in Python or Java
  • Hands-on experience with Apache Spark
  • Strong expertise in ANSI SQL

Data Platforms & Tools

  • Experience with:
      • Snowflake
      • Apache Iceberg
      • Hadoop (HDFS, Hive)
      • Kafka (data streaming)
      • Sybase IQ

Data Formats & Integration

  • Knowledge of JSON, Avro, Parquet
  • Experience with data ingestion mechanisms (e.g., FTP)

DevOps & Methodology

  • Strong understanding of SDLC and CI/CD practices
  • Experience with Kubernetes (K8s) deployments

Core Data Engineering Competencies

  • Temporal Data Modeling: Handling historical data and slowly changing dimensions (e.g., SCD Type 2)
  • Schema Management: Schema evolution and enforcement (especially with Iceberg)
  • Performance Optimization: Partitioning, clustering, and query tuning
  • Data Architecture:
      • Normalization vs. denormalization
      • Natural vs. surrogate keys

Additional Skills

  • Strong troubleshooting and debugging skills (SQL & pipelines)
  • Ability to quickly learn new tools, frameworks, and workflows
  • Experience working with large-scale, distributed data systems

Skills & Requirements

Technical Skills

Python, Java, Apache Spark, ANSI SQL, Snowflake, Apache Iceberg, Hadoop, Kafka, Sybase IQ, JSON, Avro, Parquet, Kubernetes, Data Engineering

Employment Type

Full-time

Level

Senior

Posted

4/21/2026
