Data Engineer III

Valenz
Phoenix, US
Remote

Job Description

Vālenz® Health is the platform to simplify healthcare – the destination for employers, payers, providers and members to reduce costs, improve quality, and elevate the healthcare experience. The Valenz mindset and culture of innovation combine to create a distinctly different approach to an inefficient, uninspired health system. With fully integrated solutions, Valenz engages early and often to execute across the entire patient journey – from care navigation and management to payment integrity, plan performance and provider verification. With a 99% client retention rate, we elevate expectations to a new level of efficiency, effectiveness and transparency where smarter, better, faster healthcare is possible.

About This Opportunity:

As a Data Engineer III, you’ll be responsible for designing, building, and evolving scalable data systems that power analytics, product, and operational decision-making across the organization. You will operate as a senior individual contributor with end-to-end ownership of complex data initiatives, contributing directly to the architecture and evolution of our Databricks-based Lakehouse platform on Azure.

Things You’ll Do Here:

  • Own the design and implementation of scalable, production-grade data pipelines using Databricks, PySpark, SQL, and Python.
  • Operationalize machine learning workflows and feature pipelines.
  • Own and deliver complex, cross-functional data initiatives end-to-end, from ingestion and data modeling through production deployment and ongoing monitoring.
  • Design robust, reusable ETL frameworks using Delta Lake best practices (incremental processing, merge/upserts, schema evolution).
  • Diagnose and resolve performance challenges in distributed Spark workloads (data skew, shuffle, memory pressure, inefficient execution plans).
  • Build and enforce strong data quality practices, including validation frameworks, observability, and automated alerting.
  • Design and evolve data models across medallion architecture layers to support analytics and downstream applications.
  • Implement modern data ingestion patterns, including API-driven, event-based, and AI-assisted ingestion workflows.
  • Partner with analytics, architecture, and engineering teams to support advanced data use cases, including feature engineering and emerging machine learning workflows.
  • Evaluate and adopt new capabilities within Azure and Databricks (e.g., MLflow, Unity Catalog enhancements, platform optimizations) to improve scalability and developer productivity.
  • Contribute to architectural decisions and platform standards, balancing short-term delivery with long-term maintainability.
  • Write high-quality, well-tested, and maintainable code; lead by example through thoughtful code reviews.
  • Act as a go-to resource for diagnosing and resolving complex production issues across systems.
  • Mentor and elevate other engineers through collaboration, design discussions, and technical guidance.
  • Perform other duties as assigned.

Reasonable accommodation may be made to enable individuals with disabilities to perform essential duties.

What You’ll Bring to the Team:

  • 4+ years of experience in data engineering or a related field, with a track record of delivering production-grade data systems
  • Strong hands-on experience with Databricks, Spark/PySpark, and distributed data processing at scale
  • Deep understanding of Delta Lake and modern Lakehouse architecture patterns
  • Proficiency in Python and SQL for large-scale data transformation and performance optimization
  • Proven experience building incremental, idempotent, and highly reliable data pipelines
  • Strong experience diagnosing and optimizing Spark workloads (partitioning strategies, AQE, caching, file sizing, query tuning)
  • Experience designing data models for analytics and downstream consumption (medallion architecture, dimensional modeling, or similar)
  • Experience implementing data quality, validation, and observability frameworks in production environments
  • Familiarity with CI/CD, version control, and modern DataOps practices
  • Experience supporting or integrating with machine learning workflows (feature pipelines, model inputs/outputs, or ML lifecycle support)
  • Familiarity with AI/ML concepts as applied to data engineering (intelligent ingestion, anomaly detection, automation)
  • Demonstrated ability to evaluate and adopt new technologies within cloud ecosystems (Azure, Databricks)
  • Strong communication skills and ability to collaborate with both technical and non-technical stakeholders

A plus if you have…

  • Familiarity with event-driven architectures (e.g., streaming, message queues, or event hubs)
  • Experience working with healthcare data (claims, eligibility, provider, or clinical datasets)

Where You’ll Work: This is a fully remote position, and we’ll provide all the necessary equipment!

  • Work Environment: You’ll need a quiet workspace that is free from distractions.
  • Technology: A reliable internet connection; if you can comfortably use streaming services, your connection is likely sufficient.

Skills & Requirements

Technical Skills

Databricks, PySpark, SQL, Python

Employment Type

Full Time

Level

Senior

Posted

4/10/2026
