Vālenz® Health is the platform to simplify healthcare – the destination for employers, payers, providers and members to reduce costs, improve quality, and elevate the healthcare experience. The Valenz mindset and culture of innovation combine to create a distinctly different approach to an inefficient, uninspired health system. With fully integrated solutions, Valenz engages early and often to execute across the entire patient journey – from care navigation and management to payment integrity, plan performance and provider verification. With a 99% client retention rate, we elevate expectations to a new level of efficiency, effectiveness and transparency where smarter, better, faster healthcare is possible.
About This Opportunity:
As a Data Engineer III, you’ll be responsible for designing, building, and evolving scalable data systems that power analytics, product, and operational decision-making across the organization. You will operate as a senior individual contributor with end-to-end ownership of complex data initiatives, contributing directly to the architecture and evolution of our Databricks-based Lakehouse platform on Azure.
Things You’ll Do Here:
- Own the design and implementation of scalable, production-grade data pipelines using Databricks, PySpark, SQL, and Python.
- Operationalize machine learning workflows and feature pipelines.
- Own and deliver complex, cross-functional data initiatives end-to-end, from ingestion and data modeling through production deployment and ongoing monitoring.
- Design robust, reusable ETL frameworks using Delta Lake best practices (incremental processing, merge/upserts, schema evolution).
- Diagnose and resolve performance challenges in distributed Spark workloads (data skew, shuffle, memory pressure, inefficient execution plans).
- Build and enforce strong data quality practices, including validation frameworks, observability, and automated alerting.
- Design and evolve data models across medallion architecture layers to support analytics and downstream applications.
- Implement modern data ingestion patterns, including API-driven, event-based, and AI-assisted ingestion workflows.
- Partner with analytics, architecture, and engineering teams to support advanced data use cases, including feature engineering and emerging machine learning workflows.
- Evaluate and adopt new capabilities within Azure and Databricks (e.g., MLflow, Unity Catalog enhancements, platform optimizations) to improve scalability and developer productivity.
- Contribute to architectural decisions and platform standards, balancing short-term delivery with long-term maintainability.
- Write high-quality, well-tested, and maintainable code; lead by example through thoughtful code reviews.
- Act as a go-to resource for diagnosing and resolving complex production issues across systems.
- Mentor and elevate other engineers through collaboration, design discussions, and technical guidance.
- Perform other duties as assigned.
Reasonable accommodation may be made to enable individuals with disabilities to perform essential duties.
What You’ll Bring to the Team:
- 4+ years of experience in data engineering or a related field, with a track record of delivering production-grade data systems
- Strong hands-on experience with Databricks, Spark/PySpark, and distributed data processing at scale
- Deep understanding of Delta Lake and modern Lakehouse architecture patterns
- Proficiency in Python and SQL for large-scale data transformation and performance optimization
- Proven experience building incremental, idempotent, and highly reliable data pipelines
- Strong experience diagnosing and optimizing Spark workloads (partitioning strategies, AQE, caching, file sizing, query tuning)
- Experience designing data models for analytics and downstream consumption (medallion architecture, dimensional modeling, or similar)
- Experience implementing data quality, validation, and observability frameworks in production environments
- Familiarity with CI/CD, version control, and modern DataOps practices
- Experience supporting or integrating with machine learning workflows (feature pipelines, model inputs/outputs, or ML lifecycle support)
- Familiarity with AI/ML concepts as applied to data engineering (intelligent ingestion, anomaly detection, automation)
- Demonstrated ability to evaluate and adopt new technologies within cloud ecosystems (Azure, Databricks)
- Strong communication skills and ability to collaborate with both technical and non-technical stakeholders
A plus if you have…
- Familiarity with event-driven architectures (e.g., streaming, message queues, or event hubs)
- Experience working with healthcare data (claims, eligibility, provider, or clinical datasets)
Where You’ll Work: This is a fully remote position, and we’ll provide all the necessary equipment!
- Work Environment: You’ll need a quiet workspace that is free from distractions.
- Technology: A reliable internet connection; if you can use streaming services without issues, your connection will work well for this role.