Vālenz® Health is the platform to simplify healthcare – the destination for employers, payers, providers and members to reduce costs, improve quality, and elevate the healthcare experience. The Valenz mindset and culture of innovation combine to create a distinctly different approach to an inefficient, uninspired health system. With fully integrated solutions, Valenz engages early and often to execute across the entire patient journey – from care navigation and management to payment integrity, plan performance and provider verification. With a 99% client retention rate, we elevate expectations to a new level of efficiency, effectiveness and transparency where smarter, better, faster healthcare is possible.
About This Opportunity:
As a Data Engineer III, you’ll be responsible for designing, building, and evolving scalable data systems that power analytics, product, and operational decision-making across the organization. You will operate as a senior individual contributor with end-to-end ownership of complex data initiatives, contributing directly to the architecture and evolution of our Databricks-based Lakehouse platform on Azure.
Things You’ll Do Here:
- Own the design and implementation of scalable, production-grade data pipelines using Databricks, PySpark, SQL, and Python.
- Operationalize machine learning workflows and feature pipelines.
- Own and deliver complex, cross-functional data initiatives end-to-end, from ingestion and data modeling through production deployment and ongoing monitoring.
- Design robust, reusable ETL frameworks using Delta Lake best practices (incremental processing, merge/upserts, schema evolution).
- Diagnose and resolve performance challenges in distributed Spark workloads (data skew, shuffle, memory pressure, inefficient execution plans).
- Build and enforce strong data quality practices, including validation frameworks, observability, and automated alerting.
- Design and evolve data models across medallion architecture layers to support analytics and downstream applications.
- Implement modern data ingestion patterns, including API-driven, event-based, and AI-assisted ingestion workflows.
- Partner with analytics, architecture, and engineering teams to support advanced data use cases, including feature engineering and emerging machine learning workflows.
- Evaluate and adopt new capabilities within Azure and Databricks (e.g., MLflow, Unity Catalog enhancements, platform optimizations) to improve scalability and developer productivity.
- Contribute to architectural decisions and platform standards, balancing short-term delivery with long-term maintainability.
- Write high-quality, well-tested, and maintainable code; lead by example through thoughtful code reviews.
- Act as a go-to resource for diagnosing and resolving complex production issues across systems.
- Mentor and elevate other engineers through collaboration, design discussions, and technical guidance.
- Perform other duties as assigned.
Reasonable accommodation may be made to enable individuals with disabilities to perform essential duties.
What You’ll Bring to the Team:
- 4+ years of experience in data engineering or a related field, with a track record of delivering production-grade data systems
- Strong hands-on experience with Databricks, Spark/PySpark, and distributed data processing at scale
- Deep understanding of Delta Lake and modern Lakehouse architecture patterns
- Proficiency in Python and SQL for large-scale data transformation and performance optimization
- Proven experience building incremental, idempotent, and highly reliable data pipelines
- Strong experience diagnosing and optimizing Spark workloads (partitioning strategies, AQE, caching, file sizing, query tuning)
- Experience designing data models for analytics and downstream consumption (medallion architecture, dimensional modeling, or similar)
- Experience implementing data quality, validation, and observability frameworks in production environments
- Familiarity with CI/CD, version control, and modern DataOps practices
- Experience supporting or integrating with machine learning workflows (feature pipelines, model inputs/outputs, or ML lifecycle support)
- Familiarity with AI/ML concepts as applied to data engineering (intelligent ingestion, anomaly detection, automation)
- Demonstrated ability to evaluate and adopt new technologies within cloud ecosystems (Azure, Databricks)
- Strong communication skills and ability to collaborate with both technical and non-technical stakeholders
A plus if you have…
- Familiarity with event-driven architectures (e.g., streaming, message queues, or event hubs)
- Experience working with healthcare data (claims, eligibility, provider, or clinical datasets)
Where You’ll Work: This is a fully remote position, and we’ll provide all the necessary equipment!
- Work Environment: You’ll need a quiet workspace that is free from distractions.
- Technology: A reliable internet connection; if you can use streaming services without issues, your connection will work well for this role.