The Data Engineer role is a pivotal position within the Enterprise Data Engineering & Analytics Department, supporting the design, build, and operationalization of integrated data pipelines and analytics solutions that enable MD Anderson's digital business initiatives. The Data Engineer works across the Context Engine framework to deliver end-to-end data engineering solutions while partnering closely with Enterprise Data Engineering & Analytics teams and other institutional stakeholders.
The Data Engineer contributes to the mission of MD Anderson Cancer Center, a leading institution focused on cancer care, research, education, and prevention. In this role, the Data Engineer helps advance enterprise analytics capabilities by delivering secure, governed, and reusable data assets that accelerate insights and improve time-to-solution across MD Anderson.
Ideal Candidate Statement
The ideal candidate for the Data Engineer role holds a bachelor's degree in computer science, with advanced education in analytics or computer science preferred. The candidate brings hands-on experience building data pipelines in healthcare or research environments and hands-on use of Large Language Models (LLMs) in real-world projects, along with familiarity with modern cloud-based data platforms, Python or Spark development, and analytics delivery. Epic data model exposure or certification and the ability to collaborate across technical and clinical teams are strongly preferred.
Position Information
Salary range based on a 40-hour work week: Minimum $106,500 - Midpoint $133,000 - Maximum $159,500
Work location: Houston, Texas or surrounding area preferred
This Data Engineer role offers the opportunity to contribute directly to MD Anderson's mission by enabling high-quality, governed data that supports clinical, research, and operational analytics across the institution. The position provides exposure to enterprise-scale data engineering initiatives, collaboration with experienced engineering and data science professionals, and opportunities for continued learning and career growth while supporting a balanced and sustainable work environment.
- Employer-paid medical coverage starting day one for employees working 30+ hours/week, plus optional group dental, vision, life, AD&D, and disability insurance.
- Accruals for PTO and Extended Illness Bank, plus paid holidays, wellness, childcare, and other leave options.
- Tuition Assistance Program after six months of service and access to extensive wellness, fitness, and employee resource groups.
- Defined-benefit pension through the Teachers Retirement System, voluntary retirement plans, and employer-paid life and reduced salary protection programs.
Responsibilities
Data Engineering - End-to-End Solution Delivery
- Participate in end-to-end solution delivery that increases information capabilities and realizes data value across the institution
- Build and test end-to-end data pipelines across ingestion, curation, transformation, modeling, and consumption within the Context Engine framework
- Integrate data governance processes across data provenance, security, data quality, ontology, and metadata management
- Participate in planning, architecture, analysis, design, and build of data pipelines in partnership with IS, Data Offices, and Data Governance teams
- Contribute to existing data pipelines spanning acquisition, integration, and consumption for defined use cases
Data Curation, Modeling, and Governance
- Build data curation pipelines including profiling, specification creation, cleansing, transforming, standardizing, mastering, harmonizing, validating, and aggregating data
- Monitor and support data quality across the Context Engine
- Incorporate repeatable solution designs and data models to support reuse and scalability
- Promote effective data management practices and understanding of analytics across the enterprise
Standards, Testing, and System Maintenance
- Adhere to IS division standard operating procedures and all MD Anderson policies
- Maintain build standards and governance oversight sign-off aligned with institutional data strategy
- Participate in documentation preparation for enhancements or new technology
- Perform quality control, testing, and peer review of analytics builds
- Support system updates, releases, change control processes, and after-hours support as required
Education, Training, and Collaboration
- Train data scientists, analysts, end users, and data consumers on data pipelining and preparation techniques
- Assist in establishing training plans and curricula for Context Engine tools
- Provide institutional, department, and one-on-one training on Enterprise Data Engineering & Analytics (EDEA) deliverables
- Support liaison relationships with customers and OneIS partners to deliver effective technical solutions
Innovation and Continuous Improvement
- Explore and promote modern tools, techniques, and architectures to automate data preparation and integration tasks
- Improve productivity by reducing manual and error-prone processes
- Model On