About Realm of Caring
Realm of Caring (RoC) is a nonprofit advancing cannabinoid research and education. We generate real-world evidence to improve health outcomes through rigorous data science and translational research.
Position Summary
We’re hiring a part-time Research Data Scientist to lead end-to-end preparation of complex, large-scale health datasets for peer-reviewed publication. This role centers on cleaning, harmonizing, and structuring messy, multi-source datasets, followed by advanced statistical analysis and machine learning to generate publishable insights.
You’ll work with survey, observational, and real-world health data, building reproducible analytical workflows that meet academic research standards. This role is best suited for a PhD-trained data scientist or quantitative researcher with deep experience in machine learning, advanced statistics, and real-world data analysis.
Key Responsibilities
- Data Cleaning & Harmonization
  - Clean, normalize, and integrate messy datasets from multiple sources (e.g., survey data from longitudinal studies)
  - Resolve inconsistencies and schema mismatches across datasets
  - Design scalable approaches to dataset harmonization for cross-study comparability
- Data Pipeline Development
  - Build and maintain reproducible data processing workflows for large-scale datasets
  - Structure datasets for downstream statistical modeling and publication-ready outputs
  - Implement version-controlled workflows for data processing and analysis
- Statistical Analysis & Machine Learning
  - Apply advanced statistical methods (e.g., mixed-effects models, causal inference, longitudinal modeling)
  - Develop, validate, and interpret machine learning models for large-scale observational data as needed
  - Ensure methodological rigor aligned with peer-reviewed research standards
  - Partner with researchers to refine hypotheses, define analytic strategies, and interpret findings
  - Translate complex analyses into clear, defensible results for academic publication
- Reproducibility & Publication Support
  - Develop reproducible codebases and documentation (e.g., notebooks, pipelines)
  - Prepare datasets, figures, and statistical outputs for manuscripts, abstracts, and reports
  - Contribute to methodological transparency and auditability of analyses
  - Write publication-ready technical text (e.g., Methods and Results sections); strong technical writing ability is required
Qualifications
- PhD (preferred) in Data Science, Statistics, Biostatistics, Epidemiology, Computer Science, Experimental Psychology, or a related quantitative field
- 3–5+ years of experience working with large, complex datasets in research, healthcare, or applied data science
- Strong expertise in data cleaning, preprocessing, and dataset harmonization at scale
- Advanced proficiency in Python or R (e.g., pandas, tidyverse, scikit-learn, statsmodels), or equivalent experience with related software or programming languages
- Deep experience with machine learning and advanced statistical methods
- Strong foundation in reproducible research practices
- Ability to communicate technical findings clearly to interdisciplinary teams and to collaborate with team members on high-quality publications
Preferred
- Prior experience preparing analyses for peer-reviewed publication
- Familiarity with survey data (Qualtrics, REDCap) and/or healthcare data standards (FHIR)
- Background in public health, epidemiology, or biostatistics
- Experience with causal inference, longitudinal analysis, or real-world evidence studies
- Experience working with messy, real-world observational datasets across multiple sources
- Familiarity with cloud or distributed data tools (AWS, GCP, or Spark)
- Background in, or familiarity with, cannabinoid research
Contract Details
- Part-time: 10–20 hours/week
- 1-year contract with strong potential for renewal or full-time transition
- Flexible, output-driven work environment
Why This Role
- Lead analysis on large real-world datasets used in peer-reviewed research
- Opportunity to co-author manuscripts
- Work fully remotely and set your own schedule (some scheduled meetings are required)
- High ownership over data quality, methodology, and analytical rigor
- Opportunity to shape foundational datasets in an emerging research field and work with interdisciplinary teams across several renowned academic and scientific institutions
Pay: $45.00 - $50.00 per hour
Work Location: Remote