Chenyin Gao

About

I am a Postdoctoral Research Fellow in the Department of Biostatistics at Harvard University, working with Dr. Rui Duan. I obtained my Ph.D. in Statistics from North Carolina State University in 2024, advised by Dr. Shu Yang. Before that, I obtained my Bachelor's degree in Statistics from Sun Yat-sen University in 2019.


Email: cgao@hsph.harvard.edu

Interests

  • Causal inference and missing data analyses.

  • Survey sampling.

  • Tensor analysis and deep learning.

Honors

  • Student Paper Award, Lifetime Data Science, ASA, 2025

  • MBB Interdisciplinary Mind Grant, Harvard, 2024

  • Student Paper Award, ICSA, 2024

  • Paige Plagge Graduate Award for Citizenship, NCSU, 2024

  • Best Poster Award, DISS, 2024

  • Student and Early-Career Travel Award, JSM, 2023

  • Chinese National Scholarship, 2018

Internship & Training

Publications

Statistics

  1. Doubly robust omnibus sensitivity analysis of externally controlled trials with intercurrent events [arXiv]
    C. Gao, X. Zhang, S. Yang (2025), Biometrics, accepted.

  2. Real effect or bias? Best practices for evaluating the robustness of real-world evidence through quantitative sensitivity analysis for unmeasured confounding [arXiv]
    D. Faries, C. Gao, X. Zhang, C. Hazlett, J. Stamey, S. Yang, et al. (2024), Pharmaceutical Statistics, DOI:10.1002/pst.2457.

  3. Improving randomized controlled trial analysis with data-adaptive borrowing [arXiv]
    C. Gao, S. Yang, M. Shan, W. Ye, I. Lipkovich, and D. Faries (2024), Biometrika, accepted.

    ** Winner of the 2024 ICSA Student Paper Award

    ** Winner of the 2024 DISS Poster Contest

  4. Estimating spatially varying health effects in app-based citizen science research [arXiv]
    L. Wu*, C. Gao*, S. Yang, B. J. Reich, and A. Rappold (2024), Journal of the Royal Statistical Society: Series C.

    * equal contribution

    ** Winner of the 2021 ASA Section on Statistics in Epidemiology Young Investigator Award

    ** Winner of the IBM Student Research Award from the 34th New England Statistics Symposium

  5. Causal customer churn analysis with low-rank tensor block hazard model [arXiv]
    C. Gao, Z. Zhang, and S. Yang (2024), International Conference on Machine Learning.

  6. Transporting survival of an HIV clinical trial to the external target populations
    D. Lee, C. Gao, S. Ghosh, and S. Yang (2024), Journal of Biopharmaceutical Statistics, DOI:10.1080/10543406.2024.2330216.

  7. Pretest estimation in combining probability and non-probability samples [arXiv]
    C. Gao and S. Yang (2023), Electronic Journal of Statistics, 17 (1), 1492-1546.

  8. Soft calibration for correcting selection bias under mixed-effects models [arXiv]
    C. Gao, S. Yang, and J. K. Kim (2023), Biometrika, 110 (4), 897-911.

  9. Elastic integrative analysis of randomized trial and real-world data for treatment heterogeneity estimation [arXiv]
    S. Yang, C. Gao, X. Wang, and D. Zeng (2023), Journal of the Royal Statistical Society: Series B, 85 (3), 575-596.

  10. Nearest neighbor ratio imputation with incomplete multinomial outcome in survey sampling [arXiv]
    C. Gao, K. J. Thompson, S. Yang, and J. K. Kim (2022), Journal of the Royal Statistical Society: Series A, 185 (4), 1903-1930.

Collaborative Research

  1. Enhancing convolutional neural network generalizability via low-rank weight approximation
    C. Gao, S. Yang, A.R. Zhang (2024), IET Image Processing, DOI:10.1049/ipr2.13205.

  2. Where does the risk lie? Systemic risk and tail risk networks in the Chinese financial market
    Y. Deng, C. Gao (2022), Pacific Economic Review, 28 (2), 167-190.

  3. Advanced trophectoderm quality increases the risk of a large for gestational age baby in single frozen-thawed blastocyst transfer cycles
    Q. Xie, T. Du, M. Zhao, C. Gao, Q. Lyu, L. Suo, Y. Kuang (2021), Human Reproduction, 36 (8), 2111-2120.

Technical Reports

  1. Causal inference on sequential treatments via tensor completion [arXiv]
    C. Gao, A.R. Zhang, and S. Yang (202x), submitted.

  2. Evaluation of machine learning approaches for estimating optimal individualized treatment regimens for time-to-event outcomes in observational studies [arXiv]
    I. Lipkovich, Z. Kadziola, C. Gao, D. Wang, D. Faries (202x), submitted.

  3. Doubly protected estimation for survival outcomes utilizing external controls for randomized clinical trials [arXiv]
    C. Gao, S. Yang, M. Shan, W. Ye, I. Lipkovich, and D. Faries (2024), submitted.

    ** Winner of the 2025 ASA Section on Lifetime Data Science Student Paper Award

  4. On the Role of Surrogates in Conformal Inference of Individual Causal Effects [arXiv]
    C. Gao, P. B. Gilbert, and L. Han (2024), submitted.

  5. Unsupervised Ensemble Learning for Efficient Integration of Pre-trained Polygenic Risk Scores [arXiv]
    C. Gao, J. D. Tubbs, Y. Han, M. Guo, S. Li, E. Ma, D. Luo, J. W. Smoller, P. H. Lee, and R. Duan (2025), submitted.

Software

R packages for integrative analysis:

  • ElasticIntegrative implements a test-based analysis of heterogeneous treatment effects that combines trial and real-world data [arXiv]
  • SelectiveIntegrative implements a dynamically penalized borrowing framework that incorporates information from external-control (EC) datasets into gold-standard randomized trials [arXiv]

Python codes for tensor completion for causal analysis:

Service