Workshop in Biostatistics: Communication-efficient distributed estimation and inference for Cox's model
MSOB Room X303
ABSTRACT: Motivated by multi-center biomedical studies that cannot share individual data due to privacy and ownership concerns, we develop communication-efficient iterative distributed algorithms for estimation and inference in the high-dimensional sparse Cox proportional hazards model. We demonstrate that our estimator, with a relatively small number of iterations, achieves the same convergence rate as the ideal full-sample estimator under very mild conditions. To construct confidence intervals for linear combinations of high-dimensional hazard regression coefficients, we introduce a novel debiased method, establish central limit theorems, and provide consistent variance estimators that yield asymptotically valid distributed confidence intervals. In addition, we provide valid and powerful distributed hypothesis tests for any individual coordinate of the coefficient vector, based on a decorrelated score test. We allow time-dependent covariates as well as censored survival times. Extensive numerical experiments on both simulated and real data lend further support to our theory and demonstrate that our communication-efficient distributed estimators, confidence intervals, and hypothesis tests improve upon alternative methods. (Joint work with Pierre Bayle and Zhipeng Lou.)
Website: https://fan.princeton.edu/