Design and analysis of two-phase studies with multivariate longitudinal data.

  • Additional Information
    • Source:
      Publisher: Oxford University Press Country of Publication: United States NLM ID: 0370625 Publication Model: Print-Electronic Cited Medium: Internet ISSN: 1541-0420 (Electronic) Linking ISSN: 0006-341X NLM ISO Abbreviation: Biometrics Subsets: MEDLINE
    • Publication Information:
      Publication: March 2024- : [Oxford] : Oxford University Press
      Original Publication: Alexandria Va : Biometric Society
    • Subject:
    • Abstract:
      Two-phase studies are crucial when outcome and covariate data are available in a first-phase sample (e.g., a cohort study), but costs associated with retrospective ascertainment of a novel exposure limit the size of the second-phase sample, in whom the exposure is collected. For longitudinal outcomes, one class of two-phase studies stratifies subjects based on an outcome vector summary (e.g., an average or a slope over time) and oversamples subjects in the extreme value strata while undersampling subjects in the medium-value stratum. Based on the choice of the summary, two-phase studies for longitudinal data can increase efficiency of time-varying and/or time-fixed exposure parameter estimates. In this manuscript, we extend efficient, two-phase study designs to multivariate longitudinal continuous outcomes, and we detail two analysis approaches. The first approach is a multiple imputation analysis that combines complete data from subjects selected for phase two with the incomplete data from those not selected. The second approach is a conditional maximum likelihood analysis that is intended for applications where only data from subjects selected for phase two are available. Importantly, we show that both approaches can be applied to secondary analyses of previously conducted two-phase studies. We examine finite sample operating characteristics of the two approaches and use the Lung Health Study (Connett et al. (1993), Controlled Clinical Trials, 14, 3S-19S) to examine genetic associations with lung function decline over time.
      (© 2022 The International Biometric Society.)
    • References:
      Bjørnland, T., Bye, A., Ryeng, E., Wisløff, U. and Langaas, M. (2018) Powerful extreme phenotype sampling designs and score tests for genetic association studies. Statistics in Medicine, 37, 4234-4251.
      Breslow, N. and Chatterjee, N. (1999) Design and analysis of two-phase studies with binary outcome applied to Wilms tumour prognosis. Journal of the Royal Statistical Society, Series C, 48, 457-468.
      Chatterjee, N., Chen, Y. and Breslow, N. (2003) A pseudoscore estimator for regression problems with two-phase sampling. Journal of the American Statistical Association, 98, 158-168.
      Connett, J., Kusek, J., Bailey, W., O'Hara, P. and Wu, M. (1993) Design of the Lung Health Study: A randomized clinical trial of early intervention for chronic obstructive pulmonary disease. Controlled Clinical Trials, 14, 3S-19S.
      Derkach, A., Lawless, J. F. and Sun, L. (2015) Score tests for association under response-dependent sampling designs for expensive covariates. Biometrika, 102, 988-994.
      Hansel, N., Ruczinski, I., Rafaels, N., Sin, D., Daley, D., Malinina, A., et al. (2013) Genome-wide study identifies two loci associated with lung function decline in mild to moderate COPD. Human Genetics, 132, 79-90.
      Harrell, F. (2016) Regression Modeling Strategies with Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis, 2nd edition. Switzerland: Springer International Publishing.
      Holt, D., Smith, T. and Winter, P. (1980) Regression analysis of data from complex surveys. Journal of the Royal Statistical Society, Series A, 143, 474-487.
      Lawless, J., Kalbfleisch, J. and Wild, C. (1999) Semiparametric methods for response-selective and missing data problems in regression. Journal of the Royal Statistical Society, Series B, 61, 413-438.
      Lee, A., McMurchy, L. and Scott, A. (1997) Re-using data from case-control studies. Statistics in Medicine, 16, 1377-1389.
      Lin, D. and Zeng, D. (2009) Proper analysis of secondary phenotype data in case-control association studies. Genetic Epidemiology, 33, 256-265.
      Lin, D., Zeng, D. and Tang, Z. (2013) Quantitative trait analysis in sequencing studies under trait-dependent sampling. Proceedings of the National Academy of Sciences of the United States of America, 110, 12247-12252.
      Lin, H., Wang, M., Brody, J., Bis, J., Dupuis, J., Lumley, T., McKnight, B., et al. (2014) Strategies to design and analyze targeted sequencing data: Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium Targeted Sequencing Study. Circulation: Cardiovascular Genetics, 7, 335-343.
      Pan, Y., Cai, J., Longnecker, M. and Zhou, H. (2018) Secondary outcome analysis for data from an outcome-dependent sampling design. Statistics in Medicine, 37, 2321-2337.
      Prentice, R. and Pyke, R. (1979) Logistic disease incidence models and case-control studies. Biometrika, 66, 403-411.
      Rubin, D. (1976) Inference and missing data. Biometrika, 63, 581-590.
      Schildcrout, J., Garbett, S. and Heagerty, P. (2013) Outcome vector dependent sampling with longitudinal continuous response data: stratified sampling based on summary statistics. Biometrics, 69, 405-416.
      Schildcrout, J., Haneuse, S., Tao, R., Zelnick, L., Schisterman, E., Garbett, S., et al. (2020) Two-phase, generalized case-control designs for quantitative longitudinal outcomes. American Journal of Epidemiology, 189, 81-90.
      Schildcrout, J., Rathouz, P., Zelnick, L., Garbett, S. and Heagerty, P. (2015) Biased sampling design to improve research efficiency: factors influencing pulmonary function over time in children with asthma. Annals of Applied Statistics, 9, 731-753.
      Song, R., Zhou, H. and Kosorok, M. (2009) A note on semiparametric efficient inference for two-stage outcome-dependent sampling with a continuous outcome. Biometrika, 96, 221-228.
      Sun, Z., Mukherjee, B., Estes, J., Vokonas, P. and Park, S. (2017) Exposure enriched outcome dependent designs for longitudinal studies of gene-environment interaction. Statistics in Medicine, 36, 2947-2960.
      Tao, R., Zeng, D., Franceschini, N., North, K., Boerwinkle, E. and Lin, D. (2015) Analysis of sequence data under multivariate trait-dependent sampling. Journal of the American Statistical Association, 110, 560-572.
      Tao, R., Zeng, D. and Lin, D. (2017) Efficient semiparametric inference under two-phase sampling, with applications to genetic association studies. Journal of the American Statistical Association, 112, 1468-1476.
      Tao, R., Zeng, D. and Lin, D. (2020) Optimal designs of two-phase studies. Journal of the American Statistical Association, 115, 1946-1959.
      Weaver, M. and Zhou, H. (2005) An estimated likelihood method for continuous outcome regression models with outcome-dependent sampling. Journal of the American Statistical Association, 100, 459-469.
      White, E. (1982) A two stage design for the study of the relationship between a rare exposure and a rare disease. American Journal of Epidemiology, 115, 119-128.
      White, I., Royston, P. and Wood, A. (2011) Multiple imputation using chained equations: issues and guidance for practice. Statistics in Medicine, 30, 377-399.
      Zelnick, L., Schildcrout, J. and Heagerty, P. (2018) Likelihood-based analysis of outcome-dependent sampling designs with longitudinal data. Statistics in Medicine, 37, 2120-2133.
      Zhou, H., Chen, J., Rissanen, T., Korrick, S., Hu, H., Salonen, J., et al. (2007) An efficient sampling and inference procedure for studies with a continuous outcome. Epidemiology, 18, 461-468.
      Zhou, H., Weaver, M., Qin, J., Longnecker, M. and Wang, M.C. (2002) A semiparametric empirical likelihood method for data from an outcome-dependent sampling scheme with a continuous outcome. Biometrics, 58, 413-421.
    • Grant Information:
      R01 HL094786 United States HL NHLBI NIH HHS
    • Contributed Indexing:
      Keywords: Lung Health Study; ascertainment corrected maximum likelihood; missing data; multiple imputation; outcome-dependent sampling; secondary outcome analysis
    • Subject:
      Date Created: 20220111 Date Completed: 20230621 Latest Revision: 20230701
    • Subject:
      20231215
    • Identifier:
      PMC9392467
    • Identifier:
      10.1111/biom.13616
    • Identifier:
      35014029
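
Illustrative sketch (not part of the catalog record or the authors' software): the abstract describes a design that stratifies subjects on a summary of the outcome vector, such as a slope over time, and oversamples the extreme strata. The Python below is a minimal, single-outcome illustration of that idea on simulated data. The function names, the 10th/90th percentile cutoffs, the sampling probabilities, and the simulation settings are all assumptions chosen for illustration; the article itself concerns multivariate outcomes and does not prescribe these values here.

```python
import numpy as np

def subject_slopes(times, y):
    """Per-subject OLS slope of the outcome on time.

    times: (n_obs,) visit times shared by all subjects (an assumption).
    y:     (n_subjects, n_obs) outcome matrix.
    """
    t_centered = times - times.mean()
    # Because t_centered sums to zero, this equals the usual OLS slope.
    return (y @ t_centered) / (t_centered @ t_centered)

def two_phase_sample(summary, cutoffs=(0.1, 0.9), probs=(1.0, 0.1, 1.0), rng=None):
    """Assign strata from a summary statistic and draw the phase-two sample.

    cutoffs: quantiles separating low / middle / high strata (hypothetical).
    probs:   sampling probability for each stratum (hypothetical).
    """
    rng = np.random.default_rng() if rng is None else rng
    lo, hi = np.quantile(summary, cutoffs)
    stratum = np.where(summary < lo, 0, np.where(summary > hi, 2, 1))
    p = np.asarray(probs)[stratum]
    selected = rng.random(summary.shape[0]) < p
    return stratum, selected

# Simulated phase-one cohort: 2000 subjects, 5 visits each.
rng = np.random.default_rng(0)
n = 2000
times = np.arange(5.0)
true_slopes = rng.normal(-0.5, 0.3, size=n)
y = 3.0 + true_slopes[:, None] * times + rng.normal(0.0, 0.5, size=(n, times.size))

slopes = subject_slopes(times, y)
stratum, selected = two_phase_sample(slopes, rng=rng)

# Phase-two sample size and the share drawn from each stratum.
print("phase-two n:", int(selected.sum()))
for s, label in enumerate(["low", "middle", "high"]):
    print(label, "stratum sampled fraction:", selected[stratum == s].mean())
```

With these illustrative probabilities every subject in the two extreme strata is selected and roughly one in ten from the middle stratum. An analysis of the selected subjects would then need to account for the biased sampling, for example through the multiple imputation or conditional likelihood approaches named in the abstract; neither is implemented in this sketch.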