Deep learning for population size history inference: Design, comparison and combination with approximate Bayesian computation

  • Additional Information
    • Contributors:
      TAckling the Underspecified (TAU); Inria Saclay - Ile de France; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire de Recherche en Informatique (LRI); CentraleSupélec-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-CentraleSupélec-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS); Bioinformatique (LRI) (BioInfo - LRI); Laboratoire de Recherche en Informatique (LRI)
    • Publication Information:
      HAL CCSD
      Wiley/Blackwell
    • Publication Year:
      2020
    • Collection:
      Archive ouverte HAL (Hyper Article en Ligne, CCSD - Centre pour la Communication Scientifique Directe)
    • Abstract:
      International audience ; For the past decades, simulation-based likelihood-free inference methods have enabled researchers to address numerous population genetics problems. As the richness and amount of simulated and real genetic data keep increasing, the field has a strong opportunity to tackle tasks that current methods hardly solve. However, high data dimensionality forces most methods to summarize large genomic datasets into a relatively small number of handcrafted features (summary statistics). Here we propose an alternative to summary statistics, based on the automatic extraction of relevant information using deep learning techniques. Specifically, we design artificial neural networks (ANNs) that take as input single nucleotide polymorphic sites (SNPs) found in individuals sampled from a single population and infer the past effective population size history. First, we provide guidelines to construct artificial neural networks that comply with the intrinsic properties of SNP data such as invariance to permutation of haplotypes, long scale interactions between SNPs and variable genomic length. Thanks to a Bayesian hyperparameter optimization procedure, we evaluate the performance of multiple networks and compare them to well established methods like Approximate Bayesian Computation (ABC). Even without the expert knowledge of summary statistics, our approach compares fairly well to an ABC based on handcrafted features. Furthermore we show that combining deep learning and ABC can improve performance while taking advantage of both frameworks. Finally, we apply our approach to reconstruct the effective population size history of cattle breed populations.
    • Relation:
      hal-02942328; https://hal.archives-ouvertes.fr/hal-02942328; https://hal.archives-ouvertes.fr/hal-02942328/document; https://hal.archives-ouvertes.fr/hal-02942328/file/Sanchez_2020_final_overleaf.pdf
    • DOI:
      10.1111/1755-0998.13224
    • Online Access:
      https://hal.archives-ouvertes.fr/hal-02942328
      https://hal.archives-ouvertes.fr/hal-02942328/document
      https://hal.archives-ouvertes.fr/hal-02942328/file/Sanchez_2020_final_overleaf.pdf
      https://doi.org/10.1111/1755-0998.13224
    • Rights:
      info:eu-repo/semantics/OpenAccess
    • Accession Number:
      edsbas.46A85687