
Predicting haplogroups using a versatile machine learning program (PredYMaLe) on a new mutationally balanced 32 Y-STR multiplex (CombYplex): unlocking the full potential of the human STR mutation rate spectrum to estimate forensic parameters

  • Additional information
    • Contributors:
      Laboratoire d'Anthropologie Moléculaire, Institut de Médecine Légale, Strasbourg, France (IML); Université Louis Pasteur - Strasbourg I; Anthropologie Moléculaire et Imagerie de Synthèse (AMIS); Université Toulouse III - Paul Sabatier (UT3); Université de Toulouse (UT)-Université de Toulouse (UT)-Centre National de la Recherche Scientifique (CNRS); Éco-Anthropologie (EA); Muséum national d'Histoire naturelle (MNHN)-Centre National de la Recherche Scientifique (CNRS)-Université Paris Cité (UPCité); Anthropologie Bio-Culturelle (UAABC); Université de la Méditerranée - Aix-Marseille 2-Centre National de la Recherche Scientifique (CNRS); Groupe d'Activité de Médecine de la Reproduction CHU Toulouse (CECOS Toulouse); Centre Hospitalier Universitaire de Toulouse (CHU Toulouse); Département d'Urologie-Andrologie et Transplantation Rénale CHU Toulouse; Pôle Urologie - Néphrologie - Dialyse - Transplantations - Brûlés - Chirurgie plastique - Explorations fonctionnelles et physiologiques CHU Toulouse; Centre Hospitalier Universitaire de Toulouse (CHU Toulouse)-Centre Hospitalier Universitaire de Toulouse (CHU Toulouse); Institut national d'études démographiques (INED); Departamento de Zoologia y Antropologia Fisica; Universidad Complutense de Madrid = Complutense University of Madrid Madrid (UCM); Laboratorio de Genética Molecular; Universidad de Antioquia = University of Antioquia Medellín, Colombia; Archaeogenetics Laboratory; Anthropologie bio-culturelle, Droit, Ethique et Santé (ADES); Aix Marseille Université (AMU)-EFS ALPES MEDITERRANEE-Centre National de la Recherche Scientifique (CNRS); Mère et enfant en milieu tropical : pathogènes, système de santé et transition épidémiologique (MERIT - UMR_D 216); Institut de Recherche pour le Développement (IRD)-Université Paris Descartes - Paris 5 (UPD5); Dept of Genetics, Evolution and Environment London (UCL-GEE); University College of London London (UCL); Laboratoire Cogitamus = Cogitamus Laboratory; Office national de l'eau et des milieux aquatiques (ONEMA); Ministère de l'écologie, du développement durable et de l'énergie; Department of Oncology; ANR-11-LABX-0010, DRIIHM / IRDHEI, Dispositif de recherche interdisciplinaire sur les Interactions Hommes-Milieux (2011)
    • Publication information:
      HAL CCSD
      Elsevier
    • Publication year:
      2020
    • Collection:
      Muséum National d'Histoire Naturelle (MNHM): HAL
    • Abstract:
      International audience. We developed a new mutationally well-balanced 32 Y-STR multiplex (CombYplex) together with a machine learning (ML) program, PredYMaLe, to assess the impact of STR mutability on haplogroup prediction while respecting forensic community criteria (high DC/HD). We designed CombYplex around two sub-panels, M1 and M2, characterized by average- and high-mutation-rate STRs. Using these two sub-panels, we tested how PredYMaLe reacts to mutability when considering basal branches and, moving down, terminal branches. We first tested the discrimination capacity of CombYplex on 996 human samples using various forensic and statistical parameters and showed that its resolution is sufficient to separate haplogroup classes. In parallel, PredYMaLe was designed and used to test whether an ML approach can predict haplogroup classes from Y-STR profiles. Applied to our kit, SVM and Random Forest classifiers perform very well (average 97 %), better than Neural Network (average 91 %) and Bayesian methods (< 90 %). We observe heterogeneity in haplogroup assignment accuracy among classes, with most haplogroups having high prediction scores (99–100 %) and two (E1b1b and G) having lower scores (67 %). The small sample sizes of these classes explain the high tendency to misclassify the Y-profiles of these haplogroups; results were measurably improved as soon as more training data were added. We provide evidence that our ML approach is a robust method to accurately predict haplogroups when it is combined with a sufficient number of markers, well-balanced mutation rate Y-STR panels, and large ML training sets. Further research on confounding factors (such as CNV-STR or gene conversion) and on the ideal STR panels for the branches analysed could help classifiers further optimize prediction scores.
    • Relation:
      info:eu-repo/semantics/altIdentifier/pmid/32818722; hal-02906055; https://hal.science/hal-02906055; https://hal.science/hal-02906055/document; https://hal.science/hal-02906055/file/PIIS1872497320301150-preproof.pdf; PUBMED: 32818722
    • Identifier:
      10.1016/j.fsigen.2020.102342
    • Rights:
      info:eu-repo/semantics/OpenAccess
    • Identifier:
      edsbas.DC94909F
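
Note on the forensic parameters named in the abstract: the sketch below illustrates how the two criteria abbreviated "DC/HD" are commonly computed, assuming they stand for discrimination capacity (distinct haplotypes divided by sample size) and haplotype diversity (Nei's unbiased estimator), as is usual in forensic Y-STR work. The record does not show the authors' own calculations; the function names and the three-locus toy profiles are hypothetical and serve only as illustration.

```python
from collections import Counter


def discrimination_capacity(haplotypes):
    """Share of distinct haplotypes among all sampled profiles."""
    if not haplotypes:
        raise ValueError("need at least one profile")
    return len(set(haplotypes)) / len(haplotypes)


def haplotype_diversity(haplotypes):
    """Nei's unbiased estimator: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two profiles")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical toy data: each tuple holds repeat counts at three made-up loci.
profiles = [(14, 13, 30), (14, 13, 30), (15, 12, 29), (13, 14, 31)]

dc = discrimination_capacity(profiles)  # 3 distinct haplotypes out of 4 profiles
hd = haplotype_diversity(profiles)      # 4/3 * (1 - 0.375)
print(f"DC = {dc:.3f}, HD = {hd:.3f}")
```

Both values approach 1 when nearly every sampled profile is unique, which is the sense in which a multiplex with many well-chosen markers scores "high DC/HD".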