Conference papers

Multitask group Lasso for Genome Wide association Studies in diverse populations

Abstract : Genome-Wide Association Studies, or GWAS, aim at finding Single Nucleotide Polymorphisms (SNPs) that are associated with a phenotype of interest. GWAS are known to suffer from the large dimensionality of the data with respect to the number of available samples. Other limiting factors include the dependency between SNPs, due to linkage disequilibrium (LD), and the need to account for population structure, that is to say, confounding due to genetic ancestry. We propose an efficient approach for the multivariate analysis of multi-population GWAS data based on a multitask group Lasso formulation. Each task corresponds to a subpopulation of the data, and each group to an LD-block. This formulation alleviates the curse of dimensionality, and makes it possible to identify disease LD-blocks shared across populations/tasks, as well as some that are specific to one population/task. In addition, we use stability selection to increase the robustness of our approach. Finally, gap safe screening rules speed up computations enough that our method can run at a genome-wide scale. To our knowledge, this is the first framework for GWAS on diverse populations combining feature selection at the LD-group level, a multitask approach to address population structure, stability selection, and safe screening rules. We show that our approach outperforms state-of-the-art methods on both a simulated dataset and a real-world cancer dataset.
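
As a rough illustration of how a group penalty can select whole LD-blocks across tasks, the sketch below implements block soft-thresholding, the proximal operator of a generic multitask group Lasso penalty (the sum over groups of the Frobenius norm of that group's coefficients across all tasks). This is not the authors' implementation, and the abstract does not specify the exact penalty used: the coefficient layout (SNPs as rows, tasks as columns), the function name group_soft_threshold, the groups argument, and the toy values are all assumptions made for the example.

```python
import numpy as np


def group_soft_threshold(B, groups, threshold):
    """Block soft-thresholding for a generic multitask group Lasso penalty.

    Hypothetical layout (an assumption, not taken from the paper):
      B      : (n_snps, n_tasks) coefficient matrix, one column per subpopulation
      groups : list of integer index arrays, one per LD-block
    Returns the proximal operator of
      threshold * sum_g ||B[g, :]||_F
    evaluated at B.
    """
    out = np.zeros_like(B, dtype=float)
    for idx in groups:
        block = B[idx, :]
        norm = np.linalg.norm(block)  # Frobenius norm of the whole block
        if norm > threshold:
            # Shrink the block toward zero but keep it in the model.
            out[idx, :] = (1.0 - threshold / norm) * block
        # Otherwise the whole block stays at zero: the LD-block is discarded
        # for every task at once.
    return out


# Toy example: 4 SNPs in 2 LD-blocks, 2 subpopulations (made-up numbers).
B = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [0.1, 0.1],
              [0.1, -0.1]])
groups = [np.array([0, 1]), np.array([2, 3])]
shrunk = group_soft_threshold(B, groups, threshold=1.0)
print(shrunk)
```

In this toy case the first block has norm 5 and is scaled by 0.8, while the second block has norm 0.2, below the threshold, and is set to zero for both tasks. A proximal gradient or coordinate descent solver would apply such a step repeatedly; stability selection and gap safe screening, mentioned in the abstract, are not shown here.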

https://hal-mines-paristech.archives-ouvertes.fr/hal-03510963
Contributor : Chloé-Agathe Azencott
Submitted on : Tuesday, January 4, 2022 - 4:51:31 PM
Last modification on : Monday, January 10, 2022 - 10:16:05 AM

Identifiers

  • HAL Id : hal-03510963, version 1
  • PUBMED : 34890146

Citation

Chloé-Agathe Azencott, Asma Nouira. Multitask group Lasso for Genome Wide association Studies in diverse populations. Pacific Symposium on Biocomputing, Jan 2022, Kohala Coast, Hawaii, United States. pp.163-174. ⟨hal-03510963⟩
