Drug Target Identification with Machine Learning: How to Choose Negative Examples

Matthieu Najm; Chloé-Agathe Azencott; Benoit Playe; Véronique Stoven

doi:10.3390/ijms22105118

Journal article, International Journal of Molecular Sciences, Year: 2021

Matthieu Najm

Role: Author
PersonId : 797889
ORCID : 0000-0001-9757-1499

Centre de Bioinformatique

Chloé-Agathe Azencott

Role: Author
PersonId : 11475
IdHAL : chloe-agathe-azencott
ORCID : 0000-0003-1003-301X
IdRef : 195762959

Centre de Bioinformatique

Benoit Playe

Role: Author
PersonId : 793808
ORCID : 0000-0003-3836-2683

Centre de Bioinformatique

Véronique Stoven

Role: Author
PersonId : 12583
IdHAL : veronique-stoven
ORCID : 0000-0003-0828-0759
IdRef : 149662025

Centre de Bioinformatique

Abstract

Identifying the protein targets of hit molecules is essential in the drug discovery process. Target prediction with machine learning algorithms can help accelerate this search and limit the number of experiments required. However, the drug-target interaction databases used for training exhibit strong statistical bias, which leads to a high number of false positives and thus increases the time and cost of experimental validation campaigns. To minimize the number of false positives among predicted targets, we propose a new scheme for choosing negative examples, such that each protein and each drug appears an equal number of times in positive and negative examples. We artificially reproduce the process of target identification for three specific drugs and, more globally, for 200 approved drugs. In both the three detailed examples and the larger set of 200 drugs, training with the proposed scheme for choosing negative examples improved target prediction results: the average number of false positives among the top-ranked predicted targets decreased and, overall, the rank of the true targets improved. Our method corrects the databases' statistical bias and reduces the number of false positive predictions, and therefore the number of useless experiments potentially undertaken.
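
The balancing constraint stated in the abstract (each protein and each drug appears as often in negative examples as in positive ones) can be illustrated with a minimal Python sketch. This is not the authors' published algorithm: it is a simple greedy heuristic written for illustration, the function name `balanced_negatives` is hypothetical, and it assumes that any drug-protein pair absent from the positive set may be used as a negative. The heuristic can leave some quotas unfilled when no admissible drug remains for a protein, so the resulting balance may be only approximate.

```python
from collections import Counter


def balanced_negatives(positives):
    """Greedily choose negative (drug, protein) pairs so that each drug and
    each protein appears at most as often in negatives as in positives.

    Illustrative heuristic only; pairs missing from `positives` are assumed
    to be usable as negatives.
    """
    positives = set(positives)
    # Remaining number of negatives each drug may still receive.
    drug_quota = Counter(d for d, _ in positives)
    # Number of negatives each protein should receive.
    protein_quota = Counter(p for _, p in positives)
    negatives = set()

    # Visit proteins from most to least frequent in the positive set.
    for protein in sorted(protein_quota, key=lambda p: (-protein_quota[p], p)):
        for _ in range(protein_quota[protein]):
            candidates = [
                d for d in drug_quota
                if drug_quota[d] > 0
                and (d, protein) not in positives
                and (d, protein) not in negatives
            ]
            if not candidates:
                break  # this protein's quota cannot be filled completely
            # Prefer the drug with the largest remaining quota (ties by name).
            drug = min(candidates, key=lambda d: (-drug_quota[d], d))
            negatives.add((drug, protein))
            drug_quota[drug] -= 1
    return negatives


# Toy positive set: two drugs with two known targets each.
positives = [("d1", "p1"), ("d1", "p2"), ("d2", "p3"), ("d2", "p4")]
negatives = balanced_negatives(positives)
print(sorted(negatives))
```

On this toy input every drug and protein ends up with matching positive and negative counts; on real interaction data, a check of the per-drug and per-protein counts would be needed to see how close the heuristic gets to exact balance.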

Domains

Machine Learning [stat.ML]; Bioinformatics [q-bio.QM]

https://minesparis-psl.hal.science/hal-03359024

Submitted on: Wednesday, 29 September 2021, 18:22:43

Last modified on: Friday, 19 April 2024, 16:18:56

Dates and versions

hal-03359024 , version 1 (29-09-2021)

Identifiers

HAL Id : hal-03359024 , version 1
DOI : 10.3390/ijms22105118
PUBMEDCENTRAL : PMC8151112

Cite

Matthieu Najm, Chloé-Agathe Azencott, Benoit Playe, Véronique Stoven. Drug Target Identification with Machine Learning: How to Choose Negative Examples. International Journal of Molecular Sciences, 2021, 22 (10), pp.5118. ⟨10.3390/ijms22105118⟩. ⟨hal-03359024⟩

Collections

INSTITUT-TELECOM ENSMP ENSMP_CBIO PARISTECH PSL ENSMP_DR ANR PRAIRIE-IA
