Conference paper, Year: 2019

Data Programming for Learning Discourse Structure

Abstract

This paper investigates the advantages and limits of data programming for the task of learning discourse structure. The data programming paradigm implemented in the Snorkel framework allows a user to label training data using expert-composed heuristics, which are then transformed via the "generative step" into probability distributions over the class labels given the training candidates. These results are later generalized using a discriminative model. Snorkel's attractive promise to create a large amount of annotated data from a smaller set of training data by unifying the output of a set of heuristics has yet to be used for computationally difficult tasks, such as discourse attachment, in which one must decide where a given discourse unit attaches to other units in a text in order to form a coherent discourse structure. Although approaching this problem using Snorkel requires significant modifications to the structure of the heuristics, we show that weak supervision methods can be more than competitive with classical supervised learning approaches to the attachment problem.
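As a rough illustration of the pipeline the abstract describes, the sketch below shows heuristics (labeling functions) voting on candidate pairs of discourse units, with the votes combined into a probability of attachment. It is not the authors' code and does not use the Snorkel API: the `Candidate` type, the three heuristics, and the thresholds are all hypothetical, and the combination step is a plain unweighted vote standing in for Snorkel's generative step, which instead learns how much to trust each heuristic.

```python
from dataclasses import dataclass

# Label values a heuristic may return (hypothetical encoding).
ABSTAIN, NOT_ATTACHED, ATTACHED = -1, 0, 1


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate: an ordered pair of discourse units in one text."""
    source_index: int
    target_index: int
    source_text: str
    target_text: str


def lf_adjacent(c: Candidate) -> int:
    # Illustrative heuristic: a unit often attaches to the one right before it.
    return ATTACHED if c.target_index - c.source_index == 1 else ABSTAIN


def lf_long_distance(c: Candidate) -> int:
    # Illustrative heuristic: units far apart are rarely attached.
    return NOT_ATTACHED if c.target_index - c.source_index > 5 else ABSTAIN


def lf_question_then_reply(c: Candidate) -> int:
    # Illustrative heuristic: a question tends to attach to a nearby reply.
    is_question = c.source_text.rstrip().endswith("?")
    is_near = c.target_index - c.source_index <= 2
    return ATTACHED if is_question and is_near else ABSTAIN


LABELING_FUNCTIONS = [lf_adjacent, lf_long_distance, lf_question_then_reply]


def attachment_probability(c: Candidate, lfs=LABELING_FUNCTIONS) -> float:
    """Unweighted vote over non-abstaining heuristics.

    Stand-in for the generative step: every heuristic counts equally here,
    whereas a learned label model would weight them by estimated accuracy.
    """
    votes = [lf(c) for lf in lfs]
    active = [v for v in votes if v != ABSTAIN]
    if not active:
        return 0.5  # no heuristic fired: remain uninformative
    return sum(1 for v in active if v == ATTACHED) / len(active)
```

The resulting probabilities would then serve as soft training targets for a discriminative model, which can generalize beyond the cases the heuristics cover.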
Main file
Data_Programing_for_Learning_Discourse_Structure_ACL_2019(1).pdf (136.86 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02393478, version 1 (04-12-2019)

Identifiers

HAL Id: hal-02393478
DOI: 10.18653/v1/P19-1061

Cite

Sonia Badene, Kate Thompson, Jean-Pierre Lorré, Nicholas Asher. Data Programming for Learning Discourse Structure. Association for Computational Linguistics (ACL), Jul 2019, Florence, Italy. pp.640-645, ⟨10.18653/v1/P19-1061⟩. ⟨hal-02393478⟩