Making sense of nonsense: Integrated gradient-based input reduction to improve recall for check-worthy claim detection

Sheikhi, Ghazaal; Opdahl, Andreas Lothe; Touileb, Samia; Setty, Vinay

Chapter

Published version

Open

2177670_OA_Setty.pdf (1.082Mb)

Permanent link

https://hdl.handle.net/11250/3103449

Publication date

2023

Collections

Publications from CRIStin [4377]
Scientific publications (TN-IDE) [251]

Original version

Sheikhi, G., Opdahl, A. L., Touileb, S., & Setty, V. (2023). Making Sense of Nonsense: Integrated Gradient-based Input Reduction to Improve Recall for Check-worthy Claim Detection. In Proceedings of the 5th Symposium of the Norwegian AI Society (NAIS 2023).

Abstract

Analysing long text documents of political discourse to identify check-worthy claims (claim detection) is an important task in automated fact-checking systems, as it saves fact-checkers' valuable time and allows for more fact-checks. However, existing methods use black-box deep neural NLP models to detect check-worthy claims, which limits our understanding of the models and the mistakes they make. The aim of this study is therefore to leverage an explainable neural NLP method to improve the claim detection task. Specifically, we apply well-known integrated gradient-based input reduction to textCNN and BiLSTM models to create two different reduced claim data sets from ClaimBuster. We observe that a higher recall in check-worthy claim detection is achieved on the data reduced by BiLSTM than by the models trained on the unreduced claims. This is an important observation, since the cost of overlooking check-worthy claims is high in claim detection for fact-checking. The same holds when a pre-trained BERT sequence classification model is fine-tuned on the reduced data set. We argue that removing superfluous tokens using explainable NLP could unlock the true potential of neural language models for claim detection, even though the reduced claims might make no sense to humans. Our findings provide insights into task formulation, the design of annotation schemas and data set preparation for check-worthy claim detection.

Publisher

Technical University of Aachen

Series

CEUR Workshop Proceedings

Copyright

Unless otherwise stated, this item is licensed under Attribution 4.0 International