Show simple item record

dc.contributor.advisor: Vinay Jayarama Setty
dc.contributor.author: Nielsen Mats Erik
dc.contributor.author: Martin Erik
dc.date.accessioned: 2022-07-22T15:51:13Z
dc.date.available: 2022-07-22T15:51:13Z
dc.date.issued: 2022
dc.identifier: no.uis:inspera:93568650:49846757
dc.identifier.uri: https://hdl.handle.net/11250/3007824
dc.description: Full text not available
dc.description.abstract: Automatic fact-checking relies on claim detection systems to find claims and estimate their check-worthiness. Improving current claim detection systems requires high-quality labeled datasets, specifically a dataset based on claims from general news articles. To our knowledge, no such dataset currently exists. We explore an approach to collecting data for such a dataset by creating an annotation tool and distributing the work through crowdsourcing platforms. We show that such platforms can be viable even for complex annotation tasks: by developing the right tools and systems, we can train participants and test the quality of submitted data. We also show that a structured approach to claim definitions, using a claim taxonomy, is beneficial when creating a labeling schema. Furthermore, we implement and test a rules-based claim detection system using natural language processing libraries, with the intent of integrating it into the data collection process.
dc.language: eng
dc.publisher: uis
dc.title: Claim detection data annotation tool
dc.type: Bachelor thesis
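The abstract describes a rules-based claim detection system built with natural language processing libraries. The thesis full text is not available, so the following is only a minimal sketch of what such a rule-based detector might look like, using plain Python surface patterns instead of a full NLP pipeline; the rule patterns and the `detect_claims` function are illustrative assumptions, not taken from the thesis:

```python
import re

# Simple surface-pattern rules that often signal a check-worthy claim:
# statistics, reported speech, and comparative statements.
CLAIM_PATTERNS = [
    re.compile(r"\b\d+(\.\d+)?\s*(%|percent|million|billion)\b", re.I),  # numeric/statistical claim
    re.compile(r"\b(said|claimed|stated|reported|announced)\b", re.I),   # reported speech
    re.compile(r"\b(more|less|fewer|higher|lower)\s+than\b", re.I),      # comparison
]

def detect_claims(sentences):
    """Return (sentence, rule_index) pairs for sentences matching at least one rule."""
    hits = []
    for sent in sentences:
        for i, pattern in enumerate(CLAIM_PATTERNS):
            if pattern.search(sent):
                hits.append((sent, i))
                break  # first matching rule wins
    return hits

sentences = [
    "Unemployment fell by 3 percent last year.",   # matches the statistics rule
    "The minister said the bill would pass.",      # matches the reported-speech rule
    "What a lovely morning it is.",                # matches no rule
]
print(detect_claims(sentences))
```

A real system of the kind the abstract describes would replace these regexes with linguistic features from an NLP library (part-of-speech tags, named entities, dependency parses), but the control flow, i.e. a list of rules applied per sentence, would be similar.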

