Cite Worthiness Detection; SOTA, and Model Applicability to Other Domains

Shahriyari, Salman

dc.contributor.advisor	Surbiryala, Jayachander
dc.contributor.author	Shahriyari, Salman
dc.date.accessioned	2022-11-17T16:51:18Z
dc.date.available	2022-11-17T16:51:18Z
dc.date.issued	2022
dc.identifier	no.uis:inspera:92613534:70079089
dc.identifier.uri	https://hdl.handle.net/11250/3032533
dc.description	Full text not available
dc.description.abstract	Citations are an essential part of scientific articles and other kinds of text, and they serve several purposes, including validating claims. Locating the places in a text where a citation belongs is therefore crucial. This research aims to automate that process by using machine learning and deep learning methods to find sentences that are worthy of a citation. It then examines the effect of publication year and the possibility of domain generalization. The research uses a quantitative method and an experimental design to address these problems. After pre-processing steps that create the required labeled dataset, the dissertation first evaluates the state-of-the-art (SOTA) models and algorithms best suited to the task. The second step is to test the effect of publication year, in order to select the best-quality data for training the models. These analyses show that recent publications are more suitable for inclusion in training datasets. As the final step, the research examines the role of the scientific domain. The conventional approach is to train, evaluate, and test on data from a single field of study, since this is expected to give the best results. However, training and testing in the same field is not always possible because suitable data may be unavailable. This work therefore also explores training in one domain and generalizing to another. It concludes that some domains are close to each other, and that models built on one such domain can be generalized to the similar ones. At the same time, the study suggests that researchers should be cautious about generalizing a model across unrelated domains.
dc.language	eng
dc.publisher	uis
dc.title	Cite Worthiness Detection; SOTA, and Model Applicability to Other Domains
dc.type	Master thesis

This item appears in the following collection(s)

Student theses (TN-IDE) [866]
Student theses in information technology, computer engineering / cybernetics, signal processing
