Show simple item record

dc.contributor.advisor: Surbiryala, Jayachander
dc.contributor.author: Shahriyari, Salman
dc.date.accessioned: 2022-11-17T16:51:18Z
dc.date.available: 2022-11-17T16:51:18Z
dc.date.issued: 2022
dc.identifier: no.uis:inspera:92613534:70079089
dc.identifier.uri: https://hdl.handle.net/11250/3032533
dc.description: Full text not available
dc.description.abstract: Citations are an essential part of scientific articles and other kinds of text, and serve several purposes, including validating claims. Locating the places in a text where a citation is needed is therefore crucial. This research aims to automate that process by using machine learning and deep learning methods to identify cite-worthy sentences. It then examines the effect of publication year and the possibility of domain generalization. The research uses a quantitative method and an experimental design to address these problems. After pre-processing steps that create the required labeled dataset, the dissertation first evaluates state-of-the-art (SOTA) models and algorithms suited to the task. The second step tests the effect of publication year in order to select the best-quality data for training the models. These analyses show that recent publications are more suitable for training datasets. As the final step, the research examines the role of the scientific domain. The conventional process is to train, evaluate, and test on data from a single field of study, since this is expected to give the best results. However, training and testing in the same field is not always possible because suitable data may be unavailable. This work therefore also explores training in one domain and generalizing to another. It concludes that some domains are closer to each other, and that models trained on one of them can be generalized to the similar domains. At the same time, the study suggests that researchers should be cautious about applying a model trained on one domain to an unrelated domain.
dc.language: eng
dc.publisher: uis
dc.title: Cite Worthiness Detection; SOTA, and Model Applicability to Other Domains
dc.type: Master thesis

