dc.contributor.advisor | Setty, Vinay | |
dc.contributor.author | Fossåen, Nils Magne | |
dc.date.accessioned | 2020-09-27T18:21:50Z | |
dc.date.available | 2020-09-27T18:21:50Z | |
dc.date.issued | 2020-07-15 | |
dc.identifier.uri | https://hdl.handle.net/11250/2679785 | |
dc.description | Master's thesis in Computer science | en_US |
dc.description.abstract | Semi-supervised learning covers the techniques that fall between supervised and unsupervised learning. It is commonly used in classification settings where less labeled data is available than unlabeled data. The goal is to extract additional information from the unlabeled data to improve on supervised classification.
We explore several approaches to semi-supervised learning to improve the classification of Nordic news articles in the provided corpus. In particular, we examine self-training in several configurations, together with methods of feature extraction and engineering.
We also provide background and a baseline using common supervised methods for improving results, as well as different document representations such as word embeddings, so that we can compare our semi-supervised results against these methods.
We show that while some of the explored methods did not succeed, others did, and their performance is comparable to that of some of the supervised methods. We also identify some promising approaches for countering the imbalance problem when considering confident pseudo-labels. | en_US |
dc.language.iso | eng | en_US |
dc.publisher | University of Stavanger, Norway | en_US |
dc.relation.ispartofseries | Masteroppgave/UIS-TN-IDE/2020; | |
dc.rights | Attribution-NonCommercial-ShareAlike 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/deed.no | * |
dc.subject | information technology | en_US |
dc.subject | semi-supervised | en_US |
dc.subject | classification | en_US |
dc.title | Semi-supervised learning for classification of Nordic news articles | en_US |
dc.type | Master thesis | en_US |
dc.subject.nsi | VDP::Teknologi: 500::Informasjons- og kommunikasjonsteknologi: 550 | en_US |