
dc.contributor.advisor: Farmanbar, Mina
dc.contributor.advisor: Sulaiman, Muhammad
dc.contributor.author: Li, Rongbing
dc.contributor.author: Koloszyc, Piotr
dc.date.accessioned: 2023-09-07T15:51:20Z
dc.date.available: 2023-09-07T15:51:20Z
dc.date.issued: 2023
dc.identifier: no.uis:inspera:129718883:111453655
dc.identifier.uri: https://hdl.handle.net/11250/3087984
dc.description.abstract: Classification predictive modeling is the machine learning task of assigning class labels to instances. The prevalent models and evaluation metrics used in classification learning focus largely on balanced classification problems, which are considered the easiest type, and assume an equal distribution of data across class labels. Many machine learning algorithms fail when the distribution of instances among classes is imbalanced, and the evaluation measures used, including classification accuracy, become dangerously misleading. Many real-world problems, such as fraud detection, churn prediction, and medical diagnosis, involve imbalanced class distributions. In fact, imbalanced classes are encountered more often than balanced ones, which underlines the importance of solving this problem. This thesis primarily investigates innovative strategies for handling imbalanced data. One approach examined is the use of the Majority and Minority repositioning Technique (MaMiPot) algorithms in combination with different variants of SMOTE and the application of K-means clustering before repositioning. Another is the use of Generative Adversarial Networks (GAN), a neural network-based technique, to address imbalanced data. The approaches were evaluated on 25 imbalanced datasets from the KEEL repository, with class imbalance ratios ranging from 5.14 to 129.44. Their performance in mitigating the class imbalance problem was assessed with three evaluation metrics: F-score, G-mean, and AUC.
dc.language: eng
dc.publisher: uis
dc.title: Comparative Analysis of Sampling Methods for Imbalanced Classification
dc.type: Master thesis

