dc.contributor.author: Mustafa, Ghulam
dc.contributor.author: Rauf, Abid
dc.contributor.author: Al-Shamayleh, Ahmad Sami
dc.contributor.author: Sulaiman, Muhammad
dc.contributor.author: Afzal, Muhammad Tanvir
dc.contributor.author: Akhunzada, Adnan
dc.date.accessioned: 2024-04-10T08:56:20Z
dc.date.available: 2024-04-10T08:56:20Z
dc.date.created: 2023-11-28T11:35:47Z
dc.date.issued: 2023
dc.identifier.citation: Mustafa, G., Rauf, A., Al-Shamayleh, A. S., Sulaiman, M., Alrawagfeh, W., Afzal, M. T., & Akhunzada, A. (2023). Optimizing document classification: Unleashing the power of genetic algorithms. IEEE Access. [en_US]
dc.identifier.issn: 2169-3536
dc.identifier.uri: https://hdl.handle.net/11250/3125721
dc.description.abstract: Many individuals, including researchers, professors, and students, encounter difficulties when searching for scholarly documents, papers, and journals within a specific domain. Consequently, scholars have begun to focus on the document classification problem, offering various methods to address it. Researchers have used diverse data sources in their approaches, such as citations, metadata, content, and hybrids of these. Among these sources, the metadata-based approach stands out for research paper classification because metadata is freely available. Various scholars have employed different metadata parameters of research articles, including the title, abstract, keywords, and general terms, for research paper classification. In this study, we chose four metadata-based features (title, keywords, abstract, and general terms) from the SANTOS dataset, which was prepared by ACM. To represent these features numerically, we employed a semantic-based model called BERT instead of the commonly used count-based models. BERT generates a 768-dimensional vector for each record, which introduces significant time complexity during computation. Additionally, our proposed model optimizes the features using a genetic algorithm. Optimal feature selection plays a crucial role in this domain, enhancing the overall accuracy of the document classification system while reducing the time complexity associated with selecting the most relevant features from this high-dimensional space. For classification purposes, we employed GNB and SVM classifiers. The results of our study showed that the combination of title and keywords outperformed other combinations. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: IEEE [en_US]
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/deed.no [*]
dc.title: Optimizing Document Classification: Unleashing the Power of Genetic Algorithms [en_US]
dc.type: Peer reviewed [en_US]
dc.type: Journal article [en_US]
dc.description.version: publishedVersion [en_US]
dc.rights.holder: The authors [en_US]
dc.subject.nsi: VDP::Technology: 500::Information and communication technology: 550 [en_US]
dc.source.journal: IEEE Access [en_US]
dc.identifier.doi: 10.1109/ACCESS.2023.3292248
dc.identifier.cristin: 2203647
cristin.ispublished: true
cristin.fulltext: original
cristin.qualitycode: 1
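
Illustrative note on the abstract: the record does not include the authors' code, so the following is only a minimal sketch of genetic-algorithm feature selection over a dense vector representation. Everything in it is an assumption made for illustration: the synthetic data (20 dimensions standing in for the 768-dimensional BERT vectors), the function names, the GA settings, and the nearest-centroid classifier, which is swapped in for the GNB and SVM classifiers named in the abstract so that the example needs only the Python standard library.

```python
import random

def make_data(n, dim, informative, rng):
    """Synthetic two-class data; only the first `informative` dimensions carry signal."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(2.0 * label if j < informative else 0.0, 1.0)
                  for j in range(dim)])
        y.append(label)
    return X, y

def centroid_accuracy(mask, X_tr, y_tr, X_val, y_val):
    """Validation accuracy of a nearest-centroid classifier on the masked dimensions."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_tr, y_tr) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_val, y_val):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, mu))
                for c, mu in centroids.items()}
        correct += min(dist, key=dist.get) == t
    return correct / len(y_val)

def select_features(X_tr, y_tr, X_val, y_val, rng,
                    pop_size=20, generations=15, p_mut=0.05):
    """Evolve a 0/1 mask over feature dimensions (hypothetical GA settings)."""
    dim = len(X_tr[0])

    def fitness(mask):
        # Accuracy minus a small penalty per kept dimension, to favor compact masks.
        return centroid_accuracy(mask, X_tr, y_tr, X_val, y_val) - 0.002 * sum(mask)

    pop = [[rng.randint(0, 1) for _ in range(dim)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        nxt = [ranked[0][:]]  # elitism: the best mask always survives
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fitness)  # tournament selection
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, dim)                # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < p_mut else b for b in child]  # bit-flip mutation
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history

rng = random.Random(0)
X, y = make_data(200, 20, 3, rng)
best, history = select_features(X[:140], y[:140], X[140:], y[140:], rng)
print("kept dimensions:", [j for j, b in enumerate(best) if b])
print("best fitness per generation:", [round(h, 3) for h in history])
```

Because the best mask is copied unchanged into each new generation, the recorded best fitness never decreases. The sketch scores masks on the same validation split it selects on, which gives an optimistic estimate; a real evaluation would keep a separate test set.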


Associated file(s)

This item appears in the following collection(s)

Except where otherwise noted, this item is licensed under Attribution 4.0 International