
dc.contributor.advisor: Setty, Vinay Jayarama
dc.contributor.author: Lien, Audun Stjernelund
dc.date.accessioned: 2023-09-19T15:51:20Z
dc.date.available: 2023-09-19T15:51:20Z
dc.date.issued: 2023
dc.identifier: no.uis:inspera:129718883:50794433
dc.identifier.uri: https://hdl.handle.net/11250/3090545
dc.description.abstract: This master's thesis develops an automatic approach to detecting corporate greenwashing. This requires collecting data and fact-checking the green claims found in it. The first step is data collection by web scraping. The scrapers in this thesis were designed to extract comprehensive information about companies from their websites and reports, using two datasets as benchmarks. The Fauna dataset was scraped with a recursive web scraper that extracted data from the sub-pages linked from each company’s website. The CICERO Shades of Green dataset was scraped with a scraper that visited each link in the dataset and extracted the text of each report produced by CICERO. The collected datasets were preprocessed to ensure compatibility with machine learning models. The texts scraped for the Fauna dataset were often excessively long because of the abundance of information on the websites. These texts were summarized using a Transformer model, and irrelevant texts were manually removed from the dataset. For the CICERO dataset, text augmentation was applied to expand the dataset and to investigate its impact on model performance. To address the limited availability of data, transfer-learning techniques, including zero-, one-, and two-shot learning, were applied to both the Fauna and CICERO datasets. These techniques leverage pre-trained models to learn from a small amount of labeled data. Additionally, fine-tuned models were implemented specifically for the CICERO dataset to provide a basis for comparison. The fine-tuned models outperformed the transfer-learning models, suggesting that training large models on limited training data remains an effective approach.
dc.language: eng
dc.publisher: uis
dc.title: Machine learning to detect corporate greenwashing
dc.type: Master thesis

