dc.contributor.author	Sulaiman, Muhammad
dc.contributor.author	Finnesand, Erik
dc.contributor.author	Farmanbar, Mina
dc.contributor.author	Belbachir, Ahmed Nabil
dc.contributor.author	Rong, Chunming
dc.date.accessioned	2024-07-09T07:36:21Z
dc.date.available	2024-07-09T07:36:21Z
dc.date.created	2024-05-23T14:45:15Z
dc.date.issued	2024-04
dc.identifier.citation	Sulaiman, M., Finnesand, E., Farmanbar, M., Belbachir, A. N., & Rong, C. (2024). Building Precision: Efficient Encoder-Decoder Networks for Remote Sensing based on Aerial RGB and LiDAR data. IEEE Access.	en_US
dc.identifier.issn	2169-3536
dc.identifier.uri	https://hdl.handle.net/11250/3139367
dc.description.abstract	Precision in building delineation plays a pivotal role in population data analysis, city management, policy making, and disaster management. Leveraging computer vision technologies, particularly deep learning models for semantic segmentation, has proven instrumental in achieving accurate automatic building segmentation in remote sensing applications. However, current state-of-the-art (SOTA) techniques are not optimized for precisely extracting building footprints and, in particular, building boundaries. This deficiency highlights the need to leverage Light Detection and Ranging (LiDAR) data in conjunction with aerial RGB imagery and streamlined deep learning models for improved precision. This work utilizes the MapAI dataset, which includes a variety of objects beyond buildings, such as trees, electricity lines, solar panels, vehicles, and roads. These objects showcase diverse colors and structures, mirroring the rooftops in Denmark and Norway. To address these problems, this study modifies UNet and CT-UNet to segment buildings from LiDAR data and RGB images, using Intersection over Union (IoU) to evaluate building overlap and Boundary Intersection over Union (BIoU) to evaluate the precision of building boundaries and shapes. The proposed work reconfigures these networks to incorporate LiDAR data for efficient segmentation. Training batches are augmented to improve model generalization and reduce overfitting, and batch normalization is included to further mitigate overfitting. Four backbones with transfer learning are employed to enhance convergence and the parameter efficiency of segmentation: ResNet50V2, DenseNet201, EfficientNetB4, and EfficientNetV2S. Test-Time Augmentation (TTA) is employed to improve the predicted masks. Experiments are performed using single and ensemble models, with and without augmentation. The ensemble model outperforms the single model, and TTA further improves the results. Combining LiDAR data with RGB improves the combined score (the average of IoU and BIoU) by 13.33% compared to RGB images alone.	en_US
dc.language.iso	eng	en_US
dc.publisher	IEEE	en_US
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/deed.no
dc.subject	Light Detection and Ranging (LiDAR) data	en_US
dc.title	Building Precision: Efficient Encoder-Decoder Networks for Remote Sensing Based on Aerial RGB and LiDAR Data	en_US
dc.type	Peer reviewed	en_US
dc.type	Journal article	en_US
dc.description.version	publishedVersion	en_US
dc.rights.holder	© 2024 The Authors	en_US
dc.subject.nsi	VDP::Teknologi: 500	en_US
dc.source.pagenumber	60329-60346	en_US
dc.source.volume	12	en_US
dc.source.journal	IEEE Access	en_US
dc.identifier.doi	10.1109/ACCESS.2024.3391416
dc.identifier.cristin	2270452
cristin.ispublished	true
cristin.fulltext	original
cristin.qualitycode	1
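The abstract above reports results as a combined score, the average of IoU and BIoU. The Python sketch below illustrates how such a score could be computed for binary building masks; it is a simplified illustration under stated assumptions, not the authors' implementation. The boundary band is approximated as the mask minus its erosion, and the boundary_width parameter and mask names are assumptions introduced here for the example.

import numpy as np
from scipy.ndimage import binary_erosion

def iou(pred, target):
    # Standard Intersection over Union for two binary masks.
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return np.logical_and(pred, target).sum() / union

def boundary(mask, boundary_width=2):
    # Approximate boundary band: pixels removed by a small erosion of the mask.
    # (Assumption: a simplification of the distance-based band used by Boundary IoU.)
    mask = mask.astype(bool)
    return mask & ~binary_erosion(mask, iterations=boundary_width)

def biou(pred, target, boundary_width=2):
    # Boundary IoU: IoU restricted to the boundary bands of prediction and ground truth.
    return iou(boundary(pred, boundary_width), boundary(target, boundary_width))

def combined_score(pred, target, boundary_width=2):
    # Combined score as described in the abstract: average of IoU and BIoU.
    return 0.5 * (iou(pred, target) + biou(pred, target, boundary_width))

# Example with two slightly offset square "buildings".
pred = np.zeros((64, 64), dtype=bool); pred[10:40, 10:40] = True
gt = np.zeros((64, 64), dtype=bool); gt[12:42, 12:42] = True
print(iou(pred, gt), biou(pred, gt), combined_score(pred, gt))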


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International.