
dc.contributor.author: Mårtensson, Gustav
dc.contributor.author: Ferreira, Daniel
dc.contributor.author: Granberg, Tobias
dc.contributor.author: Cavallin, Lena
dc.contributor.author: Oppedal, Ketil
dc.contributor.author: Padovani, Alessandro
dc.contributor.author: Rektorova, Irena
dc.contributor.author: Bonanni, Laura
dc.date.accessioned: 2021-03-15T14:39:54Z
dc.date.available: 2021-03-15T14:39:54Z
dc.date.created: 2020-11-06T14:56:06Z
dc.date.issued: 2020-12
dc.identifier.citation: Mårtensson, G., Ferreira, D., Granberg, T. et al. (2020) The reliability of a deep learning model in clinical out-of-distribution MRI data: A multicohort study. Medical Image Analysis, 66, 101714. (en_US)
dc.identifier.issn: 1361-8415
dc.identifier.uri: https://hdl.handle.net/11250/2733477
dc.description.abstract: Deep learning (DL) methods have in recent years yielded impressive results in medical imaging, with the potential to function as a clinical aid to radiologists. However, DL models in medical imaging are often trained on public research cohorts with images acquired with a single scanner or with strict protocol harmonization, which is not representative of a clinical setting. The aim of this study was to investigate how well a DL model performs in unseen clinical datasets (collected with different scanners, protocols and disease populations) and whether more heterogeneous training data improves generalization. In total, 3117 MRI scans of brains from multiple dementia research cohorts and memory clinics, which had been visually rated by a neuroradiologist according to Scheltens' scale of medial temporal atrophy (MTA), were included in this study. By training multiple versions of a convolutional neural network on different subsets of this data to predict MTA ratings, we assessed the impact that including images from a wider distribution during training had on performance in external memory clinic data. Our results showed that our model generalized well to datasets acquired with protocols similar to those of the training data, but performed substantially worse in clinical cohorts with visibly different tissue contrasts in the images. This implies that future DL studies investigating performance in out-of-distribution (OOD) MRI data need to assess multiple external cohorts for reliable results. Further, including data from a wider range of scanners and protocols improved performance in OOD data, which suggests that more heterogeneous training data makes the model generalize better. To conclude, this is the most comprehensive study to date investigating the domain shift in deep learning on MRI data, and we advocate rigorous evaluation of DL models on clinical data prior to being certified for deployment. (en_US)
dc.language.iso: eng (en_US)
dc.publisher: Elsevier Ltd. (en_US)
dc.rights: Attribution 4.0 International (*)
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/deed.no (*)
dc.subject: MRI (en_US)
dc.title: The reliability of a deep learning model in clinical out-of-distribution MRI data: A multicohort study (en_US)
dc.type: Peer reviewed (en_US)
dc.type: Journal article (en_US)
dc.description.version: publishedVersion (en_US)
dc.rights.holder: © 2020 The Author(s). (en_US)
dc.subject.nsi: VDP::Teknologi: 500::Medisinsk teknologi: 620 (en_US)
dc.source.pagenumber: 10 (en_US)
dc.source.volume: 66 (en_US)
dc.source.journal: Medical Image Analysis (en_US)
dc.identifier.doi: 10.1016/j.media.2020.101714
dc.identifier.cristin: 1845697
dc.source.articlenumber: 101714 (en_US)
cristin.ispublished: true
cristin.fulltext: original
cristin.qualitycode: 1


Unless otherwise stated, this item is licensed under Attribution 4.0 International