
dc.contributor.author: Ahmad, Tasweer
dc.contributor.author: Rizvi, Syed Tahir Hussain
dc.contributor.author: Kanwal, Neel
dc.date.accessioned: 2023-08-21T13:46:57Z
dc.date.available: 2023-08-21T13:46:57Z
dc.date.created: 2023-07-27T14:32:39Z
dc.date.issued: 2023-09
dc.identifier.citation: Ahmad, T., Rizvi, S.T.H., Kanwal, N. (2023) Transforming spatio-temporal self-attention using action embedding for skeleton-based action recognition. Journal of Visual Communication and Image Representation, 95 (103892) [en_US]
dc.identifier.issn: 1047-3203
dc.identifier.uri: https://hdl.handle.net/11250/3085109
dc.description.abstract: Over the past few years, skeleton-based action recognition has achieved great success because skeleton data is immune to illumination variation, view-point variation, background clutter, scaling, and camera motion. However, effective modeling of the latent information in skeleton data is still a challenging problem. Therefore, in this paper, we propose a novel idea of action embedding with a self-attention Transformer network for skeleton-based action recognition. Our proposed method mainly comprises two modules: i) action embedding and ii) a self-attention Transformer. The action embedding encodes the relationship between corresponding body joints (e.g., the joints of both hands move together when performing a clapping action) and thus captures the spatial features of joints. Meanwhile, temporal features and dependencies of body joints are modeled using the Transformer architecture. Our method works in a single-stream (end-to-end) fashion, where an MLP is used for classification. We carry out an ablation study and evaluate the performance of our model on the small-scale SYSU-3D dataset and the large-scale NTU-RGB+D and NTU-RGB+D 120 datasets, where the results establish that our method performs better than other state-of-the-art architectures. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Elsevier Ltd. [en_US]
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/deed.no [*]
dc.title: Transforming spatio-temporal self-attention using action embedding for skeleton-based action recognition [en_US]
dc.type: Peer reviewed [en_US]
dc.type: Journal article [en_US]
dc.description.version: publishedVersion [en_US]
dc.rights.holder: © 2023 The Author(s). [en_US]
dc.subject.nsi: VDP::Technology: 500::Information and communication technology: 550 [en_US]
dc.source.pagenumber: 13 [en_US]
dc.source.volume: 95 [en_US]
dc.source.journal: Journal of Visual Communication and Image Representation [en_US]
dc.identifier.doi: 10.1016/j.jvcir.2023.103892
dc.identifier.cristin: 2163804
dc.source.articlenumber: 103892 [en_US]
cristin.ispublished: true
cristin.fulltext: original
cristin.qualitycode: 2



Except where otherwise noted, this item is licensed under Attribution 4.0 International.