Abdal Hafeth, Deema, Kollias, Stefanos and Ghafoor, Mubeen (2023) Semantic Representations with Attention Networks for Boosting Image Captioning. IEEE Access, 11 . pp. 40230-40239. ISSN 2169-3536
Full content URL: https://doi.org/10.1109/ACCESS.2023.3268744
Documents: | PDF: Semantic_Representations_with_Attention_Networks_for_Boosting_Image_Captioning (2).pdf - Whole Document, 924kB. Available under License Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International. |
Item Type: | Article |
---|---|
Item Status: | Live Archive |
Abstract
Image captioning has shown encouraging outcomes with Transformer-based architectures
that typically use attention-based methods to establish semantic associations between objects in an
image for caption prediction. Nevertheless, when appearance features of objects in an image display low
interdependence, attention-based methods have difficulty in capturing the semantic association between
them. To tackle this problem, additional knowledge beyond the task-specific dataset is often required
to create captions that are more precise and meaningful. This article proposes a semantic attention network
that incorporates general-purpose knowledge into a Transformer attention block. The design
fuses the visual and semantic properties of internal image knowledge in one place, serving as
a reference point to aid in the learning of alignments between vision and language and to improve visual
attention and semantic association. The proposed framework is validated on the Microsoft COCO dataset,
and experimental results demonstrate competitive performance against the current state of the art.
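
The fusion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: it assumes that "combining visual and semantic properties in one place" can be approximated by letting each visual region attend over a single memory made of region features plus semantic concept embeddings. The function `fused_attention`, the toy vectors, and the omission of learned query/key/value projections are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fused_attention(visual, semantic):
    """Scaled dot-product attention over a joint visual + semantic memory.

    visual:   list of region feature vectors (hypothetical detector output)
    semantic: list of concept embeddings (hypothetical knowledge-base output)
    Returns one context vector and one weight row per visual region.
    """
    memory = visual + semantic          # joint memory: regions first, then concepts
    d = len(visual[0])                  # shared feature dimension
    outputs, all_weights = [], []
    for query in visual:                # each region acts as a query
        scores = [dot(query, key) / math.sqrt(d) for key in memory]
        weights = softmax(scores)
        # Weighted sum of memory vectors, computed one dimension at a time.
        context = [sum(w * vec[i] for w, vec in zip(weights, memory))
                   for i in range(d)]
        outputs.append(context)
        all_weights.append(weights)
    return outputs, all_weights


# Toy inputs: two image regions and one semantic concept, all 4-dimensional.
visual = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
semantic = [[0.0, 0.0, 1.0, 0.0]]
fused, weights = fused_attention(visual, semantic)
```

Each row of `weights` spans both the visual regions and the semantic concepts, so a region whose appearance features relate weakly to other regions can still draw on the semantic entries. A practical model would add learned projections, multiple heads, and training on caption data, none of which is shown here.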
Keywords: | attention, image captioning, transformers, semantic features, knowledge base |
---|---|
Subjects: | G Mathematical and Computer Sciences > G760 Machine Learning G Mathematical and Computer Sciences > G740 Computer Vision |
Divisions: | College of Science > School of Computer Science |
ID Code: | 54257 |
Deposited On: | 30 May 2023 14:03 |