Automatic semantic video annotation in wide domain videos based on similarity and commonsense knowledgebases

Altadmri, Amjad and Ahmed, Amr (2009) Automatic semantic video annotation in wide domain videos based on similarity and commonsense knowledgebases. In: The IEEE International Conference on Signal and Image Processing Applications (ICSIPA 2009), 18–19 November 2009, Malaysia.

Documents
Semantic_Video_Annotation.pdf (PDF, 835kB)
Item Type: Conference or Workshop contribution (Paper)
Item Status: Live Archive

Abstract

In this paper, we introduce a novel framework for automatic Semantic Video Annotation. Because the framework detects possible events occurring in video clips, it can form the annotation basis for a video search engine. To achieve this, the system has to be able to operate on uncontrolled wide-domain videos, so all layers have to be based on generic features.

This framework aims to bridge the "semantic gap", the difference between low-level visual features and human perception, by finding videos containing similar visual events, analyzing their free-text annotations to identify a common area, and then deciding on the best description for the new video using commonsense knowledgebases.
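As a concrete illustration of the pipeline the abstract describes, the sketch below walks through its three stages in Python: ranking an annotated corpus by visual similarity, collecting the free-text annotations of the closest matches, and mapping their words onto shared commonsense concepts. Everything here is a toy stand-in (the feature vectors, the Euclidean distance measure, and the dict-based "knowledgebase" are assumptions for illustration, not the paper's actual implementation).

```python
# Toy sketch of similarity-plus-commonsense annotation; all data and
# names here are hypothetical stand-ins, not the paper's code.
import math
from collections import Counter

# Hypothetical annotated corpus: (generic feature vector, free-text annotation).
CORPUS = [
    ([0.9, 0.1, 0.2], "a man rides a horse across a field"),
    ([0.8, 0.2, 0.1], "horse galloping in the countryside"),
    ([0.1, 0.9, 0.7], "crowd cheering at a football match"),
]

# Toy commonsense mapping from surface words to shared concepts.
KNOWLEDGEBASE = {
    "horse": "animal-riding", "rides": "animal-riding",
    "galloping": "animal-riding", "field": "outdoors",
    "countryside": "outdoors", "crowd": "sport-event",
    "football": "sport-event", "cheering": "sport-event",
}

def distance(a, b):
    """Euclidean distance between two generic feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def find_similar(features, corpus, k=2):
    """Return the k visually closest annotated clips."""
    return sorted(corpus, key=lambda clip: distance(features, clip[0]))[:k]

def annotate(features, corpus, kb, top=2):
    """Map the matched clips' annotation words to commonsense concepts
    and keep the concepts the matches have in common."""
    counts = Counter()
    for _, text in find_similar(features, corpus):
        for word in text.split():
            if word in kb:
                counts[kb[word]] += 1
    return [concept for concept, _ in counts.most_common(top)]

if __name__ == "__main__":
    # A new, unannotated clip whose features resemble the horse clips.
    print(annotate([0.85, 0.15, 0.15], CORPUS, KNOWLEDGEBASE))
    # -> ['animal-riding', 'outdoors']
```

The point of the concept-counting step is that individual free-text annotations disagree on wording ("rides a horse" vs. "galloping"), but a commonsense mapping lets the overlapping meaning surface as the dominant concepts.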

Experiments were performed on wide-domain video clips from the TRECVID 2005 BBC rush standard database. The results show promising integration between the two layers in producing expressive annotations for the input video, and were evaluated in terms of retrieval performance.

Keywords: Automatic Semantic Video Annotation, Video Indexing, Video Retrieval, video search engine, semantic gap, uncontrolled wide-domain videos, content-based video retrieval, Commonsense Knowledgebases, visual events, free-text annotation, Event Detection, Similarity, generic features, low-level visual features, human perception, TRECVID BBC rush, standard video database, retrieval performance.
Subjects: G Mathematical and Computer Sciences > G700 Artificial Intelligence
G Mathematical and Computer Sciences > G710 Speech and Natural Language Processing
G Mathematical and Computer Sciences > G400 Computer Science
G Mathematical and Computer Sciences > G720 Knowledge Representation
G Mathematical and Computer Sciences > G450 Multi-media Computing Science
G Mathematical and Computer Sciences > G740 Computer Vision
G Mathematical and Computer Sciences > G540 Databases
Divisions: College of Science > School of Computer Science
ID Code: 2044
Deposited On: 04 Nov 2009 16:50
