Attention via synchrony: making use of multimodal cues in social learning

Rolf, Matthias and Hanheide, Marc and Rohlfing, Katharina J. (2009) Attention via synchrony: making use of multimodal cues in social learning. IEEE Transactions on Autonomous Mental Development, 1 (1). pp. 55-67. ISSN 1943-0604

Documents
Rolf2009-Attention_via_Synchrony_Making_Use_of_Multimodal_Cues_in_Social_Learning.pdf - Whole Document (PDF, 1MB). Restricted to Repository staff only.

Official URL: http://dx.doi.org/10.1109/TAMD.2009.2021091

Abstract

Infants learning about their environment are confronted with many stimuli of different modalities. Therefore, a crucial problem is how to discover which stimuli are related, for instance, in learning words. In making these multimodal "bindings," infants depend on social interaction with a caregiver to guide their attention towards relevant stimuli. The caregiver might, for example, visually highlight an object by shaking it while vocalizing the object's name. These cues are known to help structure the continuous stream of stimuli. To detect and exploit them, we propose a model of bottom-up attention by multimodal signal-level synchrony. We focus on the guidance of visual attention from audio-visual synchrony, informed by recent adult-infant interaction studies. Consequently, we demonstrate that our model is receptive to parental cues during child-directed tutoring. The findings discussed in this paper are consistent with recent results from developmental psychology, but for the first time are obtained employing an objective, computational model. The presence of "multimodal motherese" is verified directly on the audio-visual signal. Lastly, we hypothesize how our computational model facilitates tutoring interaction and discuss its application in interactive learning scenarios, enabling social robots to benefit from adult-like tutoring.
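The abstract describes bottom-up attention driven by audio-visual signal-level synchrony: visual regions whose motion co-varies with the sound (e.g., an object shaken while its name is spoken) attract attention. The paper's actual model is not reproduced in this record, so the following is only a minimal sketch of that core idea under simplified assumptions: each visual region is summarized by a motion-energy time series, the audio by a loudness envelope, and synchrony is scored as their normalized correlation. All function and variable names here are illustrative, not from the paper.

```python
import numpy as np

def synchrony_scores(audio_energy, motion_energy):
    """Score each visual region by its temporal correlation with the audio.

    audio_energy:  (T,) audio loudness envelope over T frames
    motion_energy: (R, T) per-region visual motion energy over the same frames
    Returns an (R,) array of synchrony scores in [-1, 1].
    """
    a = audio_energy - audio_energy.mean()
    m = motion_energy - motion_energy.mean(axis=1, keepdims=True)
    denom = np.linalg.norm(a) * np.linalg.norm(m, axis=1)
    denom[denom == 0] = 1.0          # guard against silent/static regions
    return m @ a / denom

# Toy example: region 0 moves in rhythm with the sound, region 1 does not.
np.random.seed(0)
t = np.linspace(0, 2 * np.pi, 100)
audio = np.abs(np.sin(3 * t))                 # "shaking while vocalizing" rhythm
regions = np.stack([
    np.abs(np.sin(3 * t)) + 0.05 * np.random.rand(100),  # synchronized motion
    np.random.rand(100),                                 # unrelated motion
])
scores = synchrony_scores(audio, regions)
attended = int(np.argmax(scores))             # attention selects region 0
```

In this sketch, attention simply selects the region with the highest synchrony score; the published model operates on continuous audio-visual streams during child-directed tutoring rather than on precomputed toy signals.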

Item Type:Article
Keywords:Robotics, Human-robot interaction
Subjects:H Engineering > H670 Robotics and Cybernetics
Divisions:College of Science > School of Computer Science
ID Code:6700
Deposited By: Marc Hanheide
Deposited On:26 Oct 2012 09:59
Last Modified:18 Nov 2013 14:17
