Al-Diri, Bashir and Sharieh, Ahmad and Hudaib, Taghread (2004) An Arabic speech corpus: a database for Arabic speech recognition. Dirasat: Pure Sciences, 31 (2). pp. 208-219. ISSN 1560-456X
Full text not available from this repository.
Abstract
This paper describes the creation of a new Arabic Speech Corpus (ASC) for Large Vocabulary Continuous Speech Recognition (LVCSR) technology. It describes the steps of creation and the process of recording the database used for the ASC. The ASC is designed to be comparable to corpora of other natural languages. The corpus contains 4740 utterances from six speakers (three male and three female). For each speaker, there are 620 statements for training and 171 statements for testing and evaluation. The corpus has 3622 words, with 27725 triphones, of which 5034 are unique.
| Item Type: | Article |
|---|---|
| Keywords: | Acoustic, Arabic Speech, Continuous, corpus, speech recognition, triphone |
| Subjects: | G Mathematical and Computer Sciences > G400 Computer Science |
| Divisions: | College of Sciences > Faculty of Science > Lincoln School of Computer Science |
| Depositing User: | Rosaline Smith |
| Date Deposited: | 12 Nov 2010 10:12 |
| Last Modified: | 18 Jul 2011 16:34 |
| URI: | http://eprints.lincoln.ac.uk/id/eprint/3610 |