OLAC Record
oai:www.ldc.upenn.edu:LDC2014S02

Metadata
Title:King Saud University Arabic Speech Database
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Alsulaiman, Mansour, et al. King Saud University Arabic Speech Database LDC2014S02. Web Download. Philadelphia: Linguistic Data Consortium, 2014
Contributor:Alsulaiman, Mansour
Muhammad, Ghulam
Abdelkader, Bencherif Mohamed
Mahmood, Awais
Ali, Zulfiqar
Date (W3CDTF):2014
Date Issued (W3CDTF):2014-02-17
Description:*Introduction* King Saud University Arabic Speech Database was developed by the Speech Group (SG) at King Saud University and contains 590 hours of recorded Arabic speech from 269 male and female speakers. The utterances include read and spontaneous speech. The recordings were made in varied environments representing quiet and noisy settings.
*Data* The corpus was designed principally for speaker recognition research. Other possible applications include first language recognition and the study of mobile effects, multichannel effects, and the use of different types of microphones. The speech sources are word lists, sentence lists, paragraphs, and question and answer sessions. Read speech text includes the following:
* Sets of sentences devised to cover allophones of each phoneme, phonetic balance, and differentiation of accents.
* Word lists developed to minimize missing phonemes and to represent nasals, fricatives, commonly used words, and numbers.
* Two paragraphs selected because they included all letters of the alphabet and were easy to read.
Spontaneous speech was captured through question and answer sessions in which speakers answered questions displayed on screen. The questions were on general topics such as the weather and food and included the speaker name or number.
The speakers were Saudis and non-Saudis. Among the non-Saudi participants were Arabs and non-Arabs. All female speakers were either Saudis or non-Saudi Arabs. Male speakers included non-Arabs from the Indian subcontinent, Africa, South East Asia and East Europe. Non-Arab participants were required to be able to read Arabic at an acceptable level. Most of the non-Arab speakers were from the fourth level in the Arabic Linguistics Institute at King Saud University. The non-Saudi participants represented 28 nationalities and were chosen from clusters of areas or countries.
Each speaker was recorded in three different environments: in a soundproof room, in an office and in a cafeteria. The recordings were collected via different microphones and a mobile phone and averaged 16-19 minutes each. The recordings were done in three sessions with a time gap of approximately six weeks. The data was checked for missing recordings, problems with the recording system and errors in the recording process.
All files are presented as two-channel, 48 kHz, 16-bit PCM WAV files compressed with FLAC. Note that sizes and file names in the documentation are for the uncompressed WAV files.
*Samples* Please view this male sample and female sample.
*Updates* None at this time.
Extent:Corpus size: 148897792 KB
Format:Sampling Rate: 48000
Sampling Format: pcm
Identifier:LDC2014S02
https://catalog.ldc.upenn.edu/LDC2014S02
ISBN: 1-58563-669-X
ISLRN: 789-673-729-277-5
DOI: 10.35111/vpqe-bz17
Language:Arabic
Language (ISO639):ara
License:King Saud University Arabic Speech Database: https://catalog.ldc.upenn.edu/license/ksu-arabic-speech-database.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2014S02
Rights Holder:Portions © 2014 King Saud University, © 2014 Trustees of the University of Pennsylvania
Type (DCMI):Sound
Type (OLAC):primary_text
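
As a rough sanity check on the figures above, the following Python sketch compares the Extent value with the uncompressed size implied by the Description (590 hours, two channels, 48 kHz, 16-bit). It rests on two assumptions that the record does not confirm: that "KB" means 1024 bytes, and that the 590 hours count each two-channel file once. The resulting ratio is only an estimate of how much smaller the distributed FLAC data is than raw PCM.

```python
# Figures copied from the record above.
HOURS = 590                 # total recorded speech (Description)
SAMPLE_RATE_HZ = 48_000     # Format: Sampling Rate
CHANNELS = 2                # "two-channel" (Description)
BYTES_PER_SAMPLE = 2        # 16-bit PCM
EXTENT_KB = 148_897_792     # Extent: Corpus size


def uncompressed_pcm_bytes(hours, rate_hz, channels, bytes_per_sample):
    """Size of raw PCM audio of the given duration, ignoring file headers."""
    seconds = hours * 3600
    return seconds * rate_hz * channels * bytes_per_sample


pcm_bytes = uncompressed_pcm_bytes(HOURS, SAMPLE_RATE_HZ, CHANNELS, BYTES_PER_SAMPLE)

# Assumption: the record's "KB" is 1024 bytes.
extent_bytes = EXTENT_KB * 1024

# Fraction of the raw PCM size that the distributed corpus occupies.
ratio = extent_bytes / pcm_bytes

print(f"Uncompressed PCM estimate: {pcm_bytes / 1e9:.1f} GB")
print(f"Distributed corpus size:   {extent_bytes / 1e9:.1f} GB")
print(f"Approximate size ratio:    {ratio:.2f}")
```

Under these assumptions the distributed corpus is a little over a third of the raw PCM estimate, which is plausible for lossless compression of speech. The check cannot confirm how the 590 hours were counted.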

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2014S02
DateStamp:  2020-11-30
GetRecord:  OAI-PMH request for simple DC format
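
The GetRecord entries above refer to OAI-PMH requests for this record. The Python sketch below shows how such a request URL can be assembled. `verb`, `identifier` and `metadataPrefix` are the standard OAI-PMH GetRecord arguments, and `oai_dc` is the prefix the protocol defines for simple Dublin Core. The base URL is a placeholder and not the archive's real endpoint, and the `olac` prefix for the OLAC format is an assumption.

```python
from urllib.parse import urlencode

# Placeholder endpoint: substitute the base URL of the OAI-PMH repository in use.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2014S02"


def get_record_url(base_url, identifier, metadata_prefix):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Simple Dublin Core, the format every OAI-PMH repository must support.
dc_url = get_record_url(BASE_URL, OAI_IDENTIFIER, "oai_dc")

# OLAC format; the prefix name "olac" is assumed here.
olac_url = get_record_url(BASE_URL, OAI_IDENTIFIER, "olac")

print(dc_url)
print(olac_url)
```

The sketch only builds the URLs. Fetching and parsing the XML response is left out because it depends on the actual endpoint.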

Search Info

Citation: Alsulaiman, Mansour; Muhammad, Ghulam; Abdelkader, Bencherif Mohamed; Mahmood, Awais; Ali, Zulfiqar. 2014. Linguistic Data Consortium.
Terms: dcmi_Sound iso639_ara olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2014S02
Up-to-date as of: Fri Dec 6 7:48:15 EST 2024