OLAC Record
oai:www.ldc.upenn.edu:LDC2006S33

Metadata
Title:Middle East Technical University Turkish Microphone Speech v 1.0
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Salor, Ozgul, et al. Middle East Technical University Turkish Microphone Speech v 1.0 LDC2006S33. Web Download. Philadelphia: Linguistic Data Consortium, 2006
Contributor:Salor, Ozgul
Ciloglu, Tolga
Pellom, Bryan
Demirekler, Mubeccel
Date (W3CDTF):2006
Date Issued (W3CDTF):2006-05-18
Description:*Introduction* Middle East Technical University Turkish Microphone Speech v 1.0 was developed at Middle East Technical University (METU) and contains text, speech, and alignment files for approximately 5.6 hours of recorded Turkish. The corpus was produced in a collaboration between METU's Department of Electrical and Electronics Engineering and the Center for Spoken Language Research (CSLR) at the University of Colorado at Boulder. The collaboration was supported by TUBITAK, the Scientific and Technical Research Council of Turkey, through a combined doctoral scholarship program. The corpus was used to port CSLR's speech recognition system, SONIC, to Turkish. *Data* The corpus contains text, speech, and alignment files and is approximately 600 MB in size. Each of 120 speakers (60 male and 60 female) reads 40 sentences (approximately 300 words per speaker). The 40 sentences are selected randomly for each speaker from a triphone-balanced set of 2,462 Turkish sentences. The speakers were drawn from students, faculty, and staff at METU, and all are native speakers of Turkish. Speaker ages range from 19 to 50 years, with an average of 23.9 years. The data were recorded digitally with a Sound Blaster sound card on a PC at a 16 kHz sampling rate. *Samples* For an example of the data in this corpus, please listen to this audio sample (WAV) and view its companion transcript (TXT). *Updates* None at this time.
Extent:Corpus size: 680960 KB
Format:Sampling Rate: 16000
Sampling Format: pcm
Identifier:LDC2006S33
https://catalog.ldc.upenn.edu/LDC2006S33
ISBN: 1-58563-384-4
ISLRN: 461-254-833-604-1
DOI: 10.35111/sk8b-ss58
Language:Turkish
Language (ISO639):tur
License:LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2006S33
Rights Holder:Portions © 2001, 2002, 2005 Middle East Technical University, Tolga Ciloglu, Ozgul Salor, Bryan Pellom, Kadri Hacioglu, Mubeccel Demirekler, © 1993, 2006 Trustees of the University of Pennsylvania
Type (DCMI):Sound
Type (OLAC):primary_text
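
The figures in the Description and Format fields can be cross-checked with a little arithmetic. The sketch below is only a rough consistency check, not part of the record: it assumes 16-bit mono PCM, which the record does not state (it gives only "pcm" at 16000 Hz), and the function name `corpus_estimates` is made up for illustration.

```python
# Figures taken from the record's Description and Format fields.
SPEAKERS = 120
SENTENCES_PER_SPEAKER = 40
TOTAL_HOURS = 5.6
SAMPLE_RATE_HZ = 16_000

# Assumptions NOT stated in the record: 16-bit samples, one channel.
BYTES_PER_SAMPLE = 2
CHANNELS = 1


def corpus_estimates():
    """Derive rough corpus statistics from the published figures."""
    utterances = SPEAKERS * SENTENCES_PER_SPEAKER
    total_seconds = TOTAL_HOURS * 3600
    mean_utterance_seconds = total_seconds / utterances
    # Raw audio payload only; excludes file headers, text, and alignments.
    audio_bytes = total_seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
    return {
        "utterances": utterances,
        "mean_utterance_seconds": mean_utterance_seconds,
        "audio_bytes": audio_bytes,
    }


if __name__ == "__main__":
    for name, value in corpus_estimates().items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions the corpus holds 4,800 utterances averaging roughly 4.2 seconds each, and the raw audio comes to about 645 million bytes. That is in line with the "approximately 600 MB" in the Description and sits below the 680960 KB Extent, which also covers the text and alignment files.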

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2006S33
DateStamp:  2021-06-04
GetRecord:  OAI-PMH request for simple DC format
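
The GetRecord entries above refer to OAI-PMH requests for this record. As a minimal sketch of how such a request URL is formed: `verb`, `identifier`, and `metadataPrefix` are the standard OAI-PMH GetRecord arguments, and `oai_dc` is the standard prefix for simple Dublin Core. The base URL below is a placeholder, since the record does not give the repository endpoint, and the `olac` prefix for the OLAC format is an assumption.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the record does not state the OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

# Identifier copied from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2006S33"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # Simple Dublin Core (standard OAI-PMH prefix).
    print(get_record_url(OAI_IDENTIFIER, "oai_dc"))
    # OLAC format (prefix name assumed, not taken from the record).
    print(get_record_url(OAI_IDENTIFIER, "olac"))
```

Using `urlencode` percent-encodes the colons in the identifier, so the URL stays valid whatever characters the identifier contains.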

Search Info

Citation: Salor, Ozgul; Ciloglu, Tolga; Pellom, Bryan; Demirekler, Mubeccel. 2006. Linguistic Data Consortium.
Terms: area_Asia country_TR dcmi_Sound iso639_tur olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2006S33
Up-to-date as of: Fri Dec 6 7:47:35 EST 2024