OLAC Record
oai:www.ldc.upenn.edu:LDC96S31

Metadata
Title:CSR-IV HUB4
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Garofolo, John S., et al. CSR-IV HUB4 LDC96S31. Web Download. Philadelphia: Linguistic Data Consortium, 1996
Contributor:Garofolo, John S.
Fiscus, Jonathan G.
Fisher, William
Pallett, David
Date (W3CDTF):1996
Description:This release contains all of the speech data provided to sites participating in the DARPA CSR November 1995 HUB4 (Radio) Broadcast News tests. The data consists of digitized waveforms of MarketPlace (tm) business news radio shows, provided by KUSC through an agreement with the Linguistic Data Consortium, and detailed transcriptions of those broadcasts. The software NIST used to process and score the output of the test systems is also included. The data is organized as follows:
CD26-1: Training Data. Ten complete half-hour broadcasts with minimal-verified transcripts. The transcripts are time aligned with the waveforms at the story-boundary level.
CD26-2: Development-Test Data. Six complete half-hour broadcasts with verified transcripts. The transcripts are time aligned with the waveforms at the story- and turn-boundary level. Index files have been included which specify how the data may be partitioned into 2 test sets.
CD26-6: Evaluation-Test Data. Five complete half-hour broadcasts with verified/adjudicated transcripts. The transcripts are time aligned with the waveforms at the story-, turn- and music-boundary level. An index file has been included which specifies how the data was partitioned into the test set used in the CSR 1995 HUB4 tests.
Format:Sampling Rate: 16000
Sampling Format: 1-channel pcm
Identifier:LDC96S31
https://catalog.ldc.upenn.edu/LDC96S31
ISBN: 1-58563-087-X
ISLRN: 440-074-007-959-3
DOI: 10.35111/xqjm-hp63
Language:English
Language (ISO639):eng
License:KUSC Radio Broadcast News Agreement: https://catalog.ldc.upenn.edu/license/kusc-radio-broadcast-news.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC96S31
Type (DCMI):Sound
Type (OLAC):primary_text

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC96S31
DateStamp:  2020-11-30
GetRecord:  OAI-PMH request for simple DC format
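
The GetRecord entries above refer to OAI-PMH requests for this record in two metadata formats. As a minimal sketch of how such a request URL could be assembled: the base URL below is a hypothetical placeholder (this record does not state the repository endpoint), the helper names are illustrative, and the metadata prefixes "olac" and "oai_dc" are assumed to be the ones the repository advertises for the OLAC and simple DC formats.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint: the record does not give the repository's OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC96S31"


def split_oai_identifier(identifier):
    """Split an identifier of the form oai:<repository>:<local id> into its parts."""
    scheme, repository, local_id = identifier.split(":", 2)
    return scheme, repository, local_id


def build_get_record_url(base_url, identifier, metadata_prefix):
    """Build a GetRecord request URL with the three arguments the verb requires."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# One request per format listed in this record (prefix names are assumptions).
olac_url = build_get_record_url(BASE_URL, OAI_IDENTIFIER, "olac")
dc_url = build_get_record_url(BASE_URL, OAI_IDENTIFIER, "oai_dc")
```

The sketch only constructs the URLs; fetching and parsing the XML response is left out because the actual endpoint is not given here.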

Search Info

Citation: Garofolo, John S.; Fiscus, Jonathan G.; Fisher, William; Pallett, David. 1996. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC96S31
Up-to-date as of: Tue May 7 7:24:31 EDT 2024