OLAC Record: TRAD Chinese-French Parallel Text -- Blog

OLAC Record
oai:www.ldc.upenn.edu:LDC2018T02

Metadata

Title: TRAD Chinese-French Parallel Text -- Blog

Access Rights: Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining

Bibliographic Citation: Linguistic Data Consortium, and ELDA. TRAD Chinese-French Parallel Text -- Blog LDC2018T02. Web Download. Philadelphia: Linguistic Data Consortium, 2018

Contributor: Linguistic Data Consortium

Contributor: ELDA

Date (W3CDTF): 2018

Date Issued (W3CDTF): 2018-01-16

Description:

*Introduction*

TRAD Chinese-French Parallel Text -- Blog was developed by ELDA as part of the PEA-TRAD project. It contains French translations of a subset of approximately 10,000 Chinese words from GALE Phase 1 Chinese Blog Parallel Text (LDC2008T06).

The PEA-TRAD project (Translation as a Support for Document Analysis) was supported by the French Ministry of Defense (DGA). Its purpose was to develop speech-to-speech translation technology for multiple languages (e.g., Arabic, Chinese, Pashto) from a variety of domains. ELDA developed several corpora for this effort.

The Linguistic Data Consortium (LDC) has also released the following TRAD corpora:

* TRAD Arabic-French Parallel Text -- Newsgroup (LDC2018T13)
* TRAD Chinese-French Parallel Text -- Broadcast News (LDC2018T17)
* TRAD Arabic-French Parallel Text -- Newswire (LDC2018T21)

*Data*

This release consists of 444 segments (translation units) from 17 documents. The source data is Chinese blog text collected and translated into English by LDC for the DARPA GALE (Global Autonomous Language Exploitation) program. Information about the ELDA translation team, translation guidelines, and validation results is contained in the documentation accompanying this release.

The Chinese source file contains 15,809 characters and the French reference translation contains 11,769 words. The data is presented in two Unicode-encoded XML files along with an associated DTD.

*Samples*

Please view this source sample and reference sample.

*Updates*

None at this time.

Extent: Corpus size: 1048 KB

Identifier: LDC2018T02

https://catalog.ldc.upenn.edu/LDC2018T02

ISBN: 1-58563-830-7

ISLRN: 713-266-631-883-0

DOI: 10.35111/n41t-3944

Language: Mandarin Chinese

Language: French

Language (ISO639): cmn

Language (ISO639): fra

License: TRAD Chinese-French Parallel Text – Blog Agreement (For-Profit): https://catalog.ldc.upenn.edu/license/trad-chinese-french-parallel-text-blog-agreement-for-profit.pdf

TRAD Chinese-French Parallel Text – Blog Agreement (Non-Member): https://catalog.ldc.upenn.edu/license/trad-chinese-french-parallel-text-blog-agreement-non-member.pdf

TRAD Chinese-French Parallel Text – Blog Agreement (Not-For-Profit): https://catalog.ldc.upenn.edu/license/trad-chinese-french-parallel-text-blog-agreement-not-for-profit.pdf

Medium: Distribution: Web Download

Publisher: Linguistic Data Consortium

Publisher (URI): https://www.ldc.upenn.edu

Relation (URI): https://catalog.ldc.upenn.edu/docs/LDC2018T02

Rights Holder: Portions © 2018 ELDA, © 2005-2007, 2008, 2018 Trustees of the University of Pennsylvania

Type (DCMI): Text

Type (OLAC): primary_text

OLAC Info

Archive: The LDC Corpus Catalog

Description: http://www.language-archives.org/archive/www.ldc.upenn.edu

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:www.ldc.upenn.edu:LDC2018T02

DateStamp: 2020-11-30

GetRecord: OAI-PMH request for simple DC format
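
The GetRecord entries above were links whose targets are not preserved in this record. As a rough sketch of how such requests are usually formed under OAI-PMH, the snippet below builds GetRecord URLs for the OLAC format (`olac`) and for simple Dublin Core (`oai_dc`). The base URL is a placeholder assumption, not the archive's actual endpoint, and the helper name `get_record_url` is hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: this record does not state the archive's OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2018T02"


def get_record_url(identifier: str, metadata_prefix: str, base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL for a single record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# "olac" requests the OLAC metadata format; "oai_dc" requests simple Dublin Core.
olac_url = get_record_url(OAI_IDENTIFIER, "olac")
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")
print(olac_url)
print(dc_url)
```

Replacing `BASE_URL` with the archive's real endpoint should yield URLs that can be fetched with any HTTP client; the response is an XML document wrapping the record's metadata.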

Search Info
Citation: Linguistic Data Consortium; ELDA. 2018. Linguistic Data Consortium.
Terms: area_Asia area_Europe country_CN country_FR dcmi_Text iso639_cmn iso639_fra olac_primary_text

http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2018T02
Up-to-date as of: Wed Oct 29 7:01:46 EDT 2025
