OLAC Record oai:www.ldc.upenn.edu:LDC2024T11 |
Metadata | ||
Title: | Abstract Meaning Representation 3.0 - Machine Translations | |
Access Rights: | Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining | |
Bibliographic Citation: | Vanroy, Bram. Abstract Meaning Representation 3.0 - Machine Translations LDC2024T11. Web Download. Philadelphia: Linguistic Data Consortium, 2024 | |
Contributor: | Vanroy, Bram | |
Date (W3CDTF): | 2024 | |
Date Issued (W3CDTF): | 2024-12-16 | |
Description: | *Introduction* Abstract Meaning Representation 3.0 - Machine Translations was developed by the Center for Computational Linguistics at KU Leuven as part of the Horizon 2020 project SignON. It consists of machine translations of a subset of the sentences in Abstract Meaning Representation (AMR) Annotation Release 3.0 (LDC2020T02) into Spanish, Irish Gaelic, and Dutch. AMR 3.0 is a semantic treebank of 59,255 English natural language sentences from broadcast conversations, newswire, weblogs, web discussion forums, fiction and web text. *Data* The source sentences were drawn from material collected by the Linguistic Data Consortium, specifically: discussion forum text from the DARPA BOLT and DARPA DEFT programs, transcripts and English translations of Mandarin Chinese broadcast news programming, Wall Street Journal text, translated Xinhua news texts, various newswire texts from NIST OpenMT evaluations, and weblog data from the DARPA GALE program. The AMR 3.0 training, development and test splits were translated into Spanish, Irish Gaelic, and Dutch using Google Translate. The "unsplit" directories were not translated and are not included in this release. The translations were not manually verified, but formatting issues (such as unexpected line breaks) were corrected, and special tokens and encoding issues were fixed with the fix_text function of the Python library ftfy. The data is presented as UTF-8 encoded text (.txt) files in PENMAN format. *Updates* None at this time. | |
Extent: | Corpus size: 54116 KB | |
Identifier: | LDC2024T11 | |
https://catalog.ldc.upenn.edu/LDC2024T11 | ||
ISLRN: 737-010-881-982-1 | ||
DOI: 10.35111/b94n-1y25 | ||
Language: | Dutch | |
Spanish | ||
Irish | ||
Language (ISO639): | nld | |
spa | ||
gle | ||
License: | LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf | |
Medium: | Distribution: Web Download | |
Publisher: | Linguistic Data Consortium | |
Publisher (URI): | https://www.ldc.upenn.edu | |
Relation (URI): | https://catalog.ldc.upenn.edu/docs/LDC2024T11 | |
Rights Holder: | Portions © 2024 KU Leuven, © 1994-1996, 2002-2010 Agence France Presse, © 2007 Al-Ahram, © 2007 Al Hayat, © 2007 Al-Quds Al-Arabi, © 2000 American Broadcasting Company, © 2007 An Nahar, © 2007 Asharq Al-Awsat, © 2007 Assabah, © 2002-2008, 2010 The Associated Press, © 2000 Cable News Network LP, LLLP, © 2003-2004, 2007-2008 Central News Agency (Taiwan), © 1997, 2004-2007 China Central TV, © 2007 China Military Online, © 2007 Chinanews.com, © 1987-1989 Dow Jones & Company, Inc., © 2007 Guangming Daily, © 1995, 2003, 2005, 2007-2008 Los Angeles Times-Washington Post News Service, Inc., © 2000 National Broadcasting Company, Inc., © 1999, 2002, 2004-2008, 2010 New York Times, © 2000 Public Radio International, © 1994-1998, 2001-2008 Xinhua News Agency, © 2020, 2024 Trustees of the University of Pennsylvania | |
Type (DCMI): | Text | |
Type (OLAC): | primary_text | |
OLAC Info |
Archive: | The LDC Corpus Catalog | |
Description: | http://www.language-archives.org/archive/www.ldc.upenn.edu | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
OaiIdentifier: | oai:www.ldc.upenn.edu:LDC2024T11 | |
DateStamp: | 2024-12-16 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Vanroy, Bram. 2024. Linguistic Data Consortium. | |
Terms: | area_Europe country_ES country_IE country_NL dcmi_Text iso639_gle iso639_nld iso639_spa olac_primary_text |
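
The Description above states that the data is distributed as UTF-8 text files in PENMAN format. As a rough illustration of how such files are commonly read, the following is a minimal sketch that uses only the Python standard library. It assumes the layout used by AMR annotation releases in general: entries separated by blank lines, comment lines starting with `#` that carry `::key value` fields such as `::id` and `::snt`, and the graph on the remaining lines. Whether the files in this release follow exactly that layout is an assumption, and the function name `parse_amr_blocks` and the sample text are invented for the example, not taken from the corpus.

```python
import re


def parse_amr_blocks(text):
    """Split AMR-style text into entries of metadata plus a PENMAN graph string."""
    entries = []
    # Assumption: entries are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        meta, graph_lines = {}, []
        for line in block.splitlines():
            if line.startswith("#"):
                # A comment line may hold several "::key value" fields.
                # The first piece is whatever precedes the first "::", so skip it.
                for field in re.split(r"(?:^|\s)::", line[1:].strip())[1:]:
                    key, _, value = field.strip().partition(" ")
                    if key:
                        meta[key] = value.strip()
            elif line.strip():
                graph_lines.append(line.rstrip())
        # Blocks without a graph (e.g. a file header comment) are ignored.
        if graph_lines:
            entries.append({"meta": meta, "graph": "\n".join(graph_lines)})
    return entries


# Invented sample text, not taken from the corpus.
SAMPLE = """# AMR release; hypothetical header line

# ::id example.1
# ::snt El chico quiere ir.
(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go-02
      :ARG0 b))

# ::id example.2 ::annotator nobody
# ::snt De jongen slaapt.
(s / sleep-01
   :ARG0 (b / boy))
"""

entries = parse_amr_blocks(SAMPLE)
for entry in entries:
    print(entry["meta"].get("id"), "|", entry["meta"].get("snt"))
```

This sketch only separates metadata from the graph text; it does not parse the graph itself. For real work, a dedicated PENMAN parser is the safer choice, since a sentence that itself contains ` ::` would be split incorrectly here.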