OLAC Record oai:lindat.mff.cuni.cz:11234/1-1454 |
Metadata | ||
Title: | EnTam: An English-Tamil Parallel Corpus (EnTam v2.0) | |
Bibliographic Citation: | http://hdl.handle.net/11234/1-1454 | |
Creator: | Ramasamy, Loganathan | |
Bojar, Ondřej | ||
Žabokrtský, Zdeněk | ||
Date (W3CDTF): | 2014-10-31T23:07:27Z | |
Date Available: | 2014-10-31T23:07:27Z | |
Description: | EnTam is a sentence-aligned English-Tamil bilingual corpus that we collected from publicly available websites for NLP research involving Tamil. A standard set of processing steps was applied to the raw web data before it was released as a sentence-aligned English-Tamil parallel corpus suitable for various NLP tasks. The parallel corpus includes texts from the Bible, cinema, and news domains. | |
Identifier (URI): | http://hdl.handle.net/11234/1-1454 | |
Language: | English | |
Tamil | ||
Language (ISO639): | eng | |
tam | ||
Publisher: | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) | |
Rights: | Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) | |
http://creativecommons.org/licenses/by-nc-sa/3.0/ | ||
Subject: | parallel corpus | |
Type: | corpus | |
Type (DCMI): | Text | |
Type (OLAC): | primary_text | |
OLAC Info | ||
Archive: | LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University | |
Description: | http://www.language-archives.org/archive/lindat.mff.cuni.cz | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info | ||
OaiIdentifier: | oai:lindat.mff.cuni.cz:11234/1-1454 | |
DateStamp: | 2021-06-29 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Ramasamy, Loganathan; Bojar, Ondřej; Žabokrtský, Zdeněk. 2014. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL). | |
Terms: | area_Asia area_Europe country_GB country_IN dcmi_Text iso639_eng iso639_tam olac_primary_text |
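
The GetRecord entries above refer to OAI-PMH requests for this record in OLAC and simple DC formats. As a minimal sketch, such request URLs could be built as shown below. The base URL is a hypothetical placeholder, since this record does not state the repository's OAI-PMH endpoint, and the metadata prefixes `olac` and `oai_dc` are assumed to be the ones the repository exposes.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the record above does not give the OAI-PMH base URL.
BASE_URL = "https://example.org/oai/request"

# Identifier taken from the OaiIdentifier field of the record.
OAI_ID = "oai:lindat.mff.cuni.cz:11234/1-1454"


def get_record_url(identifier: str, metadata_prefix: str, base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL for one record and one format."""
    query = urlencode({
        "verb": "GetRecord",                # OAI-PMH verb for fetching a single record
        "identifier": identifier,           # OAI identifier of the record
        "metadataPrefix": metadata_prefix,  # requested metadata format
    })
    return f"{base_url}?{query}"


# Assumed prefixes: "olac" for the OLAC format, "oai_dc" for simple Dublin Core.
olac_url = get_record_url(OAI_ID, "olac")
dc_url = get_record_url(OAI_ID, "oai_dc")
print(olac_url)
print(dc_url)
```

The identifier is percent-encoded by `urlencode`, so the colons and the slash in it are safe to pass as a query parameter.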