OLAC Record oai:catalogue.elra.info:ELRA-W0120 |
| Metadata | ||
| Title: | NUM 5M Mongolian written corpus | |
| Access Rights: | Rights available for: nonCommercialUse, commercialUse | |
| Date Available (W3CDTF): | 2017-07-12 | |
| Date Issued (W3CDTF): | 2017-07-12 | |
| Date Modified (W3CDTF): | 2017-08-17 | |
| Description: | This is a corpus of Mongolian text, mostly from domains such as online or printed daily newspapers, literature, and laws. The collected raw text was reduced from 5 to 4.8 million words after cleaning. The cleaned corpus comprises: 144 texts from laws up to 2009; 288 texts from literature currently used in primary and secondary school textbooks in Mongolia (including stories, novels, and novelettes); 1,134 editorials from the printed newspaper "Unen" dating from 1984 to 1989; and 2,477 online newswire texts dating from 2003 to 2009. Part of this corpus, about 2,800 sentences with 100,000 words, has been POS-tagged manually and stored in XML TEI format. | |
| Identifier: | ELRA-W0120 | |
| ISLRN: | 492-817-146-504-9 | |
| Identifier (URI): | https://catalog.elra.info/en-us/repository/browse/ELRA-W0120/ | |
| Language: | Mongolian | |
| Language (ISO639): | mon | |
| Medium: | Not specified | |
| Publisher: | ELRA (European Language Resources Association) | |
| Type (DCMI): | Text | |
| Type (OLAC): | primary_text | |
OLAC Info |
||
| Archive: | ELRA Catalogue of Language Resources | |
| Description: | http://www.language-archives.org/archive/catalogue.elra.info | |
| GetRecord: | OAI-PMH request for OLAC format | |
| GetRecord: | Pre-generated XML file | |
OAI Info |
||
| OaiIdentifier: | oai:catalogue.elra.info:ELRA-W0120 | |
| DateStamp: | 2017-07-12 | |
| GetRecord: | OAI-PMH request for simple DC format | |
Search Info |
||
| Citation: | n.a. 2017. ELRA (European Language Resources Association). | |
| Terms: | dcmi_Text iso639_mon olac_primary_text | |