OLAC Record oai:catalogue.elra.info:ELRA-W0049 |
| Metadata | ||
| Title: | "Le Monde Diplomatique" Arabic tagged corpus | |
| Access Rights: | Rights available for: nonCommercialUse, commercialUse | |
| Date Available (W3CDTF): | 2009-03-31 | |
| Date Issued (W3CDTF): | 2009-03-31 | |
| Date Modified (W3CDTF): | 2009-03-31 | |
| Description: | This corpus contains 102,960 vowelised, lemmatised and tagged words (58 texts from Le Monde Diplomatique Arabic; see also ELRA-W0036-04). Each text comes with three files: (1) the raw text in Arabic, (2) the vowelised text in Arabic, and (3) one XML file containing the morphological annotation of the text. Several pieces of information are recorded for each word of a text, such as the word size, the rank of the word in the text, and the number of the paragraph in which the word occurs. Each word corresponds to one node in the XML file. Each node contains the following positional features of the word: the number of the paragraph in which the word occurs; the sentence number within the paragraph; the sentence number within the text; the rank of the word in the text; the rank of the first character of the word in the text; and the word size. The word's annotation is added as sub-nodes: the word as it appears in the non-vowelised text; the vowelised word; the word lemma; and the grammatical category of the word. | |
| Identifier: | ELRA-W0049 | |
| ISLRN: | 124-139-628-259-2 | |
| Identifier (URI): | https://catalog.elra.info/en-us/repository/browse/ELRA-W0049/ | |
| Language: | Arabic | |
| Language (ISO639): | ara | |
| Medium: | Not specified | |
| Publisher: | ELRA (European Language Resources Association) | |
| Type (DCMI): | Text | |
| Type (OLAC): | primary_text | |
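
The per-word XML layout described in the record's Description field can be illustrated with a small parsing sketch. The record does not publish the schema, so every element and attribute name below (`text`, `word`, `paragraph`, `char_offset`, `unvowelised`, and so on) is a hypothetical stand-in, and the sample node and its values are invented for illustration; the actual ELRA-W0049 files may be organised differently.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical schema: all element and attribute names are assumptions made
# for this sketch, not the names used in the actual ELRA-W0049 files.
SAMPLE = """<text>
  <word paragraph="1" sentence_in_paragraph="1" sentence_in_text="1"
        rank="1" char_offset="0" size="3">
    <unvowelised>كتب</unvowelised>
    <vowelised>كَتَبَ</vowelised>
    <lemma>كَتَبَ</lemma>
    <category>verb</category>
  </word>
</text>"""

# The six positional features listed in the description (as attributes here).
POSITIONAL = ("paragraph", "sentence_in_paragraph", "sentence_in_text",
              "rank", "char_offset", "size")
# The four annotation sub-nodes listed in the description.
ANNOTATIONS = ("unvowelised", "vowelised", "lemma", "category")


@dataclass
class Word:
    paragraph: int
    sentence_in_paragraph: int
    sentence_in_text: int
    rank: int
    char_offset: int
    size: int
    unvowelised: str
    vowelised: str
    lemma: str
    category: str


def parse_words(xml_text: str) -> list[Word]:
    """Turn each <word> node into a Word record."""
    root = ET.fromstring(xml_text)
    words = []
    for node in root.iter("word"):
        positional = {name: int(node.attrib[name]) for name in POSITIONAL}
        annotations = {name: node.findtext(name, default="")
                       for name in ANNOTATIONS}
        words.append(Word(**positional, **annotations))
    return words


words = parse_words(SAMPLE)
```

Keeping the positional features as integers and the annotations as strings makes it straightforward to filter by paragraph or sentence, or to group words by lemma or grammatical category, once the real element names are substituted.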
OLAC Info |
||
| Archive: | ELRA Catalogue of Language Resources | |
| Description: | http://www.language-archives.org/archive/catalogue.elra.info | |
| GetRecord: | OAI-PMH request for OLAC format | |
| GetRecord: | Pre-generated XML file | |
OAI Info |
||
| OaiIdentifier: | oai:catalogue.elra.info:ELRA-W0049 | |
| DateStamp: | 2009-03-31 | |
| GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
| Citation: | n.a. 2009. ELRA (European Language Resources Association). | |
| Terms: | dcmi_Text iso639_ara olac_primary_text | |