OLAC Record
oai:paradisec.org.au:DGB1-2023corpus_dict

Metadata
Title:Corpus and dictionary files for 2023
Access Rights:Open (subject to agreeing to PDSC access conditions)
Bibliographic Citation:Danielle Barth (collector), Danielle Barth (compiler), 2023. Corpus and dictionary files for 2023. PLAIN/RTF. DGB1-2023corpus_dict at catalog.paradisec.org.au. https://dx.doi.org/10.26278/9a0q-6423
Contributor (compiler):Danielle Barth
Coverage (Box):northlimit=-4.896; southlimit=-4.919; westlimit=145.77; eastlimit=145.787
Coverage (ISO3166):PG
Date (W3CDTF):2023-03-31
Date Created (W3CDTF):2023-03-31
Description:A compiled Matukar Panau corpus of 150,740 words, including words in context, speaker metadata, file metadata and, where available, parsing, glossing and translations. A subset of this corpus is included in a separate file as a morpheme corpus, with parsing and glossing of 20,359 morphemes. Most files have been standardized for spelling. The spelling standardization script package for ELAN was developed by Jake Farrell, AI Specialist at Appen, for use by CoEDL researchers. A lexicon from ELAN in XML format is included. An annotation guideline for clause chains is also included. Annotations are in tiers with the ELAN type "chain".
Format:Digitised: no
Identifier:DGB1-2023corpus_dict
Identifier (URI):http://catalog.paradisec.org.au/repository/DGB1/2023corpus_dict
Language:Matukar
Tok Pisin
Language (ISO639):mjk
tpi
Rights:Open (subject to agreeing to PDSC access conditions)
Subject:Matukar language
Subject (ISO639):mjk
Subject (OLAC):language_documentation
Table Of Contents (URI):http://catalog.paradisec.org.au/repository/DGB1/2023corpus_dict/DGB1-2023corpus_dict-clause_chain_annos.txt
http://catalog.paradisec.org.au/repository/DGB1/2023corpus_dict/DGB1-2023corpus_dict-dir_annos.rtf
http://catalog.paradisec.org.au/repository/DGB1/2023corpus_dict/DGB1-2023corpus_dict-morphs.txt
http://catalog.paradisec.org.au/repository/DGB1/2023corpus_dict/DGB1-2023corpus_dict-words.txt

OLAC Info

Archive:  Pacific And Regional Archive for Digital Sources in Endangered Cultures (PARADISEC)
Description:  http://www.language-archives.org/archive/paradisec.org.au
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:paradisec.org.au:DGB1-2023corpus_dict
DateStamp:  2024-11-19
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: Danielle Barth (compiler); Danielle Barth (compiler). 2023. Pacific And Regional Archive for Digital Sources in Endangered Cultures (PARADISEC).
Terms: area_Pacific country_PG iso639_mjk iso639_tpi olac_language_documentation

Inferred Metadata

Country: Papua New Guinea
Area: Pacific


http://www.language-archives.org/item.php/oai:paradisec.org.au:DGB1-2023corpus_dict
Up-to-date as of: Tue Mar 4 8:50:29 EST 2025