OLAC Record oai:lindat.mff.cuni.cz:11234/1-5682 |
Metadata | ||
Title: | Corpus from the Aozora Bunko Library | |
Bibliographic Citation: | http://hdl.handle.net/11234/1-5682 | |
Creator: | Rohacek, Jakub | |
Date (W3CDTF): | 2025-02-03T10:50:20Z | |
Date Available: | 2025-02-03T10:50:20Z | |
Description: | This corpus contains a subset of the texts available from the Aozora Bunko public library project, which hosts works of mostly older Japanese-language literature. A custom Python script was used to compile the corpus from the project's official GitHub repository in order to fit specific requirements. The script excluded any text not currently in the public domain and organized the output into text files of approximately equal size. Furthermore, they contain an XML structure using | |
Identifier (URI): | http://hdl.handle.net/11234/1-5682 | |
Language: | Japanese | |
Language (ISO639): | jpn | |
Publisher: | Masaryk University, NLP Centre | |
Rights: | Creative Commons - Attribution 4.0 International (CC BY 4.0) | |
http://creativecommons.org/licenses/by/4.0/ | ||
Subject: | Aozora | |
Bunko | ||
Corpus | ||
Japanese | ||
Literature | ||
Type: | corpus | |
Type (DCMI): | Text | |
Type (OLAC): | primary_text | |
OLAC Info | ||
Archive: | LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University | |
Description: | http://www.language-archives.org/archive/lindat.mff.cuni.cz | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info | ||
OaiIdentifier: | oai:lindat.mff.cuni.cz:11234/1-5682 | |
DateStamp: | 2025-02-03 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Rohacek, Jakub. 2025. Masaryk University, NLP Centre. | |
Terms: | area_Asia country_JP dcmi_Text iso639_jpn olac_primary_text |