OLAC Record: Beldeko Summary Corpus v1.0.0

OLAC Record
oai:clarin.eurac.edu:20.500.12124/15

Metadata

Title: Beldeko Summary Corpus v1.0.0

Bibliographic Citation: http://hdl.handle.net/20.500.12124/15

Creator: Strobl, Carola

Date (W3CDTF): 2020-02-17T15:45:45Z

Date Available: 2020-02-17T15:45:45Z

Description: Beldeko Summary Corpus v1.0.0. The Beldeko (Belgisches Deutschkorpus) Summary Corpus is a learner corpus of summaries written by advanced L2 German learners (CEF level B2-C1) with L1 Dutch. It was created to investigate the academic writing skills in L2 German of third-year students in two bachelor programmes, Applied Linguistics and Linguistics and Literature. The corpus consists of 301 summaries (70774 tokens) written by 115 students from three intact classes (convenience sampling).
The texts were collected at Ghent University (in 2013 and 2014) and University College of Ghent (in 2013) as pre- and posttests of an intervention study on collaborative writing carried out by Carola Strobl as part of her PhD research (Strobl, C. (2015). Affordances of online technologies for academic writing instruction in a foreign language. Ghent University, unpublished doctoral dissertation). 82 students produced three summaries each (pretest, posttest immediately after the three-week intervention, delayed posttest six weeks after the intervention) and 33 students produced two summaries each (pretest and posttest). In both groups, missing data are indicated as n.a. in the metadata file.
The metadata file (Beldeko_Summary_1.0.0_metadata.xlsx) provides information about:
• Institution of data collection (HG = University College of Ghent, UG = Ghent University)
• Year of data collection (2013, 2014)
• Participants' gender (f, m)
• Number of texts written and number of tokens in each text (T1, T2, T3)
The individual file names of the corpus encode, in this order: institution, year, unique ID of the participant (per institution per year), and text number.
The summaries contain between 37 and 330 words each, with a mean of 230 words (the targeted word count was 220 to 250 words). Outliers in text length are unfinished texts produced by students who struggled with the time limit of 60 minutes. The texts were written in class, on computers. Students were allowed to use online aids such as dictionaries.
Each task consisted of summarizing two texts (fragments of newspaper articles, interviews, or websites) about a topic related to language variation in German (Kiezdeutsch, Mundartdebatte in der Schweiz, Viadrinisch, Varianten-Wörterbuch des Deutschen; see also the Word files provided with the metadata). The topics were distributed as follows:
• Kiezdeutsch: HG_2013_T1, UG_2013_T1, UG_2014_T1
• Mundartdebatte in der Schweiz: HG_2013_T2, UG_2013_T2, UG_2014_T2
• Viadrinisch: HG_2013_T3
• Varianten-Wörterbuch des Deutschen: UG_2014_T3
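The file-naming scheme and topic distribution in the description can be sketched in Python. This is only an illustration: the record gives the order of the fields (institution, year, participant ID, text number) but not the exact separator, ID format, or file extension, so the underscore-separated pattern, the sample name "UG_2013_012_T1.txt", and the names `SummaryFile`, `parse_name`, and `topic_for` are all assumptions.

```python
import re
from typing import NamedTuple, Optional

# Topic per (institution, year, text number), copied from the description.
# Combinations the description does not list are deliberately left out.
TOPICS = {
    ("HG", 2013, 1): "Kiezdeutsch",
    ("UG", 2013, 1): "Kiezdeutsch",
    ("UG", 2014, 1): "Kiezdeutsch",
    ("HG", 2013, 2): "Mundartdebatte in der Schweiz",
    ("UG", 2013, 2): "Mundartdebatte in der Schweiz",
    ("UG", 2014, 2): "Mundartdebatte in der Schweiz",
    ("HG", 2013, 3): "Viadrinisch",
    ("UG", 2014, 3): "Varianten-Wörterbuch des Deutschen",
}


class SummaryFile(NamedTuple):
    institution: str     # "HG" or "UG"
    year: int            # 2013 or 2014
    participant_id: str  # unique per institution per year
    text_number: int     # 1, 2 or 3 (T1, T2, T3)


# ASSUMPTION: fields are joined by underscores and the ID is alphanumeric,
# e.g. "UG_2013_012_T1.txt". The real corpus files may be named differently.
NAME_PATTERN = re.compile(r"^(HG|UG)_(2013|2014)_([A-Za-z0-9]+)_T([123])(?:\.\w+)?$")


def parse_name(name: str) -> Optional[SummaryFile]:
    """Split a file name into its four fields, or return None if it does not fit."""
    match = NAME_PATTERN.match(name)
    if match is None:
        return None
    institution, year, participant_id, text_number = match.groups()
    return SummaryFile(institution, int(year), participant_id, int(text_number))


def topic_for(summary: SummaryFile) -> Optional[str]:
    """Look up the summary topic; None if the description lists no topic."""
    return TOPICS.get((summary.institution, summary.year, summary.text_number))
```

Under these assumptions, `parse_name("UG_2013_012_T1.txt")` yields the fields `UG`, `2013`, `012`, `1`, and `topic_for` maps that result to Kiezdeutsch.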

Identifier (URI): http://hdl.handle.net/20.500.12124/15

Is Replaced By (URI): http://hdl.handle.net/20.500.12124/68

Language: German

Language (ISO639): deu

Publisher: Ghent University

Rights: Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)

https://creativecommons.org/licenses/by-nc-nd/4.0/

Subject: academic writing

L1 Dutch

L2 German

learner corpus

Type: corpus

Type (DCMI): Text

Type (OLAC): primary_text

OLAC Info

Archive: Eurac Research CLARIN Centre

Description: http://www.language-archives.org/archive/clarin.eurac.edu

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:clarin.eurac.edu:20.500.12124/15

DateStamp: 2023-10-27

GetRecord: OAI-PMH request for simple DC format
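
The two GetRecord links differ only in the requested metadata format. As a rough sketch, an OAI-PMH GetRecord request takes the arguments `verb`, `identifier`, and `metadataPrefix`; `oai_dc` is the standard prefix for simple Dublin Core, and `olac` is assumed here as the prefix for the OLAC format. The base URL below is a placeholder, since this record does not state the repository's OAI-PMH endpoint, and `get_record_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# ASSUMPTION: placeholder endpoint. Replace with the repository's real
# OAI-PMH base URL, which this record does not give.
BASE_URL = "https://example.org/oai/request"

# Taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:clarin.eurac.edu:20.500.12124/15"


def get_record_url(metadata_prefix: str,
                   base_url: str = BASE_URL,
                   identifier: str = OAI_IDENTIFIER) -> str:
    """Build a GetRecord request URL for one record in one metadata format."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


olac_url = get_record_url("olac")   # OLAC format (prefix assumed)
dc_url = get_record_url("oai_dc")   # simple Dublin Core
```

`urlencode` percent-encodes the colons and the slash in the identifier, so the value can be passed as-is.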

Search Info
Citation: Strobl, Carola. 2020. Beldeko Summary Corpus v1.0.0. Ghent University.
Terms: area_Europe country_DE dcmi_Text iso639_deu olac_primary_text

http://www.language-archives.org/item.php/oai:clarin.eurac.edu:20.500.12124/15
Up-to-date as of: Fri Oct 17 1:18:46 EDT 2025
