Dr. Sonja Eisenbeiß

Allgemeine Sprachwissenschaft
Institut für Linguistik
D-50923 Köln

Tel: +49-221-470-3357
Fax: +49-221-470-5947

Office hours:
by appointment

(Please arrange by e-mail)

Corpus Data

The Eisenbeiss corpus provides data from unimpaired monolingual German children and adults and combines spontaneous speech with semi-structured elicitation games. Parts of the data have been transcribed in CHAT format and video-linked using ELAN. The corpus has been funded by the MPI society and is hosted in their archive.

Eisenbeiss, Sonja and Sonnenstuhl, Ingrid (2011b) A CHAT-based annotation scheme for case and noun-phrase inflection in child language data. Essex Research Reports in Linguistics 60.3.
Eisenbeiss, Sonja (2011c) CEGS: An elicitation tool kit for studies on case marking and its acquisition. Essex Research Reports in Linguistics 60.1.
Eisenbeiss, Sonja and Sonnenstuhl, Ingrid (2011d) Transcription conventions for the Eisenbeiss German child language corpora. Essex Research Reports in Linguistics 60.2.
Eisenbeiss, Sonja (2009) Contrast is the Name of the Game: Contrast-Based Semi-Structured Elicitation Techniques for Studies on Children’s Language Acquisition. Essex Research Reports in Linguistics 57.7.

The transcriptions are available for collaborative projects. The corpus consists of two sub-corpora.

Sub-Corpus 1: The L-Family Corpus

The L-Family corpus comprises more than 1,000 recordings from a two-year observation of a monolingual German family with four children and two adults. Table 1 gives an overview:

Table 1. Children involved in the L-Family Corpus

Child	Gender	Age	Year of birth	Day-care	School
L1	Male	5;2–7;8	1993	1997–2000	from 2000
L2	Male	2;0–4;6	1996	from 1997	—
L3	Male	0–2;5	1999	from 2000	—
L4	Female	0–0;4	2001	—	—

The data collection started in December 1998 with irregular recordings. From June 1999 on, several recordings of varying length were made each week until June 2001. The corpus combines (1) spontaneous speech of children, parents, and guests collected during meals and free play, and (2) semi-structured elicitation games targeted at case marking and noun-phrase-internal agreement marking.

Sub-Corpus 2: The Case Elicitation Corpus

The corpus contains semi-structured elicitation data from more than 40 two- to five-year-old monolingual German children. It was collected using a combination of elicitation tasks for case and noun-phrase-internal agreement.

UNIVERSITÄT ZU KÖLN

Philosophische Fakultät, Allgemeine Sprachwissenschaft (Institut für Linguistik)