
General Linguistics
Department of Linguistics
D-50923 Cologne

Tel: +49-221-470-3357
Fax: +49-221-470-5947

Office hours:
by appointment

(please contact via e-mail)

E-Mail: seisenb1(at)uni-koeln.de

Corpora

The Eisenbeiss corpus provides data from unimpaired monolingual German children and adults and combines spontaneous speech with semi-structured elicitation games. Parts of the data have been transcribed in CHAT format and video-linked using ELAN. The corpus has been funded by the MPI society and is hosted in their archive.

  • Eisenbeiss, Sonja and Sonnenstuhl, Ingrid (2011b) A CHAT-based annotation scheme for case and noun-phrase inflection in child language data. Essex Research Reports in Linguistics 60,3.

  • Eisenbeiss, Sonja (2011c) CEGS: An elicitation tool kit for studies on case marking and its acquisition. Essex Research Reports in Linguistics 60,1.

  • Eisenbeiss, Sonja and Sonnenstuhl, Ingrid (2011d) Transcription conventions for the Eisenbeiss German child language corpora. Essex Research Reports in Linguistics 60,2.

  • Eisenbeiss, Sonja (2009) Contrast is the Name of the Game: Contrast-Based Semi-Structured Elicitation Techniques for Studies on Children’s Language Acquisition. Essex Research Reports in Linguistics (ERRL) 57.7.

The transcriptions are available for collaborative projects. The corpus consists of two sub-corpora.

Sub-Corpus 1: The L-Family Corpus

The L-Family corpus comprises more than 1000 recordings from a two-year observation of a monolingual German family with four children and two adults. Table 1 gives an overview:

Table 1. Children involved in the L-Family Corpus

Child  Gender  Age      Year of birth  Day-care   School
L1     Male    5;2–7;8  1993           1997–2000  from 2000
L2     Male    2;0–4;6  1996           from 1997  -
L3     Male    0–2;5    1999           from 2000  -
L4     Female  0–0;4    2001           -          -

The data collection started in December 1998 with irregular recordings. From June 1999 on, several recordings of varying length were made each week until June 2001. The corpus combines (1) spontaneous speech of children, parents, and guests collected during meals and free play, and (2) semi-structured elicitation games targeted at case marking and noun-phrase-internal agreement marking.

Sub-Corpus 2: The Case Elicitation Corpus

The corpus contains semi-structured elicitation data from more than 40 two- to five-year-old monolingual German children. It was collected using a combination of elicitation tasks for case and noun-phrase-internal agreement.