Accessing the corpus
The annotated resources are distributed HERE in two formats: a TEI-compliant XML format (.xml files) and an Analec format (.ec files).
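As an illustration of working with the TEI-compliant .xml files, the following is a minimal Python sketch that lists the utterances in one file. It assumes that the files declare the standard TEI P5 namespace and encode speech turns as `<u>` elements with a `who` attribute, as the TEI guidelines for transcribed speech recommend; neither point is confirmed here for the Modal Corpus files, and the function name `read_utterances` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard TEI P5 namespace. That the corpus files declare it is an assumption.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def read_utterances(source):
    """Return (speaker, text) pairs for every <u> element in a TEI file.

    `source` is a file path or a file-like object. The <u>/@who encoding is
    assumed from the TEI guidelines, not taken from the corpus documentation.
    """
    root = ET.parse(source).getroot()
    pairs = []
    for u in root.iterfind(".//tei:u", TEI_NS):
        # Join the text of nested elements and normalise whitespace.
        text = " ".join("".join(u.itertext()).split())
        pairs.append((u.get("who"), text))
    return pairs
```

Calling `read_utterances` on the path of a downloaded .xml file would return one pair per speech turn, provided the assumptions above hold for that file.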
Description of the corpus
The excerpts, annotated for epistemic modality, were selected from the Santa Barbara Corpus of Spoken American English (for English), the ESLO Corpus, the OTG Corpus and the AccueilUBS Corpus (for French), and the VoLip Corpus (for Italian). The aim was to obtain three resources balanced in terms of size (about 20,000 words per language) and communication context (public vs. private vs. broadcasting vs. family context; young and adult participants; free and directed turn-taking; two or more than two participants).
Public context, directed turn-taking, adult participants
Italian: MODAL-IT01, LIPRC10, 2149 words
English: MODAL-EN01, SBC039
French: MODAL-FR01, ESLO2SOUTENANCE, 2604 words

Public context, free turn-taking, adult participants
Italian: MODAL-IT02, LIPRA7, 2770 words
English: MODAL-EN02, SBC008
French: MODAL-FR02a1, ESLO24H_apresmiditravail; MODAL-FR02a2, ESLO24H_apresmiditravail; MODAL-FR02b, OTG1AP0316; MODAL-FR02c, UBS028

Broadcasting, free turn-taking, adult participants
Italian: MODAL-IT03, LIPRE11, 4019 words
English: MODAL-EN03, SBC053
French: MODAL-FR03, ESLO2MEDIA13

Private context, free turn-taking, young participants
Italian: MODAL-IT04, LIPRA1, 3329 words
English: MODAL-EN04, SBC007
French: MODAL-FR04, ESLO224DEBUTJOURNEE

Family context, free turn-taking, young and adult participants
Italian: MODAL-IT05, LIPFA1, 2000 words
English: MODAL-EN05, SBC019
French: MODAL-FR05, ESLO2REP0102

Private context, free turn-taking, adult participants
Italian: MODAL-IT06, LIPRA3, 5398 words
English: MODAL-EN06, SBC002
French: MODAL-FR06, ESLO2REP22
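
The stated size of about 20,000 words per language can be checked against the word counts listed above for the six Italian excerpts. The short Python sketch below only sums those figures; the variable names are illustrative.

```python
# Word counts of the six Italian excerpts, copied from the list above.
italian_words = {
    "MODAL-IT01": 2149,
    "MODAL-IT02": 2770,
    "MODAL-IT03": 4019,
    "MODAL-IT04": 3329,
    "MODAL-IT05": 2000,
    "MODAL-IT06": 5398,
}

total = sum(italian_words.values())
print(total)  # 19665
```

The Italian resource therefore comes to 19,665 words, in line with the target of about 20,000.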
The Modal Corpus is published under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International licence.
Please follow these instructions when using data drawn from the Modal Corpus.