MMASCS – Multi-Modal Annotated Synchronous Corpus of Speech

Intro

MMASCS, the multi-modal annotated synchronous corpus of audio, video, facial motion and tongue motion data of normal, fast and slow speech, is a collection of speech recordings for research purposes. It comprises 770 sentences spoken by a male Austrian German speaker at slow, normal and fast speaking rates, recorded in the following modalities:

The following video shows an example sentence:

Related Publications

A detailed description of the MMASCS corpus was presented at the 9th edition of the Language Resources and Evaluation Conference (LREC) in May 2014:

Dietmar Schabus, Michael Pucher, and Phil Hoole. "The MMASCS multi-modal annotated synchronous corpus of audio, video, facial motion and tongue motion data of normal, fast and slow speech". In Proceedings of the 9th Language Resources and Evaluation Conference (LREC), pp. 3411–3416, Reykjavik, Iceland, May 2014.

The paper can be downloaded from the LREC website; a preprint version and BibTeX information are also available. Please cite this paper if you use the corpus in your research.

Contents

For each of the recorded utterances (320 at normal, 320 at fast and 130 at slow speaking rate), the corpus contains:

Furthermore, the package contains:

Obtaining the Data

The data can be obtained free of charge for scientific research purposes (see the license). Contact Michael Pucher (michael.pucher@oeaw.ac.at) if you would like to obtain a copy.

Acknowledgements

The MMASCS corpus was created in the research project Adaptive Audio-Visual Dialect Speech Synthesis (AVDS), funded by the Austrian Science Fund (FWF) under project number P22890-N23. The project was coordinated by the Telecommunications Research Center Vienna (FTW). The Competence Center "FTW Forschungszentrum Telekommunikation Wien GmbH" is funded within the COMET (Competence Centers for Excellent Technologies) program by BMVIT, BMWFW and the City of Vienna. The COMET program is managed by the FFG.

The recordings were carried out on the premises of the Institute of Phonetics and Speech Processing of the Ludwig-Maximilians Universität Munich.

Contact

Michael Pucher (michael.pucher@tugraz.at)