GIDS – Goisern and Innervillgraten Audiovisual Dialect Speech Corpus


The GIDS Goisern and Innervillgraten Dialect Speech Corpus is a collection of audiovisual speech recordings for research purposes. It consists of a total of 7068 sentences spoken by eight speakers from two Austrian towns, Bad Goisern and Innervillgraten. For each speaker, about two thirds of the recorded sentences are in the speaker's respective dialect and the rest are in Standard Austrian German. The dialect of Bad Goisern in the Salzkammergut region belongs to the Central Bavarian dialect family, and the dialect of Innervillgraten in the East Tyrol region belongs to the Southern Bavarian dialect family, as illustrated below (image based on

dialect map

The following table gives the number of recorded sentences for each speaker.

Speaker  Dialect      Gender  Dialect Sentences  Standard Sentences
1        Bad Goisern  female  665                223
2        Bad Goisern  female  665                223
3        Bad Goisern  male    665                223
4        Bad Goisern  male    665                223
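
As a quick check of the figures in the table above, the following minimal Python sketch tallies the sentences per speaker and the share of dialect sentences. It covers only the four speakers listed in the table and reads each row's counts as 665 dialect and 223 standard sentences. The names ROWS and summarize are illustrative assumptions and are not part of the corpus distribution.

```python
# Rows copied from the table above:
# (speaker, dialect, gender, dialect_sentences, standard_sentences).
# Only the four speakers listed in the table are included.
ROWS = [
    (1, "Bad Goisern", "female", 665, 223),
    (2, "Bad Goisern", "female", 665, 223),
    (3, "Bad Goisern", "male", 665, 223),
    (4, "Bad Goisern", "male", 665, 223),
]


def summarize(rows):
    """Return per-speaker sentence totals and dialect shares."""
    summary = {}
    for speaker, _dialect, _gender, n_dialect, n_standard in rows:
        total = n_dialect + n_standard
        summary[speaker] = {
            "total": total,
            "dialect_share": n_dialect / total,
        }
    return summary


if __name__ == "__main__":
    stats = summarize(ROWS)
    for speaker, entry in stats.items():
        print(
            f"speaker {speaker}: {entry['total']} sentences, "
            f"{entry['dialect_share']:.1%} dialect"
        )
    combined = sum(entry["total"] for entry in stats.values())
    print(f"listed speakers combined: {combined} sentences")
```

The same function can be applied to a longer row list if the counts for the remaining speakers are available.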

The recordings consist of optical 3D facial motion tracking data captured with a NaturalPoint OptiTrack Expression system, greyscale video recorded by the same system, and studio-quality audio.

Related Publications

There is no dedicated publication for this corpus. However, it is mentioned in the PhD dissertation of Dietmar Schabus (PDF, bibtex) and in this article:

Dietmar Schabus, Michael Pucher, and Gregor Hofer. "Joint Audiovisual Hidden Semi-Markov Model-based Speech Synthesis". In IEEE Journal of Selected Topics in Signal Processing, Vol. 8, No. 2, pp. 336-347, April 2014.

This article is open access. Please cite it if you use the corpus for your research.

There are two related corpora, the MMASCS corpus and the FMSC corpus, each of which is described in a corresponding LREC conference paper (paper 1, paper 2). Some of the information in these papers is also relevant to the GIDS corpus.


For each of the recorded utterances, the corpus contains:

Obtaining the Data

The data can be obtained free of charge for scientific research purposes (see the license). Contact Michael Pucher if you would like to obtain a copy.


The GIDS corpus was created in the research project Adaptive Audio-Visual Dialect Speech Synthesis (AVDS), funded by the Austrian Science Fund (FWF) under the project number P22890-N23. This project was coordinated by the Telecommunications Research Center Vienna (FTW). The Competence Center "FTW Forschungszentrum Telekommunikation Wien GmbH" is funded within the program COMET Competence Centers for Excellent Technologies by BMVIT, BMWFW and the City of Vienna. The COMET program is managed by the FFG.

The recordings were carried out on the premises of the Acoustics Research Institute of the Austrian Academy of Sciences.


Michael Pucher