Michael Pucher |
|
Mag.phil. Dipl.-Ing. Dr.techn. Michael Pucher (michael dot pucher at tugraz dot at) |
|
Head of Technology Design, Recognosco, Donau-City-Straße 1, Vienna A-1220, Austria. | |
Senior Researcher, Austrian Research Institute for Artificial Intelligence (OFAI), Freyung 6/6, 1010 Vienna, Austria. | |
I obtained my doctoral degree (Dr.techn.) in Electrical and Information Engineering from Graz University of Technology in 2007. In 2017 I received the venia docendi in Speech Communication at Graz University of Technology with a habilitation thesis on Speech Processing for Multimodal and Adaptive Systems. I hold a master's degree (Dipl.-Ing.) in Computer Science from Vienna University of Technology (TUW) and a diploma degree (Mag.phil.) in philosophy from the University of Vienna. My research interests are acoustic modeling for speech recognition, semantic language modeling, speech synthesis for language varieties, persona design for speech-based systems, multimodal and spoken dialog systems, audio-visual speech synthesis, synthesis of singing, synthesis of animal vocalisations, digital phonetics, and sociophonetics. I have also made significant contributions in the area of speaker verification spoofing, where we showed how adaptive synthesizers can spoof a speaker verification system. From 2007 to 2015 I was Senior Researcher at the Telecommunications Research Center Vienna (FTW). From 2016 to 2022 I was Senior Research Scientist at the Acoustics Research Institute (ARI) of the Austrian Academy of Sciences (ÖAW). Since 2022 I have been Senior Researcher at the Austrian Research Institute for Artificial Intelligence (OFAI) and Senior Speech Technologist at Recognosco, Vienna, Austria. Since 2024 I have been Head of Technology Design at Recognosco, Vienna, Austria. | |
Projects |
|
Ongoing Research Projects |
|
[FWF - TCS154] Dialect classification by human and artificial intelligence (as principal investigator) 2024 - 2025 | |
Completed Research Projects |
|
[FWF - I2539/I4655] Vowel and consonant quantity in Southern German varieties (as principal investigator) 2015 - 2024 | |
[FWF - F6002] SFB Deutsch in Österreich. Variation – Kontakt – Perzeption (as national cooperation partner) 2016 - 2022 | |
[City of Vienna - ÖAW] The voice of Viennese women - An experimental-phonetic longitudinal study on the voice quality of young Viennese women (as principal investigator) 2019 | |
[NII - National Institute of Informatics, Japan] OPERA - Acoustic analysis and statistical modelling of Vienna opera singers (as external collaborator) 2013 - 2015 | |
[FWF: P23821-N23] AMTV - Acoustic modeling and transformation of varieties for speech synthesis (as principal investigator) 2012 - 2016 | |
[BMWF - Sparkling Science] SALB - Speech synthesis of auditory lecture books for blind children (as principal investigator) 2013 - 2015 | |
[FWF: P22890-N23] AVDS - Adaptive Audio-Visual Dialect Synthesis (as principal investigator) 2011 - 2014 | |
[EU-NET] EUCOG III: European Network for the Advancement of Artificial Cognitive Systems, Interaction and Robotics (as member) | |
[EU-COST] Cost 2102: Cross-Modal Analysis of Verbal and Non-verbal Communication (as representative) | |
[WWTF] VSDS - Viennese Sociolect and Dialect Synthesis (as principal investigator) | |
[COMET] HI-MONI - Highway Monitoring (as project manager) | |
[T-LABS] TIDE - Testbed for Interactive Dialog System Evaluation (as project manager) | |
[EU-FP6] AMI - Augmented Multiparty Interaction | |
[K-PLUS] MONA - Mobile Multimodal Next Generation Applications | |
[K-PLUS] Service Platform and Interoperability | |
[K-PLUS] Speech and More | |
Art Projects |
|
May 2022: Speech Synthesis for Burgtheater Wien, Keine Menschenseele | |
Development Projects |
|
See also GitHub | |
June 2023: DNN based synthetic voices in Austrian German and Austrian sociolects/dialects with dialect shifting. | |
May 2022: DNN based synthetic voices in Austrian German and Viennese sociolects/dialects. | |
November 2019: Synthetic voices in Austrian German and Viennese sociolects/dialects. | |
August 2018: Release of hts-engine-world, an integration of the WORLD vocoder and the hts_engine API-1.10. | |
December 2014: Bad Goisern and Innervillgraten Audio-Visual Dialect Speech Corpus (GIDS). | |
December 2014: Release of SALB - a frontend for speech synthesis using HTS voice models. | |
May 2014: Release of Multi-Modal Annotated Synchronous Corpus of Speech (MMASCS). | |
October 2013: Release of Austrian German open source HTS voice. | |
September 2010: "Leopold" available for Windows and Mac OS X from the web shop of CereProc, UK. | |
May 2010: Development of "Leopold", the first synthetic voice for Austrian German, together with company partners; the voice was integrated into a web reading service for the website of the City of Vienna. | |
Publications |
|
See also Google Scholar citations | |
Invited talks |
|
Michael Pucher, Lorenz Gutscher, invited talk on Acoustic language embeddings and phonetic typology of Austrian German varieties, 1. February 2024, Workshop on vowel and consonant quantity in Germanic, Indo-European and beyond at the University of Zürich. | |
Michael Pucher, invited lecture on Synthesizing dialects, faces, singing voices, songbirds, and famous dead actors, 1. March 2023, ÖFAI Lecture Series, Vienna, Austria (Video). | |
Michael Pucher, invited lecture on Synthesizing faces, dialects, and singing voices, 27. June 2017, Faculty day, Faculty of Electrical and Information Engineering, Graz, Austria. | |
Michael Pucher, invited talk on Interpolation of language varieties in HMM-based speech synthesis, 23. May 2014, Workshop at National Institute of Informatics, Tokyo, Japan. | |
Michael Pucher, keynote talk on Acoustic modeling, interpolation, and transformation of language varieties for speech synthesis, 9.-11. April 2014, International Dagstuhl Workshop on Multilinguality in Speech Research: Data, Methods and Models, Dagstuhl, Germany. | |
Journal articles |
|
2022, Michael Pucher, Katharina Kranawetter, Eva Reinisch, Wolfgang Koppensteiner, Alexandra Lenz, Perceptual effects of interpolated Austrian and German Standard varieties. Speech Communication, April 2022. | |
2020, Michael Pucher, Nicola Klingler, Jan Luttenberger, Lorenzo Spreafico, Accuracy, recording interference, and articulatory quality of headsets for ultrasound recordings. Speech Communication, Volume 123, October 2020, pp. 83-97. | |
2020, Carina Lozo, Michael Pucher, Revisiting nonstandard variety TTS and its evaluation in Austria. The Phonetician, Number 117, pp. 34-44, 2020. | |
2019, Michael Pucher, Sylvia Moosmüller, Michaela Rausch-Supola, Aufnahme von authentischen Dialektdaten für die Verwendung in der Sprachsynthese. In S. Kürschner, M. Habermann, P. O. Müller (eds.), Methodik moderner Dialektforschung. Erhebung, Aufbereitung und Auswertung von Daten am Beispiel des Oberdeutschen. (pp. 105-123). Hildesheim: Olms. Germanistische Linguistik, 241-243/2019. | |
2017, Michael Pucher, Bettina Zillinger, Markus Toman, Dietmar Schabus, Cassia Valentini-Botinhao, Junichi Yamagishi, Erich Schmid, Thomas Woltron, Influence of speaker familiarity on blind and visually impaired children's and young adults' perception of synthetic voices. Computer Speech & Language, Volume 46, November 2017, pp. 179-195. | |
2015, Cassia Valentini-Botinhao, Markus Toman, Michael Pucher, Dietmar Schabus, Junichi Yamagishi, Intelligibility of time-compressed synthetic speech: compression method and speaking style. Speech Communication, Volume 74, pp. 52-64, November 2015. | |
2015, Markus Toman, Michael Pucher, Sylvia Moosmüller, Dietmar Schabus, Unsupervised and phonologically controlled interpolation of Austrian German language varieties for speech synthesis. Speech Communication, Volume 72, pp. 176-193, September 2015 (Samples). | |
2014, Dietmar Schabus, Michael Pucher, Gregor Hofer, Joint Audiovisual Hidden Semi-Markov Model-based Speech Synthesis. IEEE Journal of Selected Topics in Signal Processing. Vol. 8, No. 2, pp. 336-347, April 2014 (Samples). | |
2012, Phillip L. De Leon, Michael Pucher, Junichi Yamagishi, Inma Hernaez, Ibon Saratxaga, Evaluation of Speaker Verification Security and Detection of HMM-Based Synthetic Speech. IEEE Transactions on Audio, Speech, and Language Processing, Volume 20, Issue 8, October 2012, Pages 2280-2290. | |
2010, Michael Pucher, Dietmar Schabus, Junichi Yamagishi, Friedrich Neubarth, Volker Strom, Modeling and interpolation of Austrian German and Viennese dialect in HMM-based speech synthesis. Speech Communication, Volume 52, Issue 2, February 2010, Pages 164-179. | |
2002, Georg Niklfeld, Michael Pucher, Robert Finan, Wolfgang Eckhart, Kombinierte Sprache/Display-Schnittstellen für mobile Datendienste. PIK - Praxis der Informationsverarbeitung und Kommunikation, 25 (4), pages 196-201. | |
Book chapters |
|
2022, Michael Pucher, Sylvia Moosmüller, Phonetic analysis of dialect/standard transitions synthesized by model-based interpolation. In Akustische Phonetik und ihre multidisziplinären Aspekte - Ein Gedenkband für Sylvia Moosmüller, Verlag ÖAW. | |
2008, Sebastian Möller, Klaus-Peter Engelbrecht, Michael Pucher, Peter Fröhlich, Lu Huo, Ulrich Heute, Frank Oberle, A New Testbed for Semi-automatic Usability Evaluation and Optimization of Spoken Dialogue Systems. In Usability of Speech Dialog Systems - Listening to the Target Audience (T. Hempel, ed.), pages 81-103, Springer, Berlin, Germany. | |
2005, Georg Niklfeld, Michael Pucher, Robert Finan, Wolfgang Eckhart, Wolfgang Minker, A Path to Multimodal Data Services for Telecommunications. In Spoken Multimodal Human-Computer Dialogue in Mobile Environments, pages 149-167, Springer, Netherlands. | |
Edited volumes |
|
2022, Michael Pucher, Peter Balazs (Hrsg.), Akustische Phonetik und ihre multidisziplinären Aspekte - Ein Gedenkband für Sylvia Moosmüller, Verlag ÖAW. | |
2020, Nicola Klingler, Michael Pucher (Eds.), The Phonetician - Journal of the International Society of Phonetic Sciences, Number 117. | |
2019, Michael Pucher (Chair), Proceedings of the 10th ISCA Speech Synthesis Workshop, 20-22 September 2019, Vienna, Austria, ISSN: 2312-2846. | |
2019, Michael Pucher, Jürgen Trouvain, Carina Lozo (Eds.), Proceedings of the Third International Workshop on the History of Speech Communication Research, TUDpress, Studientexte zur Sprachkommunikation, Band 94. | |
2015, Michael Pucher, Gernot Kubin, Darren Cosker, Chris Davis, Slim Ouni, William Smith, Eva Krumhuber (Organizers), Proceedings of FAAVSP - The 1st Joint Conference on Facial Analysis, Animation, and Auditory-Visual Speech Processing, Vienna, Austria, September 11-13, 2015, ISCA Archive. | |
2012, Michael Pucher, Darren Cosker, Gregor Hofer, Michael Berger, William Smith (Organizers), Proceedings of The 3rd International Symposium on Facial Analysis and Animation, Vienna, Austria, September 21, 2012, ACM Proceedings. | |
Conference and workshop papers |
|
2024 |
|
2024, Lorenz Gutscher, Michael Pucher, Exploring Phonetic Features in Language Embeddings for Unseen Language Varieties of Austrian German. KONVENS (Conference on Natural Language Processing), pp. 317-325, Vienna, Austria. | |
2023 |
|
2023, Lorenz Gutscher, Michael Pucher, Víctor Garcia, Neural Speech Synthesis for Austrian Dialects with Standard German Grapheme-to-Phoneme Conversion and Dialect Embeddings. 2nd Special Interest Group on Under-resourced Languages Annual Meeting 2023, Dublin, Ireland. | |
2023, Lisa Kerle, Michael Pucher, Barbara Schuppler, Speaker Interpolation based Data Augmentation for Automatic Speech Recognition. 20th International Congress of Phonetic Sciences (ICPhS), Prague, Czech Republic. | |
2022 |
|
2022, Michael Pucher, On the sociolects of robots. HRI 2022 Workshop - Robo-Identity: Exploring Artificial Identity and Emotion via Speech Interactions, Sapporo, Japan (virtual). | |
2022, Johanna Cronenberg, Nicola Klingler, Felicitas Kleber, Michael Pucher, On the role of asymmetry in prosodic change of consonant duration: Results from an agent-based model with two German varieties. Speech Prosody, Lisbon, Portugal. | |
2022, Lorenz Gutscher, Nicola Klingler, Michael Pucher, Einfluss von Entrauschungsverfahren auf die automatische Segmentierung mit WebMAUS. 33rd conference on electronic speech signal processing (ESSV), Sonderborg, Denmark. | |
2022, Lorenz Gutscher, Michael Pucher, Improving the quality of synthesized speech of a Viennese dialect speaker through speaker adaptation. 33rd conference on electronic speech signal processing (ESSV), Sonderborg, Denmark. | |
2021 |
|
2021, Michael Pucher, Thomas Woltron, Conversion of airborne to bone-conducted speech with deep neural networks. Interspeech 2021, Brno, Czech Republic, pp. 1-5. | |
2021, Nicola Klingler, Lorenz Gutscher, Michael Pucher, Einfluß der Entrauschung auf die automatische Segmentierung von historischen phonetischen Korpora mit MAUS. Phonetik und Phonologie im deutschsprachigen Raum (P&P) 2021, Frankfurt, Germany. | |
2021, Jan Luttenberger, Nina Weihs, Michael Pucher, L-Velarization in Austrian German dialect. New Ways of Analyzing Variation (NWAV49), Austin, USA. | |
2020 |
|
2020, Anton Noll, Michael Pucher, Carina Lozo, Formant tracking in Sound Tools eXtended (STx) 5.0. DAGA 2020 - 46. Jahrestagung für Akustik, Hannover, Germany, 2020, pp. 956-958. | |
2019 |
|
2019, Lorenz Gutscher, Michael Pucher, Carina Lozo, Marisa Hoeschele, Daniel Mann, Statistical parametric synthesis of budgerigar songs. SSW10 - The 10th ISCA Speech Synthesis Workshop, Vienna, Austria, 2019, pp. 127-131 (Samples). | |
2019, Michael Pucher, Carina Lozo, Philip Vergeiner, Dominik Wallner, Diphthong interpolation, phone mapping, and prosody transfer for speech synthesis of similar dialect pairs. SSW10 - The 10th ISCA Speech Synthesis Workshop, Vienna, Austria, 2019, pp. 200-204 (Samples). | |
2019, Carina Lozo, Jan Luttenberger, Michael Pucher, The thought collective behind thirty years of progress in speech synthesis. Proceedings of the 3rd International Workshop on the History of Speech Communication Research, Vienna, Austria, 2019. Studientexte zur Sprachkommunikation, TUDpress, Band 94, pp. 49-58. | |
2019, Anton Noll, Jonathan Stuefer, Nicola Klingler, Hannah Leykum, Carina Lozo, Jan Luttenberger, Michael Pucher, Carolin Schmid, Sound Tools eXtended (STx) 5.0 – a powerful sound analysis tool optimized for speech. In Proceedings of Interspeech 2019 - Show&Tell, Graz, Austria, 2019, pp. 2370-2371. | |
2019, Nicola Klingler, Felicitas Kleber, Markus Jochim, Michael Pucher, Stephan Schmid, Urban Zihlmann, Temporal organization of vowel plus stop sequences in production and perception: evidence from the three major varieties of German. International Congress of Phonetic Sciences (ICPhS), Melbourne, Australia, 2019, pp. 825-829. | |
2019, Stephan Schmid, Markus Jochim, Nicola Klingler, Michael Pucher, Urban Zihlmann, Felicitas Kleber, Vowel and consonant quantity in southern German varieties: Typology, variation, and change. Third Phonetics and Phonology in Europe conference, Lecce, Italy. | |
2018 |
|
2018, Lorenzo Spreafico, Michael Pucher, Anna Matosova, UltraFit: A speaker-friendly headset for ultrasound recordings in speech science. In Proceedings of Interspeech 2018, Hyderabad, India, 2018, pp. 1517-1520. | |
2018, Carina Lozo, Michael Pucher, Zum Einfluss der Persona auf die Bewertung von synthetisierter Sprache. 44. Österreichische Linguistiktagung (ÖLT 2018), Innsbruck, Austria. | |
2018, Markus Jochim, Felicitas Kleber, Nicola Klingler, Michael Pucher, Stephan Schmid, Urban Zihlmann, Measuring the Role of Hypoarticulation in a Sound Change in Progress in Southern German. Phonetik und Phonologie im deutschsprachigen Raum (P&P) 2018, Vienna, Austria, 2018, pp. 33-34. | |
2018, Michael Pucher, Carina Lozo, Sylvia Moosmüller, Evaluation methods for dialect speech synthesis of similar dialect pairs. DAGA 2018 - 44. Jahrestagung für Akustik, Munich, Germany, 2018, pp. 515-517. | |
2017 |
|
2017, Michael Pucher, Carina Lozo, Sylvia Moosmüller, Phone mapping and prosodic transfer in speech synthesis of similar dialect pairs. 28th Conference on Electronic Speech Signal Processing, Saarbrücken, Germany, 2017, pp. 180-185. | |
2016 |
|
2016, Michael Pucher, Michaela Rausch-Supola, Sylvia Moosmüller, Markus Toman, Dietmar Schabus, Friedrich Neubarth, Open data for speech synthesis of Austrian German language varieties. 12. Tagung Phonetik und Phonologie im deutschsprachigen Raum, München, 2016, pp. 147-150. | |
2016, Michael Pucher, Fernando Villavicencio, Junichi Yamagishi, Development of a statistical parametric synthesis system for operatic singing in German. 9th ISCA Speech Synthesis Workshop (SSW9), Sunnyvale, CA, USA, pp. 64-69 (Samples). | |
2016, Michael Pucher, Sylvia Moosmüller, Michaela Rausch-Supola, Aufnahme von hochwertigen authentischen Dialektdaten im Feld. 13. Bayerisch-österreichische Dialektologentagung, Erlangen, Germany. | |
2016, Michael Pucher, Sylvia Moosmüller, Analysis of phonetic dialect/standard relations in model interpolation. Experimental Approaches to Perception and Production of Language Variation, Vienna, Austria. | |
2015 |
|
2015, Fernando Villavicencio, Jordi Bonada, Junichi Yamagishi, Michael Pucher, Efficient Pitch Estimation on Natural Opera-Singing by a Spectral Correlation based Strategy. Information Processing Society of Japan SIG Technical Report, Number 1, pp. 1-6. | |
2015, Michael Pucher, Dietmar Schabus, Visio-articulatory to acoustic conversion of speech. FAAVSP 2015, Vienna, Austria, Article No. 6. | |
2015, Dietmar Schabus, Michael Pucher, Comparison of dialect models and phone mappings in HSMM-based visual dialect speech synthesis. FAAVSP 2015, Vienna, Austria, pp. 84-87. | |
2015, Michael Pucher, Markus Toman, Dietmar Schabus, Cassia Valentini-Botinhao, Junichi Yamagishi, Bettina Zillinger, Erich Schmid, Influence of speaker familiarity on blind and visually impaired children's perception of synthetic voices in audio games. Proceedings of Interspeech 2015, Dresden, Germany, pp. 1625-1629. | |
2015, Markus Toman, Michael Pucher, Evaluation of state mapping based foreign accent conversion. Proceedings of Interspeech 2015, Dresden, Germany, pp. 304-308. | |
2015, Michael Pucher, Valon Xhafa, Agni Dika, Markus Toman, Adaptive speech synthesis of Albanian dialects. Text, Speech, and Dialogue (TSD) 2015, Pilsen, Czech Republic, pp. 158-164. | |
2015, Markus Toman, Michael Pucher, An Open Source Speech Synthesis Frontend for HTS. Text, Speech, and Dialogue (TSD) 2015, Pilsen, Czech Republic, pp. 291-298. | |
2014 |
|
2014, Cassia Valentini-Botinhao, Markus Toman, Michael Pucher, Dietmar Schabus, Junichi Yamagishi, Intelligibility analysis of fast synthesized speech. In Proceedings of Interspeech 2014, Singapore, pp. 2922-2926. | |
2014, Dietmar Schabus, Michael Pucher, Phil Hoole, The MMASCS multi-modal annotated synchronous corpus of audio, video, facial motion and tongue motion data of normal, fast and slow speech. In Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), Reykjavik, Iceland, pp. 3411-3416. | |
2013 |
|
2013, Jakob Hollenstein, Michael Pucher, Dietmar Schabus, Visual Control of Hidden-Semi-Markov-Model based Acoustic Speech Synthesis. International Conference on Auditory-Visual Speech Processing (AVSP 2013), Annecy, France, pp. 31-35 (Samples). | |
2013, Dietmar Schabus, Michael Pucher, Gregor Hofer, Objective and Subjective Feature Evaluation for Speaker-Adaptive Visual Speech Synthesis. International Conference on Auditory-Visual Speech Processing (AVSP 2013), Annecy, France, pp. 37-42. | |
2013, Markus Toman, Michael Pucher, Dietmar Schabus, Multi-variety adaptive acoustic modeling in HSMM-based speech synthesis. 8th ISCA Speech Synthesis Workshop (SSW8), Barcelona, Spain, pp. 83-87. | |
2013, Markus Toman, Michael Pucher, Dietmar Schabus, Cross-variety speaker transformation in HSMM-based speech synthesis. 8th ISCA Speech Synthesis Workshop (SSW8), Barcelona, Spain, pp. 77-81. | |
2013, Markus Toman, Michael Pucher, Structural KLD for cross-variety speaker adaptation in HMM-based speech synthesis. 10th IASTED International Conference on Signal Processing, Pattern Recognition and Applications (SPPRA2013), Innsbruck, Austria. | |
2012 |
|
2012, Dietmar Schabus, Michael Pucher, Gregor Hofer, Speaker-adaptive visual speech synthesis in the HMM-framework. 13th Annual Conference of the International Speech Communication Association (INTERSPEECH 2012), Portland, USA, pp. 979-982 (Samples). | |
2012, Ibon Saratxaga, Inma Hernaez, Michael Pucher, Eva Navas, Inaki Sainz, Perceptual Importance of the Phase Related Information in Speech. 13th Annual Conference of the International Speech Communication Association (INTERSPEECH 2012), Portland, USA, pp. 1448-1451. | |
2012, Michael Pucher, Dietmar Schabus, Gregor Hofer, Nadja Kerschhofer-Puhalo, Sylvia Moosmüller, Regionalizing Virtual Avatars - Towards Adaptive Audio-Visual Dialect Speech Synthesis. CogSys 2012, 5th International Conference on Cognitive Systems, Vienna, Austria, p. 95. | |
2012, Michael Pucher, Nadja Kerschhofer-Puhalo, Dietmar Schabus, Sylvia Moosmüller, Gregor Hofer, Language resources for the adaptive speech synthesis of dialects. 7. Kongress der Internationalen Gesellschaft für Dialektologie und Geolinguistik (SIDG), Vienna, Austria, pp. 174-175. [presentation] | |
2012, Michael Pucher, Dietmar Schabus, Gregor Hofer, From Viennese to Austrian German and back again - An algorithm for the realization of a variety-slider. 7. Kongress der Internationalen Gesellschaft für Dialektologie und Geolinguistik (SIDG), Vienna, Austria, pp. 176-177. [presentation] | |
2012, Dietmar Schabus, Michael Pucher, Gregor Hofer, Building a synchronous corpus of acoustic and 3D facial marker data for adaptive audio-visual speech synthesis. In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012), pp. 3313-3316, Istanbul, Turkey. | |
2011 |
|
2011, Michael Pucher, Nadja Kerschhofer-Puhalo, Dietmar Schabus, Phone set selection for HMM-based dialect speech synthesis. 1st Workshop on Algorithms and Resources for Modelling of Dialects and Language Varieties (DIALECTS 2011). EMNLP 2011: Conference on Empirical Methods in Natural Language Processing, Edinburgh, UK, pp. 65-69. | |
2011, Dietmar Schabus, Michael Pucher, Gregor Hofer, Simultaneous Speech and Animation Synthesis. Poster at 38th International Conference and Exhibition on Computer Graphics and Interactive Techniques (SIGGRAPH 2011), Vancouver, Canada. [video] | |
2011, Phillip L. De Leon, Inma Hernaez, Ibon Saratxaga, Michael Pucher, Junichi Yamagishi, Detection of synthetic speech for the problem of imposture. In Proceedings of the 36th International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, pp. 4844-4847. | |
2011, Dietmar Schabus, Thomas Zemen, Michael Pucher, Distributed Field Estimation Algorithms in Vehicular Sensor Networks. IEEE 73rd Vehicular Technology Conference (VTC2011-Spring), Budapest, Hungary, pp. 1-5. | |
2010 |
|
2010, Michael Pucher, Dietmar Schabus, Junichi Yamagishi, Synthesis of fast speech with interpolation of adapted HSMMs and its evaluation by blind and sighted listeners. 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010), Makuhari, Japan, pp. 2186-2189. | |
2010, Michael Pucher, Friedrich Neubarth, Volker Strom, Sylvia Moosmüller, Gregor Hofer, Christian Kranzler, Gudrun Schuchmann, Dietmar Schabus, Resources for speech synthesis of Viennese varieties. In Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC), Valletta, Malta, pp. 105-108. [presentation] | |
2010, Phillip L. De Leon, Michael Pucher, Junichi Yamagishi, Evaluation of the Vulnerability of Speaker Verification to Synthetic Speech. In Proceedings of Odyssey 2010 - The Speaker and Language Recognition Workshop, Brno, Czech Republic, pp. 151-158. | |
2010, Phillip L. De Leon, Vijendra Raj Apsingekar, Michael Pucher, Junichi Yamagishi, Revisiting the security of speaker verification systems against imposture using synthetic speech. In Proceedings of the 35th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Dallas, USA, pp. 1798-1801. | |
2010, Michael Pucher, Dietmar Schabus, Peter Schallauer, Yuriy Lypetskyy, Franz Graf, Harald Rainer, Michael Stadtschnitzer, Sabine Sternig, Josef Birchbauer, Wolfgang Schneider, Bernhard Schalko, Multimodal Highway Monitoring for Robust Incident Detection. 13th International IEEE Conference on Intelligent Transportation Systems (ITSC), Madeira, Portugal, pp. 837-842. | |
2010, Michael Pucher, Friedrich Neubarth, Dietmar Schabus, Design and development of spoken dialog systems incorporating speech synthesis of Viennese varieties. In Proceedings of the 12th International Conference on Computers Helping People with Special Needs (ICCHP 2010), Vienna, Austria, pp. 361-366. | |
2009 |
|
2009, Michael Pucher, Friedrich Neubarth, Volker Strom, Optimizing phonetic encoding for Viennese dialect unit selection speech synthesis. COST 2102 conference, Dublin, 2009, LNCS 5967, pp. 207-216, 2010. | |
2009, Christian Kranzler, Franz Pernkopf, Rudolf Muhr, Michael Pucher, Friedrich Neubarth, Text-to-Speech Engine with Austrian German Corpus. In Proceedings of the XIII International conference Speech and Computer (SPECOM 2009), St. Petersburg, Russia. | |
2008 |
|
2008, Michael Pucher, Gudrun Schuchmann, Peter Fröhlich, Regionalized Text-to-Speech Systems: Persona Design and Application Scenarios. In Lecture Notes in Artificial Intelligence (LNAI), volume 5398, pages 216-222. COST Action 2102 School, Vietri sul Mare, Italy. | |
2008, Friedrich Neubarth, Michael Pucher, Christian Kranzler, Modeling Austrian dialect varieties for TTS. In Proceedings of the 9th Annual Conference of the International Speech Communication Association (INTERSPEECH 2008), pages 1877-1880, Brisbane, Australia. | |
2007 |
|
2007, Michael Pucher, Andreas Türk, Jitendra Ajmera, Natalie Fecher, Phonetic distance measures for speech recognition vocabulary and grammar optimization. In Proceedings of the 3rd congress of the Alps Adria Acoustics Association, Graz, Austria. | |
2007, Sebastian Möller, Klaus Peter Engelbrecht, Michael Pucher, Peter Fröhlich, Lu Huo, Ulrich Heute, Frank Oberle, TIDE: A testbed for interactive spoken dialogue system evaluation. In Proceedings of the XII International conference Speech and Computer (SPECOM 2007), Moscow, Russia. | |
2007, Michael Pucher, WordNet-based semantic relatedness measures in automatic speech recognition for meetings. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL 2007), pages 129-132, Prague, Czech Republic. | |
2006 |
|
2006, Michael Pucher, Yan Huang, Özgür Çetin, Combination of latent semantic analysis based language models for meeting recognition. In Proceedings of the Second IASTED International Conference on Computational Intelligence (CI 2006), pages 465-469, San Francisco, USA. | |
2006, Michael Pucher, Yan Huang, Özgür Çetin, Optimization of latent semantic analysis based language model interpolation for meeting recognition. In Proceedings of the 5th Slovenian and 1st International Language Technologies Conference, pages 74-78, Ljubljana, Slovenia. | |
2005 |
|
2005, Michael Pucher, Peter Fröhlich, A user study on the influence of mobile device class, synthesis method, data rate and lexicon on speech synthesis quality. In Proceedings of the 9th European Conference on Speech Communication and Technology (EUROSPEECH 2005), pages 2501-2504, Lisbon, Portugal. | |
2005, Georg Niklfeld, Hermann Anegg, Michael Pucher, Raimund Schatz, Rainer Simon, Florian Wegscheider, Alexander Gassner, Michael Jank, Günther Pospischil, Device independent mobile multimodal user interfaces with the MONA Multimodal Presentation Server. In Proceedings of the Eurescom summit 2005 on Ubiquitous Services and Applications, Heidelberg, Germany. | |
2005, Michael Pucher, Performance evaluation of WordNet-based semantic relatedness measures for word prediction in conversational speech. In Proceedings of the 6th International Workshop on Computational Semantics (IWCS 6), pages 332-342, Tilburg, the Netherlands. | |
2005, Michael Pucher, Yan Huang, Latent semantic analysis based language models for meetings. 2nd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2005), Edinburgh, UK. | |
2004 |
|
2004, Hermann Anegg, Thomas Dangl, Michael Jank, Georg Niklfeld, Michael Pucher, Raimund Schatz, Rainer Simon, Florian Wegscheider, Multimodal interfaces in mobile devices - the MONA project. In Proceedings of the Workshop on Emerging Applications for Wireless and Mobile Access. 13th International World Wide Web Conference (WWW 2004), New York, USA. | |
2004, Lynne Baillie, Michael Pucher, Marian Kepesi, A multimodal mobile robot for the home. In Proceedings of the IADIS International Conference e-Society 2004, Avila, Spain. | |
2004, Lynne Baillie, Michael Pucher, Marian Kepesi, A supportive multimodal mobile robot for the home. In Lecture Notes in Computer Science (LNCS), volume 3196, pages 375-383. 8th ERCIM Workshop on User Interfaces for All, Vienna, Austria. | |
2003 |
|
2003, Michael Pucher, Friedrich Neubarth, Erhard Rank, Georg Niklfeld, Qi Guan, Combining non-uniform unit selection with diphone based synthesis. In Proceedings of the 8th European Conference on Speech Communication and Technology (EUROSPEECH 2003), pages 1329-1332, Geneva, Switzerland. | |
2003, Michael Pucher, Marian Kepesi, Multimodal Mobile Robot Control using Speech Application Language Tags. In Lecture Notes in Computer Science (LNCS), volume 2875, pages 56-64. European Symposium on Ambient Intelligence, Eindhoven, the Netherlands. | |
2003, Michael Pucher, Julia Tertyshnaya, Florian Wegscheider, Personal voice call assistant: SIP and VoiceXML in a distributed environment. In Proceedings of the Workshop on Emerging Applications for Wireless and Mobile Access. 12th International World Wide Web Conference (WWW 2003), Budapest, Hungary. | |
2002 |
|
2002, Georg Niklfeld, Michael Pucher, Robert Finan, Wolfgang Eckhart, Mobile multi-modal data services for GPRS phones and beyond. In Proceedings of the 4th IEEE International Conference on Multimodal Interfaces (ICMI 2002), Pittsburgh, USA. | |
2002, Georg Niklfeld, Michael Pucher, Robert Finan, Wolfgang Eckhart, Steps towards multi-modal data services in GPRS and in UMTS or WLAN networks. In Proceedings of the ISCA Tutorial and Research Workshop on Multi-Modal Dialogue in Mobile Environments, Irsee, Germany. | |
2001 |
|
2001, Georg Niklfeld, Robert Finan, Michael Pucher, Multimodal interface architecture for mobile data services. In Proceedings of the Workshop on Wearable Computing (TCMC 2001), Graz, Austria. | |
2001, Georg Niklfeld, Robert Finan, Michael Pucher, Architecture for adaptive multimodal dialog systems based on VoiceXML. In Proceedings of the 7th European Conference on Speech Communication and Technology (EUROSPEECH 2001), pages 2341-2344, Aalborg, Denmark. | |
2001, Georg Niklfeld, Robert Finan, Michael Pucher, Component-based multimodal dialog interfaces for mobile knowledge creation. In Proceedings of the Workshop on Human Language Technology and Knowledge Management, pages 103-110. 39th Annual Meeting of the Association for Computational Linguistics (ACL 2001), Toulouse, France. | |
Theses |
|
2017, Michael Pucher, Speech processing for multimodal and adaptive systems, Habilitation thesis, Venia docendi in Speech Communication, Graz University of Technology. | |
2015, Michael Pucher, A Hidden-Markov-Model (HMM) based Opera Singing Synthesis System for German, Master's thesis, Computer Science, Vienna University of Technology. | |
2007, Michael Pucher, Semantic Similarity in Automatic Speech Recognition for Meetings, Doctoral Thesis, Electrical and Information Engineering, Graz University of Technology. | |
2001, Michael Pucher, Formale Wahrheitstheorien nach Alfred Tarski, Diploma Thesis, Philosophy, University of Vienna. | |
Teaching |
|
Winter semester 2024/2025: Lecture on Speech Synthesis at Signal Processing and Speech Communication Laboratory at Graz University of Technology | |
Summer semester 2024: Lecture on Speech Technologies (Master in Multilingual Technologies) at Centre for Translation Studies at University of Vienna | |
Winter semester 2023/2024: Lecture on Speech Synthesis at Signal Processing and Speech Communication Laboratory at Graz University of Technology | |
Winter semester 2022/2023: Lecture on Speech Synthesis at Signal Processing and Speech Communication Laboratory at Graz University of Technology | |
Winter semester 2021/2022: Lecture on Sociophonetics at Institut für Germanistik at University of Vienna | |
Winter semester 2021/2022: Lecture on Speech Synthesis at Signal Processing and Speech Communication Laboratory at Graz University of Technology | |
Winter semester 2020/2021: Lecture on Sociophonetics at Institut für Germanistik at University of Vienna | |
Winter semester 2020/2021: Lecture on Speech Synthesis at Signal Processing and Speech Communication Laboratory at Graz University of Technology | |
Winter semester 2019/2020: Lecture on Cognitive User Interfaces at Institute of Information Systems Engineering at Vienna University of Technology | |
Winter semester 2019/2020: Lecture on Sociophonetics at Institut für Germanistik at University of Vienna | |
Winter semester 2018/2019: Seminar on Selected Topics Signal, Biosignal and Speech Processing at Signal Processing and Speech Communication Laboratory at Graz University of Technology | |
Winter semester 2018/2019: Laboratory Acoustics at Faculty of Physics at University of Vienna | |
Summer semester 2018: Lecture on Spoken language in human and human-computer dialogue at Signal Processing and Speech Communication Laboratory at Graz University of Technology | |
Winter semester 2017/2018: Lecture on Cognitive User Interfaces at Institute of Computer Languages at Vienna University of Technology | |
Winter semester 2017/2018: Laboratory Acoustics at Faculty of Physics at University of Vienna | |
Winter semester 2016/2017: Lecture on Computational Semantics at Institute of Computer Languages at Vienna University of Technology | |
Summer semester 2016: Lecture on Cognitive User Interfaces at Institute of Computer Languages at Vienna University of Technology | |
Summer semester 2015: Lecture on Cognitive User Interfaces at Institute of Computer Languages at Vienna University of Technology | |
Winter semester 2014/2015: Lecture on Computational Semantics at Institute of Computer Languages at Vienna University of Technology | |
Summer semester 2014: Lecture on Cognitive User Interfaces at Institute of Computer Languages at Vienna University of Technology | |
Winter semester 2013/2014: Lecture on Computational Semantics at Institute of Computer Languages at Vienna University of Technology | |
Summer semester 2013: Lecture on Cognitive User Interfaces at Institute of Computer Languages at Vienna University of Technology | |
Winter semester 2011/2012: Lecture on Cognitive User Interfaces at Institute of Computer Languages at Vienna University of Technology | |
July 2011: Seminar on Audio-Visual Speech Synthesis at the Signal Processing Laboratory (Aholab) of the University of the Basque Country | |
Summer semester 2008: Seminar on Speech Synthesis at the Signal Processing and Speech Communication Laboratory at Graz University of Technology | |
I examined the following PhD theses: | |
2022, Truc Nguyen, Robust Lung Sound and Acoustic Scene Classification. PhD thesis. Electrical and Information Engineering. Graz University of Technology. | |
2022, Michael Riccabona, Variation der Intonation in Nord- und Südtirol - Analysen zu extra- und innerlinguistischen Steuerungsfaktoren. PhD thesis. Institut für Germanistik. University of Vienna. | |
2020, Mohammed Al-Radhi, High-Quality Vocoding Design with Signal Processing for Speech Synthesis and Voice Conversion. PhD thesis. Faculty of Electrical Engineering and Informatics. Budapest University of Technology and Economics. | |
2019, Matthias Zöhrer, Speech enhancement using deep neural beamformers. PhD thesis. Electrical and Information Engineering. Graz University of Technology. | |
I examined the following diploma theses: | |
2023, Fabio Ziegler, Refractory Modelling with Deep Neural Networks. Master's thesis. Electrical and Information Engineering. Graz University of Technology. | |
I supervised the following diploma theses: | |
2022, Lisa Kerle, Speaker interpolation based data augmentation for automatic speech recognition. Master's thesis. Electrical and Information Engineering. Graz University of Technology. | |
2019, Lorenz Gutscher, Recording, analysis, statistical modeling and synthesis of bird songs. Diploma thesis. Electrical and Information Engineering. Graz University of Technology. | |
I co-supervised the following PhD theses: | |
2016, Markus Toman, Acoustic modeling and transformation of varieties for speech synthesis. PhD thesis. Computer Science. Vienna University of Technology. | |
2014, Dietmar Schabus, Audiovisual speech synthesis based on hidden Markov models. PhD thesis. Electrical and Information Engineering. Graz University of Technology. | |
I co-supervised the following diploma theses: | |
2019, Carina Lozo, Synthese und prosodischer Transfer zweier südbairischer Dialekte - Sprachsynthese als Beispiel für ein interdisziplinäres Forschungsfeld der Linguistik. Master's thesis. Linguistics. University of Vienna. | |
2013, Jakob Hollenstein, Visual Control of Audio-Visual Speech Synthesis. Diploma thesis. Computer Science. Vienna University of Technology. | |
2009, Dietmar Schabus, Interpolation of Austrian German and Viennese Dialect / Sociolect in HMM-based Speech Synthesis. Diploma thesis. Computer Science. Vienna University of Technology. | |
2008, Christian Kranzler, Text-to-Speech Engine with Austrian German corpus. Diploma thesis. Electrical and Information Engineering. Graz University of Technology. | |
2008, Michael Bruss, Quantitative und phonetische Analyse von nicht-linguistischen Partikeln in spontan gesprochener Sprache der Wiener Soziolekte. Magister thesis. Phonetics. Saarland University, Saarbrücken. | |
Curriculum Vitae |
|
Professional Experience |
|
Since 2024: Head of Technology Design at Recognosco GmbH, Vienna, Austria | |
Since 2022: Senior Researcher at Austrian Research Institute for Artificial Intelligence (OFAI), Vienna, Austria | |
2022 to 2024: Senior Speech Technologist at Recognosco GmbH, Vienna, Austria | |
Since 2018: Lecturer at University of Vienna | |
Since 2017: Adjunct Professor (Priv.-Doz.) at Signal Processing and Speech Communication Laboratory (SPSC) at Graz University of Technology | |
2016 to 2022: Senior Research Scientist at Acoustics Research Institute (ARI), Austrian Academy of Sciences (ÖAW) | |
2011 to 2020: Lecturer at Vienna University of Technology (TUW) | |
2007 to 2015: Senior Researcher at the Telecommunications Research Center Vienna (FTW) | |
2001 to 2006: Researcher at the Telecommunications Research Center Vienna (FTW) | |
1999 to 2002: Software/database design and development with Java2/Oracle | |
1999: Teaching assistant at the Institute for Database Systems and Artificial Intelligence (DBAI) at Vienna University of Technology (TUW) | |
1989 to 1993: Worked as a chef in restaurants in Austria and Liechtenstein | |
Education |
|
2017: Habilitation (venia docendi) in Speech Communication at Graz University of Technology with a thesis on Speech Processing for Multimodal and Adaptive Systems | |
February to September 2017: Paternity leave | |
2015: Master's degree (Dipl.-Ing.) in Computer Science from Vienna University of Technology | |
2010 to 2015: Master's studies in Computer Science (Computational Intelligence) at Vienna University of Technology | |
2007: Doctoral degree (Dr.techn.) in Electrical and Information Engineering (with distinction) from Graz University of Technology | |
2004 to 2007: Doctoral studies in Electrical Engineering (Speech Communication) at Graz University of Technology | |
2001: Diploma degree (Mag.phil.) in Philosophy (with distinction) from the University of Vienna | |
1995 to 2000: Diploma studies in Computer Science (Computational Logic) at Vienna University of Technology | |
1994 to 2001: Diploma studies in Philosophy, Logic, and Mathematics at University of Vienna | |
1994: Studienberechtigungsprüfung (Austrian university entrance qualification examination) | |
1994: Studies in Interdisciplinary Art at Wiener Kunstschule | |
January to April 1992: French language course in Paris | |
1984 to 1988: Cook apprenticeship | |
1979 to 1984: High school in Judenburg, Austria | |
1975 to 1979: Primary school in Trieben, Austria | |
Research Visits |
|
April to May 2014: National Institute of Informatics (NII), Tokyo, Japan | |
August to September 2008: Centre for Speech Technology Research (CSTR), University of Edinburgh, UK | |
August 2006: Telekom Innovation Laboratories (T-Labs), Berlin, Germany | |
February to July 2005: International Computer Science Institute (ICSI), Berkeley, California | |
Professional activities |
|
Organizing |
|
Organizing committee member of the 10th ISCA Speech Synthesis Workshop (SSW10) in 2019 in Vienna, Austria | |
Organizing committee member of the 3rd Workshop on the History of Speech Communication Research (HSCR) in 2019 in Vienna, Austria | |
Organizing committee member of INTERSPEECH 2019 in Graz, Austria | |
Workshop in memory of Sylvia Moosmüller (1954 - 2018) in 2019 in Vienna, Austria | |
Organizing committee member of LabPhon16 Satellite event - New developments in Speech Sensing and Imaging (2018) in Lisbon, Portugal | |
Area chair for Speech Synthesis and Spoken Language Generation of INTERSPEECH 2015 in Dresden, Germany | |
Organizing committee member of FAAVSP 2015 - The 1st Joint Conference on Facial Analysis, Animation and Auditory-Visual Speech Processing in Vienna, Austria | |
Program committee member of ACM Multimedia 2014 in Brisbane, Australia | |
Organizing committee member of FAA 2012 - The ACM 3rd International Symposium on Facial Analysis and Animation in Vienna, Austria | |
Organizing committee member of ICAD 2005 Workshop - Combining Speech and Sound in the User Interface in Limerick, Ireland | |
Awards |
|
WINTEC 2016 Preis für Inklusion durch Wissenschaft und Technik (WINTEC 2016 prize for inclusion through science and technology) | |
Reviewing |
|
Speech Communication, Elsevier | |
Computer Speech and Language, Elsevier | |
IEEE Transactions on Audio, Speech, and Language Processing | |
IEEE Journal of Selected Topics in Signal Processing | |
IEEE Signal Processing Letters | |
IEEE Transactions on Systems, Man, and Cybernetics (Part B - Cybernetics) | |
Journal of the Acoustical Society of America | |
Cognitive Computation, Springer | |
Computer Methods and Programs in Biomedicine, Elsevier | |
The Computer Journal, Oxford University Press | |
Frontiers in Human-Media Interaction | |
Frontiers in Psychology, section Language Sciences | |
Memberships |
|
IEEE | |
ACM | |
International Speech Communication Association (ISCA) | |
European Network for the Advancement of Artificial Cognitive Systems, Interaction and Robotics (EUCOG III) | |
Cross-Modal Analysis of Verbal and Nonverbal Communication (COST 2102) | |
Media appearance |
|
November 2021: Ö1 feature: Wen Künstliche Intelligenz ersetzen kann | |
October 2019: Makro Mikro #18: Sprechende Maschinen: Siri und Alexa reisen ins 18. Jahrhundert | |
12 September 2019: Science ORF: Vorläufer von Alexa, Siri und Co., Treffen der Kempelenschen Sprachmaschinen | |
2 August 2019: Ö1 Punkt eins: Singen um zu sprechen | |
6 February 2018: Krone der Wissenschaft: Der Dialektcomputer | |
17 January 2017: Die eigene Stimme erhöht die Motivation | |
20 June 2014: Die Presse article "Dialekt aus dem Computer" | |
6 May 2009: Interview in the Online Standard on the synthesis of sociolects | |
30 April 2009: Online Standard article: "He, Nudlaug, wos is?" - Computer spricht jetzt wienerisch | |
22 June 2008: Ö1 programme "matrix - computer & neue medien" on the topic "Wos host gsogt? - Sprachsynthese auf Wienerisch" on oe1.ORF.at | |
June 2008: ORF-ON Science article | |
August 2007: Project "Wiener Soziolekt und Dialektsynthese" nominated for the "Wiener Zukunftspreis" 2007 | |
4 August 2007: "Wien Heute" report "Computer lernt Wienerisch" | |
March 2007: ORF Futurezone article | |
March 2007: Online Standard article | |