Invited Speakers

Jörn Ostermann (Leibniz Universität Hannover, Laboratory for Information Technology)


Jörn Ostermann studied Electrical Engineering and Communications Engineering at the University of Hannover and Imperial College London, respectively. He received his Dipl.-Ing. and Dr.-Ing. degrees from the University of Hannover in 1988 and 1994, respectively. From 1988 to 1994, he worked as a Research Assistant at the Institut für Theoretische Nachrichtentechnik, conducting research in low bit-rate and object-based analysis-synthesis video coding. In 1994 and 1995, he worked on video coding in the Visual Communications Research Department at AT&T Bell Labs. From 1996 to 2003, he was a member of Image Processing and Technology Research within AT&T Labs - Research. Since 2003, he has been Full Professor and Head of the Institut für Informationsverarbeitung at the Leibniz Universität Hannover, Germany. In 2007, he became head of the Laboratory for Information Technology. His current research interests are video coding and streaming, 3D modeling, face animation, and computer-human interfaces.

Keynote: Facial Animation for Interactive Services

Facial animation can be used to enhance interactive as well as broadcast services. In this lecture, we present the technology and architecture required to use talking faces in a web-based environment to support education, entertainment, and e-commerce applications. The speech associated with the talking face may be recorded speech, a live audio stream, or synthesized speech. To animate the talking head, its mouth, and its eyes, information such as phonemes, their timing, and the emphasis of syllables needs to be known. In the case of a speech synthesizer, this information is readily available; for natural speech, it has to be extracted from the soundtrack. Two basic technologies are used to render talking faces: 3D face models as described in MPEG-4, which can provide the impression of a talking cartoon or human-like character, and sample-based face models generated from recorded video, which enable the synthesis of a talking head that cannot be distinguished from a real person. Depending on the chosen face animation technology and the latency requirements, different architectures are needed for delivering the talking head over the Internet in interactive applications. Video communication using MPEG-4 facial animation parameters is possible at rates below 2 kbit/s; however, these parameters need to be estimated reliably in real time using sophisticated computer vision algorithms. Since interactive computer-human interfaces do not necessarily need to rely on computer vision, the early adopters of facial animation are found in entertainment, education, automated customer service, and shopping.
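
As a back-of-the-envelope illustration of the quoted bitrate, the sketch below works out the per-frame bit budget for the 68 facial animation parameters (FAPs) defined by MPEG-4. The 25 Hz frame rate and the 6-bit quantization are assumptions for illustration; this is not the actual MPEG-4 bitstream syntax.

```python
# Back-of-the-envelope bit budget for streaming MPEG-4 facial animation
# parameters (FAPs) below 2 kbit/s. Frame rate and quantization depth are
# assumed values, not part of the standard's syntax.

BITRATE_BPS = 2000   # target from the abstract: below 2 kbit/s
FRAME_RATE = 25      # assumed animation frame rate in Hz
NUM_FAPS = 68        # MPEG-4 defines 68 FAPs (2 high-level, 66 low-level)

bits_per_frame = BITRATE_BPS / FRAME_RATE   # 80 bits per animated frame
bits_per_fap = bits_per_frame / NUM_FAPS    # ~1.2 bits per parameter

print(f"{bits_per_frame:.0f} bits/frame, {bits_per_fap:.2f} bits/FAP")

# Sending every FAP naively (e.g., 6-bit quantization for all 68 parameters)
# would need about 408 bits per frame, far over budget -- which is why the
# standard transmits masks selecting only the active FAPs and applies
# predictive and arithmetic coding to the values that remain.
```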




Mark Pauly (École Polytechnique Fédérale de Lausanne (EPFL), Switzerland)


Mark Pauly is an associate professor at the School of Computer and Communication Sciences at EPFL. Prior to joining EPFL, he had been an assistant professor in the computer science department of ETH Zurich since April 2005. From August 2003 to March 2005, he was a postdoctoral scholar at Stanford University, where he also held a position as visiting assistant professor during the summer of 2005. He received his Ph.D. degree from ETH Zurich in 2003. His research interests include computer graphics and animation, geometry processing, shape modeling and analysis, and architectural geometry. He is one of the co-founders of faceshift AG, an EPFL startup developing software for facial performance capture and animation.

Keynote: Performance-Driven Facial Animation

In this talk I will examine some of the technical challenges in creating compelling facial animations for real-time applications. While professional acquisition systems for facial performances are widely used in offline film and game production, performance-driven facial animation for consumer-level applications is only just emerging. The requirements of a cheap, easy-to-use hardware setup, combined with real-time processing and animation, define a challenging design space for building practical systems. I will discuss some of the recent research findings that went into the construction of such a system and show its performance in a live demo. Some thoughts and speculations about future developments in facial animation will conclude the talk.




Thabo Beeler (Disney Research Zurich)


Keynote: Passive Spatio-Temporal 3D Geometry Reconstruction of Human Faces at Very High Fidelity

The creation of realistic synthetic human faces is one of the most important and, at the same time, most challenging topics in computer graphics. The high complexity of the face, as well as our familiarity with it, renders manual creation and animation impractical. The method of choice is thus to capture both the shape and the motion of the human face from real-life talent. To date, this has been accomplished using active techniques, which either augment the face with markers or project specific illumination patterns onto it. Active techniques currently provide the highest geometric accuracy, but they have severe shortcomings when it comes to capturing performances. In this talk I will present an entirely passive and markerless system to capture and reconstruct facial performances at unprecedented spatio-temporal resolution. The proposed algorithms compute facial shape and motion at skin-pore resolution from multiple cameras, producing temporally compatible geometry for every frame.
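
For readers unfamiliar with the passive setting: any multi-camera reconstruction pipeline ultimately rests on triangulating surface points matched across calibrated views. The sketch below shows only that elementary step (standard linear/DLT triangulation, written here with NumPy), not the speaker's high-fidelity algorithm.

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one surface point from two views.

    P1, P2 -- 3x4 camera projection matrices (from calibration)
    x1, x2 -- 2D pixel coordinates of the same skin feature in each view
    Returns the 3D point minimizing the algebraic error.
    """
    # Each view contributes two linear constraints on the homogeneous point X.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector of A with the smallest
    # singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize to Euclidean coordinates
```

Run densely over every matched skin feature in every frame, this step yields raw per-frame geometry; temporal tracking is what then makes the meshes compatible across the sequence.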