DFKI Document-07-02



Language: English

by Marc Schröder, Anton Batliner, Christophe d'Alessandro (Eds.)

ParaLing 07 Proceedings of the International Workshop on Paralinguistic Speech - between Models and Data

82 Pages


Research on various aspects of paralinguistic speech has gained considerable importance in recent years. It is nowadays generally acknowledged that vocal communication conveys much more information than the linguistic content in the narrow sense of words, syntax and semantics: speech is worth studying well beyond the level of words.

Domains of paralinguistic speech

The domains and functions of paralinguistic speech are manifold. A first domain is expressive speech, i.e. all the voluntary or involuntary features linked to expression, emotions, attitudes, moods, affects, etc. A second domain concerns the speakers themselves: their age, gender, vocal health, and other speaker-related paralinguistic information. In a broader sense, phenomena such as foreign accent, mutual influences of different languages, disfluencies, etc. are paralinguistic (or extralinguistic) as well.

Models vs. data

Very roughly, two strands can be distinguished in research on paralinguistic speech. On the one hand, models have been proposed for describing and modifying voice quality and prosody, related to factors such as emotional states or personality. Such models often start with high-intensity states (e.g., full-blown emotions) in clean lab speech, and are difficult to generalise to everyday speech. On the other hand, systems have been built to work with moderate states in real-life data, e.g. for the recognition of speaker emotion, age, or gender. Such systems often rely on statistical methods, and are not necessarily based on any theoretical models. While both research traditions are obviously valid and can be justified by their different aims, it seems worth asking whether there is anything they can learn from each other.
For example: “Can models become more robust by incorporating methods used for dealing with real-life data?”; “Can recognition rates be improved by including ideas from theoretical models?”; “How would a database need to be structured so that it can be used for both research on model-based synthesis and research on recognition?” etc. Some, though not all, of the questions outlined above are addressed in papers at this workshop.

Despite a summer 2007 overloaded with conferences and workshops on various aspects of speech processing, phonetics, and acoustics, we are happy to welcome a number of very interesting papers. We hope that the workshop will be an ideal place for fostering fruitful discussions, exchanging ideas and results, and starting new research collaborations.

Finally, it is the pleasure of the organising committee to warmly thank the contributors to this workshop: in the first place the authors, and also the members of the Programme Committee, who provided us with very detailed, positive and useful comments on the submitted manuscripts. We wish you all a rewarding workshop on paralinguistic speech!

Marc Schröder, Anton Batliner and Christophe d'Alessandro

