Project T2
3rd Phase - Information structure in speech synthesis
In automatic speech synthesis, considerable progress has been made in recent years, but the dominant paradigm of text-to-speech synthesis still shows deficits when the prosodic structure of an utterance depends on context in non-trivial ways. A central problem here is the lack of any treatment of information structure (IS): often, given information cannot be distinguished from new information, and focused or contrastive elements are not signalled as such. Our project aims to improve speech synthesis by systematically incorporating such discourse-based information. In order to circumvent the (extremely difficult) task of automatic text analysis on the IS level, we will work with an existing system that automatically generates text: given a database and a user query, it dynamically produces descriptions and comparisons of suitable products (here: textbooks on computational linguistics). In such a setting, it is possible to compute the degree of activation of discourse referents as well as features of contrastiveness. Our first task is thus to determine IS annotations for the generated sentences, i.e., to label them with features for givenness, topicality, and focus. To this end, we will develop a scheme for discourse modelling (which should be largely independent of the specific application domain) and an algorithm for computing the IS parameters of the next sentence in the given context. These IS features are then mapped to a prosodic annotation in accordance with the GToBI scheme. The tonal structure has to be optimized for parameters such as givenness and contrast, which determine, among other things, the size of the focus domain, the accentuation and deaccentuation of words, and structural features such as the position of the nuclear accent of the sentence.
The intonation contour of the sentence will be computed by combining the tonal annotation and the information structure; this algorithm will also select the pitch register, relative to which the various pitch accents will be scaled. As the speech synthesis module, we will use the MARY system developed by DFKI GmbH (Saarbrücken). DFKI will be a partner in this project and will support us in making the necessary additions and adjustments to MARY. The second external partner is beyo GmbH (Potsdam), which will help with the practical evaluation of our synthesis results and assess whether the results can be transferred to applications such as reading web pages aloud.
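The pipeline described above (discourse model, IS features, tonal annotation) can be illustrated with a small sketch. It is not the project's algorithm: the `Token` class, the `update_activation` and `annotate` functions, the decay factor, the givenness threshold, and the specific choice of `H*` for new referents and `L+H*` for contrastive elements are all assumptions made for illustration. Only the accent labels themselves are taken from the GToBI inventory.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class Token:
    """One word of a generated sentence (hypothetical structure)."""
    word: str
    referent: Optional[str] = None   # id of the discourse referent, if any
    contrastive: bool = False        # set by the text generator (assumed)


def update_activation(activation: Dict[str, float],
                      mentioned: Iterable[str],
                      decay: float = 0.5) -> Dict[str, float]:
    """Decay all referents, then fully activate those just mentioned.

    The decay factor is an illustrative assumption, not a project value.
    """
    updated = {ref: level * decay for ref, level in activation.items()}
    for ref in mentioned:
        updated[ref] = 1.0
    return updated


def annotate(tokens: List[Token],
             activation: Dict[str, float],
             given_threshold: float = 0.5) -> List[Tuple[str, Optional[str]]]:
    """Assign an illustrative GToBI pitch accent (or None) to each token."""
    result = []
    for tok in tokens:
        if tok.referent is None:
            accent = None            # no referent: left unaccented here
        elif tok.contrastive:
            accent = "L+H*"          # assumed marking of contrast
        elif activation.get(tok.referent, 0.0) >= given_threshold:
            accent = None            # given referent: deaccented
        else:
            accent = "H*"            # assumed marking of new information
        result.append((tok.word, accent))
    return result


if __name__ == "__main__":
    # "book_a" was mentioned in the previous sentence, so it counts as given.
    activation = update_activation({}, ["book_a"])
    sentence = [
        Token("this"),
        Token("book", referent="book_a"),
        Token("is"),
        Token("cheaper", referent="price", contrastive=True),
    ]
    for word, accent in annotate(sentence, activation):
        print(word, accent)
```

In a fuller treatment, the activation values would come from the discourse model, and the accent choice would also depend on the size of the focus domain and the position of the nuclear accent, which this sketch ignores.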
Principal Investigators
- Prof. Dr. Manfred Stede
- Dr. Frank Kügler
Former Staff Members
- Bernadett Smolibocki
Student Assistants
- Leonard Kriese
- Sybille Kiziltan
Activities
June 2015 | Talk | Kügler, F., Smolibocki, B., Stede, M.: Prominence relations between discourse referents as a factor for the analysis of the prosodic marking of information status - a corpus study of spoken German. | International Conference 'Prominence in Language', Universität Köln | |
May 2015 | Talk | Kügler, F.: The prosodic expression of focus - Between language-specificity and cross-linguistic similarities. | Advances in Information Structure Research 2003 - 2015, Berlin. | |
May 2015 | Poster | Kügler, F., Smolibocki, B., Stede, M.: Information status and prosody in a corpus of non-scripted spoken German. | Final Conference of the SFB 632 'Information Structure': Advances in Information Structure Research 2003 - 2015, Berlin | |
March 2015 | Workshop | DIMA-V Deutsche Intonation - Modellierung und Annotation V. | Potsdam. | |
March 2015 | Talk | Kügler, F.: Focal lowering in German interrogatives. | DGfS 2015, AG 6 'The prosody and meaning of (non-)canonical questions across languages', Leipzig | |
December 2014 | Talk | Kügler, F. & Smolibocki, B.: Variation in the prosodic marking of information status of discourse referents in German - a corpus study. | Forschungskolloquium Phonetik, Universität Köln. | |
December 2014 | Talk | Kügler, F.: Information structure - basic concepts and its realization in a cross-linguistic perspective. | Invited talk at the workshop "Theoretical and Empirical Perspectives on the Interrelation of Syntax, Semantics and Prosody", Universität Köln. | |
June 2014 | Talk | Kügler, F., Smolibocki, B. & Stede, M.: Information structure and prosody in advisory monologues of German. | 19th Internal Workshop of the SFB 632, Wandlitz | |
June 2014 | Poster | Kügler, F.: Focus affects the pitch register – focal lowering in German. | International Symposium on Prosody to Commemorate Gösta Bruce, Lund. | |
March 2014 | Workshop | Kügler, F., Smolibocki, B.: DIMA-IV (Deutsche Intonation - Modellierung und Annotation). | Kiel | |
March 2014 | Poster | Kügler, F., Smolibocki, B. & Stede, M.: Information status and prosody in a corpus of non-scripted spoken German. | Linguistic Evidence 2014, Tübingen. | |
January 2014 | Workshop | Kentner, G., Ishihara, S., Feldhausen, I. & Kügler, F.: Prosody and Information Structure. | Workshop, Goethe University Frankfurt. | |
October 2013 | Poster | Smolibocki, B.: Prosodische Annotation gesprochener Sprache. | P&P-9, Zürich. | |
June 2013 | Talk | Kügler, F., Smolibocki, B. & Stede, M.: Information status and prosody in advisory dialogues of German. | 17th Internal Workshop of the SFB 632, Wittenberg | |
June 2013 | Talk | Kügler, F., Smolibocki, B. & Stede, M.: Intonation of information status and prosody in advisory dialogues of German. | ISSLaC, Bielefeld | |
May 2013 | Talk | Kügler, F. & Féry, C.: Post-focal realization in German. | DIMA-III, ZAS Berlin | |
March 2013 | Workshop | Stede, M.: Rethinking (the Annotation of) Anaphora and Information Structure (RAIS). | Universität Stuttgart | |
March 2013 | Talk | Kügler, F., Smolibocki, B., Stede, M. & Varges, S.: Information structure in speech synthesis: early focus and post-focal givenness. | ESSV 2013, Bielefeld | |
October 2012 | Talk | Kügler, F., Smolibocki, B. & Stede, M.: Evaluation of synthesized intonation with respect to information structure. | P&P-8 Jena | |
September 2012 | Poster | Kügler, F., Smolibocki, B. & Stede, M.: Evaluation of Information Structure in Speech Synthesis: The Case of Product Recommender Systems. | ITG Conference on Speech Communication, Braunschweig | |
September 2012 | Talk | Smolibocki, B.: Synthesized speech with respect to information structure. | StaPs Bochum | |
August 2012 | Meeting | Project T2 Meeting | DFKI (co-operation partner), Berlin | |
June 2012 | Talk | Kügler, F., Smolibocki, B. & Stede, M.: Information structure in speech synthesis. | 16th Internal Workshop of the SFB 632, Wandlitz | |
June 2012 | LNdW | Stede, M., Kügler, F., Varges, S., Smolibocki, B., Peldszus, A., Pusch, A.: Suchst Du noch oder telefonierst Du schon? Handy-Auswahl leicht gemacht. | Lange Nacht der Wissenschaften, Potsdam | |
January 2012 | Meeting | Project T2 Meeting | DFKI (co-operation partner), Saarbrücken | |