Project T2

3rd Phase - Information structure in speech synthesis

In automatic speech synthesis, considerable progress has been made in recent years, but the dominant paradigm of text-to-speech synthesis still shows deficits when the prosodic structure of an utterance depends on context in non-trivial ways. A central problem here is the lack of treatment of information structure (IS): often, given information cannot be distinguished from new information, and focused or contrasting elements are not signalled as such. Our project aims to improve speech synthesis by systematically incorporating such discourse-based information. To circumvent the (extremely difficult) task of automatic text analysis at the IS level, we will work with an existing system that automatically generates text: given a database and a user query, it dynamically produces descriptions and comparisons of suitable products (here: textbooks on computational linguistics). In such a setting, it is possible to compute the degree of activation of discourse referents as well as features of contrastiveness. Our first task is thus to determine IS annotations for the generated sentences, i.e., to label them with features for givenness, topicality, and focus. To this end, we will develop a scheme for discourse modelling (which should be largely independent of the specific application domain) and an algorithm for computing the IS parameters of the following sentence in the given context. These IS features are then mapped to a prosodic annotation in accordance with the GToBI scheme. The tonal structure has to be optimized for parameters such as givenness and contrast, which determine, among other things, the size of the focus domain, word accentuation and deaccentuation, and structural features such as the position of the nuclear accent in the sentence.
The intonation contour of the sentence will be computed by combining the tonal annotation and the information structure; the same algorithm will also select the pitch register, relative to which the various pitch accents will be scaled. As a speech synthesis module, we will use the MARY system developed by DFKI GmbH (Saarbrücken). DFKI will be a partner in this project and support us in making the necessary additions and adjustments to MARY. The second external partner is beyo GmbH (Potsdam), which will help with the practical evaluation of our synthesis results and assess how the results could be transferred to applications such as reading web pages aloud.
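
To make the pipeline described above more concrete, the following is a minimal sketch of how activation of discourse referents might be tracked and turned into accent labels. It is not the project's algorithm: the class `DiscourseModel`, the function `annotate`, the decay factor, the givenness threshold, and the particular mapping from IS status to GToBI pitch accents (new to H*, contrast to L+H*, given to deaccentuation) are all illustrative assumptions.

```python
from dataclasses import dataclass, field

# Assumed, simplified mapping from IS status to a GToBI pitch accent.
# None stands for deaccentuation. This mapping is illustrative only.
ACCENT_FOR_STATUS = {"new": "H*", "contrast": "L+H*", "given": None}


@dataclass
class DiscourseModel:
    """Hypothetical discourse model tracking referent activation."""
    decay: float = 0.5            # assumed decay applied after each sentence
    given_threshold: float = 0.4  # assumed cutoff for treating a referent as given
    activation: dict = field(default_factory=dict)

    def status(self, referent, contrastive=False):
        """Return 'contrast', 'given', or 'new' for a referent."""
        if contrastive:
            return "contrast"
        if self.activation.get(referent, 0.0) >= self.given_threshold:
            return "given"
        return "new"

    def update(self, mentioned):
        """Decay all activations, then fully activate mentioned referents."""
        for ref in self.activation:
            self.activation[ref] *= self.decay
        for ref in mentioned:
            self.activation[ref] = 1.0


def annotate(model, sentence):
    """Label each word with an accent (or None) and update the model.

    sentence: list of (word, referent or None, contrastive flag).
    Words without a referent (function words, verbs here) get no accent.
    """
    labels = []
    for word, ref, contrastive in sentence:
        if ref is None:
            labels.append((word, None))
        else:
            labels.append((word, ACCENT_FOR_STATUS[model.status(ref, contrastive)]))
    model.update([ref for _, ref, _ in sentence if ref is not None])
    return labels


# Toy product descriptions in the spirit of the textbook domain.
model = DiscourseModel()
first = [("this", None, False), ("textbook", "book1", False),
         ("covers", None, False), ("parsing", "parsing", False)]
second = [("the", None, False), ("book", "book1", False),
          ("costs", None, False), ("less", "price", False)]
print(annotate(model, first))
print(annotate(model, second))
```

In this sketch, a referent mentioned in the previous sentence counts as given and is deaccented when it recurs, while a referent flagged as contrastive receives L+H* regardless of its activation. A real system would also have to place the nuclear accent, assign boundary tones, and choose the pitch register, none of which is modelled here.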


Principal Investigators

  • Prof. Dr. Manfred Stede
  • Dr. Frank Kügler

Former Staff Members

  • Bernadett Smolibocki

Student Assistants

  • Leonard Kriese
  • Sybille Kiziltan

Activities

June 2015 Talk Kügler, F., Smolibocki, B., Stede, M.: Prominence relations between discourse referents as a factor for the analysis of the prosodic marking of information status - a corpus study of spoken German. International Conference 'Prominence in Language', Universität Köln
May 2015 Talk Kügler, F.: The prosodic expression of focus - Between language-specificity and cross-linguistic similarities. Advances in Information Structure Research 2003 - 2015, Berlin.
May 2015 Poster Kügler, F., Smolibocki, B., Stede, M.: Information status and prosody in a corpus of non-scripted spoken German. Final Conference of the SFB 632 'Information Structure': Advances in Information Structure Research 2003 - 2015, Berlin
March 2015 Workshop DIMA-V Deutsche Intonation - Modellierung und Annotation V. Potsdam.
March 2015 Talk Kügler, F.: Focal lowering in German interrogatives. DGfS 2015, AG 6 'The prosody and meaning of (non-) canonical questions across languages', Leipzig
December 2014 Talk Kügler, F. & Smolibocki, B.: Variation in the prosodic marking of information status of discourse referents in German - a corpus study. Forschungskolloquium Phonetik, Universität Köln.
December 2014 Talk Kügler, F.: Information structure - basic concepts and its realization in a cross-linguistic perspective. Invited talk at the workshop "Theoretical and Empirical Perspectives on the Interrelation of Syntax, Semantics and Prosody", Universität Köln.
June 2014 Talk Kügler, F., Smolibocki, B. & Stede, M.: Information structure and prosody in advisory monologues of German. 19th Internal Workshop of the SFB 632, Wandlitz
June 2014 Poster Kügler, F.: Focus affects the pitch register – focal lowering in German. International Symposium on Prosody to Commemorate Gösta Bruce, Lund.
March 2014 Workshop Kügler, F., Smolibocki, B.: DIMA-IV (Deutsche Intonation - Modellierung und Annotation). Kiel
March 2014 Poster Kügler, F., Smolibocki, B. & Stede, M.: Information status and prosody in a corpus of non-scripted spoken German. Linguistic Evidence 2014, Tübingen.
January 2014 Workshop Kentner, G., Ishihara, S., Feldhausen, I. & Kügler, F.: Prosody and Information Structure. Workshop, Goethe University Frankfurt.
October 2013 Poster Smolibocki, B.: Prosodische Annotation gesprochener Sprache. P&P-9, Zürich.
June 2013 Talk Kügler, F., Smolibocki, B. & Stede, M.: Information status and prosody in advisory dialogues of German. 17th Internal Workshop of the SFB 632, Wittenberg
June 2013 Talk Kügler, F., Smolibocki, B. & Stede, M.: Intonation of information status and prosody in advisory dialogues of German. ISSLaC, Bielefeld
May 2013 Talk Kügler, F. & Féry, C.: Post-focal realization in German. DIMA-III, ZAS Berlin
March 2013 Workshop Stede, M.: Rethinking (the Annotation of) Anaphora and Information Structure (RAIS). Universität Stuttgart
March 2013 Talk Kügler, F., Smolibocki, B., Stede, M. & Varges, S.: Information structure in speech synthesis: early focus and post-focal givenness. ESSV 2013, Bielefeld
October 2012 Talk Kügler, F., Smolibocki, B. & Stede, M.: Evaluation of synthesized intonation with respect to information structure. P&P-8, Jena
September 2012 Poster Kügler, F., Smolibocki, B. & Stede, M.: Evaluation of Information Structure in Speech Synthesis: The Case of Product Recommender Systems. ITG Conference on Speech Communication, Braunschweig
September 2012 Talk Smolibocki, B.: Synthesized speech with respect to information structure. StaPs, Bochum
August 2012 Meeting Project T2 Meeting DFKI (co-operation partner), Berlin
June 2012 Talk Kügler, F., Smolibocki, B. & Stede, M.: Information structure in speech synthesis. 16th Internal Workshop of the SFB 632, Wandlitz
June 2012 LNdW Stede, M., Kügler, F., Varges, S., Smolibocki, B., Peldszus, A., Pusch, A.: Suchst Du noch oder telefonierst Du schon? Handy-Auswahl leicht gemacht. Lange Nacht der Wissenschaften, Potsdam
January 2012 Meeting Project T2 Meeting DFKI (co-operation partner), Saarbrücken