© 2002 Ola Jetlund

The FONEMA project (Greek: sound of speech) aims to produce a natural-sounding Norwegian-speaking computer voice by adapting …

FONEMA “Tools for realistic speech synthesis in Norwegian”

Project FONEMA (Greek for 'speech sound, utterance') is a research project within the KUNSTI (Knowledge development for Norwegian Language Technology) framework, funded by the Research Council of Norway (NFR). The project is a cooperation between the Institute for Electronics and Telecommunications at the Norwegian University of Science and Technology (NTNU) and Telenor Research and Development. The project period is from 2002 to 2006.

The FONEMA project was motivated by the need to establish the basis for state-of-the-art unit selection speech synthesis for the Norwegian language. Current commercial unit selection TTS systems for Norwegian have been considered unsatisfactory for many commercial applications. One reason may be that basic knowledge of Norwegian phonetics and prosody was not available when they were designed.

Unit selection synthesis is a technique in which appropriate acoustic sub-word units are selected from multiple examples in a database of natural speech. This technique has been shown to produce high-quality, natural-sounding speech. One interesting property of unit selection speech synthesis is the possibility of copying the natural speaking styles of the voice “donors”. A commercially interesting application of this property is the creation of custom-made “personas” adapted to specific applications or company portals (corporate voices).

In concatenative unit selection TTS using large speech corpora, the main challenges are to select the desired unit from a multitude of similar units and to join the selected units in the way that is most consistent with the desired utterance. The most important issues are:

  • Recording, automatic annotation and organisation of speech databases
  • Distance measures for comparing the linguistic/phonetic information of the target input text with the properties of the speech units stored in the database
  • Search algorithms to find appropriate units
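
The distance measures and the search step listed above can be illustrated with a minimal sketch. It assumes a Viterbi-style dynamic programming search, which is a common choice for unit selection but is not stated here to be the method used in FONEMA. The names `Unit`, `target_cost`, `join_cost` and `select_units`, the two features (pitch and duration) and the scaling by 100 are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A hypothetical speech unit with two illustrative features."""
    phone: str          # phone label, e.g. "a"
    pitch_hz: float     # mean pitch of the unit
    duration_ms: float  # duration of the unit


def target_cost(target: Unit, cand: Unit) -> float:
    """Distance between the desired specification and a database unit."""
    return (abs(target.pitch_hz - cand.pitch_hz) / 100.0
            + abs(target.duration_ms - cand.duration_ms) / 100.0)


def join_cost(left: Unit, right: Unit) -> float:
    """Penalty for concatenating two units (here: pitch mismatch only)."""
    return abs(left.pitch_hz - right.pitch_hz) / 100.0


def select_units(targets, database):
    """Return (units, total_cost) minimising target plus join costs."""
    # Candidates for each target are the database units with the same phone.
    lattice = [[u for u in database if u.phone == t.phone] for t in targets]
    if any(not cands for cands in lattice):
        raise ValueError("no candidate unit for at least one target")

    # best[i][j] = (cumulative cost, index of best predecessor in row i-1)
    best = [[(target_cost(targets[0], c), None) for c in lattice[0]]]
    for i in range(1, len(targets)):
        row = []
        for cand in lattice[i]:
            cost, back = min(
                (best[i - 1][k][0] + join_cost(prev, cand), k)
                for k, prev in enumerate(lattice[i - 1])
            )
            row.append((cost + target_cost(targets[i], cand), back))
        best.append(row)

    # Backtrace from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(lattice[i][j])
        j = best[i][j][1]
    path.reverse()
    return path, total
```

In a real system the target specification would come from linguistic analysis of the input text, and both cost functions would combine many more phonetic, prosodic and spectral features with tuned weights.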

FONEMA addresses all these aspects with a particular emphasis on the Norwegian language.

Project goals

The project will establish a research framework for high-quality unit selection speech synthesis in Norwegian. The project deliverables are:

  • A model for Norwegian prosody in text-to-speech synthesis
  • Procedures for the production of speech databases supporting speech styles for specific application domains
  • A speech synthesis demonstrator for unit selection TTS with different speaking styles or dialects
  • The necessary competencies in Norwegian phonetics, linguistics and speech technology

Contacts:

Persons taking part in the project
