Meraka Institute

   


Human Language Technologies (HLT) - DictionaryMaker

The availability of language resources is key when developing any speech or language technology for a new language. These resources are expensive to generate and require significant expertise, and the cost can become prohibitive when the effort must be repeated for many languages. The DictionaryMaker project demonstrates how bootstrapping can be used to develop an electronic pronunciation dictionary, one of the key resources for speech recognition and speech synthesis systems, in a fraction of the time typically required. It does so by combining interaction with a first-language speaker of the target language with a machine learning approach. The tool guides the user through the dictionary creation process. Along with the dictionary, a related set of grapheme-to-phoneme rules is created automatically.
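The bootstrapping idea described above can be illustrated with a small sketch. This is not DictionaryMaker's actual algorithm or interface: the function names, the one-letter-per-phoneme alignment, the context-free rules and the invented toy words are all assumptions made for illustration. The sketch only shows the general loop in which the current rules predict a pronunciation, a speaker confirms or corrects it, and the rules are re-learned from the growing dictionary.

```python
from collections import Counter, defaultdict


def train_rules(dictionary):
    """Learn one context-free rule per grapheme: its most frequent phoneme.

    Toy assumption: every word has exactly one phoneme per letter, so
    graphemes and phonemes can be aligned position by position.
    """
    counts = defaultdict(Counter)
    for word, phonemes in dictionary.items():
        for grapheme, phoneme in zip(word, phonemes):
            counts[grapheme][phoneme] += 1
    return {g: c.most_common(1)[0][0] for g, c in counts.items()}


def predict(rules, word):
    """Predict a pronunciation; graphemes with no rule yet become '?'."""
    return [rules.get(grapheme, "?") for grapheme in word]


def bootstrap(words, verify):
    """Build a dictionary word by word, re-learning rules after each word.

    `verify(word, guess)` stands in for the first-language speaker: it
    returns the correct phoneme list, which may equal the guess.
    """
    dictionary, rules, corrections = {}, {}, 0
    for word in words:
        guess = predict(rules, word)
        confirmed = verify(word, guess)
        if confirmed != guess:
            corrections += 1  # the speaker had to fix the prediction
        dictionary[word] = confirmed
        rules = train_rules(dictionary)
    return dictionary, rules, corrections


# Invented toy words (not from any real language); the oracle simulates
# the speaker's answers.
oracle = {
    "cab": ["k", "a", "b"],
    "abac": ["a", "b", "a", "k"],
    "bacca": ["b", "a", "k", "k", "a"],
}
dictionary, rules, corrections = bootstrap(
    list(oracle), lambda word, guess: oracle[word]
)
print(corrections, rules)
```

In this toy run only the first word needs a correction, because the rules learned from it already cover the remaining words. Real languages need context-dependent rules and a proper grapheme-to-phoneme alignment, which this sketch deliberately leaves out.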

DictionaryMaker was used to create the pronunciation rules for the isiZulu Text-to-Speech System and the Sepedi Speech Recognition System.

Publications

M. Davel and E. Barnard, "The Efficient Generation of Pronunciation Dictionaries: Human Factors during Bootstrapping", In Proceedings of the 8th International Conference on Spoken Language Processing, Korea, 2004.

M. Davel and E. Barnard, "The Efficient Generation of Pronunciation Dictionaries: Machine Learning Factors during Bootstrapping", In Proceedings of the 8th International Conference on Spoken Language Processing, Korea, 2004.

M. Davel and E. Barnard, "Bootstrapping in HLT Resource Generation", In Proceedings of the Pattern Recognition Association of South Africa Symposium, November 2003.

   
  Contact: Marelie Davel +27 12 841 2466 mdavel@csir.co.za
   
Copyright © Meraka Institute 2007