Meraka Institute

   

Human Language Technologies (HLT) - Speech Recognition

Automatic speech recognition gives computer-based systems the ability to understand human speech. Although people have long dreamt of computers that understand natural speech, that goal is still some way off. Currently, the best that we can do is to understand carefully spoken speech on a limited set of topics by any speaker (speaker-independent speech recognition), or speech by one speaker only on a wider set of topics (speaker-dependent speech recognition).

Despite these limitations, speech recognition is already of great practical value. People who are not able to type can, for example, use speaker-dependent recognition to operate computers with their voice instead of a keyboard. Speaker-independent recognition, in turn, forms the core of computer systems that interact with people over the telephone, giving them the ability to obtain information or perform transactions through spoken interaction. This capability has great potential in the developing world, where the lack of extensive computer networking infrastructure and limited technological literacy are significant contributors to the digital divide.

Our research group focuses on speaker-independent recognition, principally for telephone-based applications. In collaboration with the University of the North, for example, we have developed an initial version of a speaker-independent recognizer for the Sepedi dialect of Northern Sotho. This system, which is based on data collected by J. Manamela, recognizes words in terms of their phonetic decomposition, and was developed using the HTK toolkit from Cambridge University.

In collaboration with Intelleca Voice and Mobile, we are also developing recognition modules for various South African languages. These modules will operate within ScanSoft's commercial OpenSpeech Recognizer, and are intended for applications such as banking or travel reservations.

Although speech recognition has been the focus of intensive research for several decades, it is still not clear that we have found the optimal approach to this surprisingly demanding task. In our research group we are studying alternative approaches to speech recognition. Willie Smit is, for example, looking at ways to use temporally staggered events as the basis for acoustic recognition; a preliminary report on his work is available.

 

Kagiso Chikane providing some linguistic advice to the research team

 

The research team (from left to right): Tebogo Modiba, Marelie Davel, Etienne Barnard, Jonas Manamela and Kope Mamadisa

Links

First Sepedi Speech Recognition system developed at the CSIR
Speech recognition SA's top innovation

   
  Contact: Marelie Davel +27 12 841 2466 mdavel@csir.co.za
   
Copyright © Meraka Institute 2007