Meraka Institute

   

Human Language Technologies (HLT) – Automatic Speech Recognition

Automatic speech recognition (ASR) systems give computers the ability to process and “understand” human speech. The technology has been very successful in two specific settings: speaker-independent speech recognition, which understands speech from any speaker but on a limited set of topics, and speaker-dependent speech recognition, which understands speech on a much wider range of topics but from one speaker only.

Both types of ASR system have many applications of great practical value. Speaker-dependent ASR, for example, is used in dictation systems and allows people who are not able to type to operate computers with their voice instead of a keyboard. Speaker-independent recognition forms the core of computer systems that interact with people over the telephone, allowing callers to obtain information or perform transactions through spoken interaction. This capability has great potential in the developing world, where limited technological literacy and the unavailability of extensive computer networking infrastructure are significant contributors to the digital divide.

Our research group focuses on speaker-independent speech recognition, principally for telephone-based applications. Most of our current research focuses on building Hidden Markov Model (HMM) based systems with very limited speech resources. As part of project Lwazi we have collected speech corpora in all of South Africa's official languages. Although this is the most extensive freely available ASR corpus collected for African languages to date, the annotated corpus of 2 hours per language is a much smaller resource than is typically available for speech recognition. Innovative techniques are therefore required to utilise the data effectively. Current research in the group includes:

  • Techniques to share data across languages.
  • Techniques to evaluate language distances, in order to better understand the practical (rather than historical) relationships among languages and dialects.
  • Techniques to combine data collected under very different channel conditions.
  • The development of ASR-builder, a research tool that supports the rapid development of ASR systems in new languages.

The ASR group collaborates widely, working with the University of the North (the development of the first speaker-independent recognizer for Sepedi), the University of Stellenbosch (ASR for language learning), North-West University (ASR resource collection) and Intelleca Voice and Mobile (support for the development of commercial speech recognition systems).

Links:

First Sepedi Speech Recognition system developed at the CSIR

Speech recognition SA's top innovation

ASR-builder: Tool for the development of HTK-based ASR systems

   
  Contact: Marelie Davel +27 12 841 2466 mdavel@csir.co.za
   
Copyright © Meraka Institute 2007