Working with Visionaries on the Frontlines of Scientific Progress Worldwide
Nuance Foundation Grants

Efficient Manifold-Constrained Discriminative Acoustic Modeling for Automatic Speech Recognition

Description: McGill University

With some 300 buildings, more than 37,500 students and 225,000 living alumni, and a reputation for excellence that reaches around the globe, McGill has carved out a spot among the world's great universities.

Project: Efficient Manifold-Constrained Discriminative Acoustic Modeling for Automatic Speech Recognition

Richard Rose

The proposed project will investigate techniques that make more efficient use of speech data for acoustic model training in ASR. The first goal is to enable efficient configuration of initial ASR systems in previously unseen languages. This will be done by leveraging speech data from multiple languages to configure an initial ASR system in an under-resourced target language. The second goal is to develop acoustic modeling formalisms that maximize ASR performance using a minimum of annotated speech data. Deep neural networks (DNNs, also known as DBNs) will be used to achieve this. The project will use existing speech data, much of which will be used for unsupervised learning.

Publications:

Sina Hamidi Ghalehjegh and Richard Rose, “Regularized constrained maximum likelihood linear regression for speech recognition”, in the IEEE International Conference on Acoustics, Speech, and Signal Processing, Florence, Italy, May 2014.

Sina Hamidi Ghalehjegh and Richard Rose, “Two-stage speaker adaptation in subspace Gaussian mixture models”, in the IEEE International Conference on Acoustics, Speech, and Signal Processing, Florence, Italy, May 2014.

Aanchan Mohan, Richard Rose, Sina Hamidi Ghalehjegh, and S. Umesh, “Acoustic modelling for speech recognition in Indian languages in an agricultural commodities task domain”, Speech Communication, Special Issue on Processing Under-resourced Languages, Elsevier, January 2014.
