Université de Montréal: Scaling Up Deep Learning
This project is focused on methods aimed at scaling up deep learning, on three fronts: better numerical optimization (in particular to deal better with saddle points, which may end up being the major stumbling block for current training procedures), conditional computation (where, for each example, only a subset of the model is activated), and distributed training (across many computing nodes).
Université de Montréal: Deep Structured Output Models
This project is focused on the expansion and improvement of deep learning techniques beyond simple classification and into the realm of structured output learning tasks.
Cambridge University: Learning Type-Driven Distributed Representations of Language
Modelling the meanings of sentences and documents is critical in many applications. Earlier projects in this area developed a complete theory of how to build sentence vectors within a type-driven, tensor-based framework using the grammar formalism Combinatory Categorial Grammar (CCG). However, a number of open questions remain about the nature of the sentence space and about how the tensor-based representations can be learned from data. This project will focus on those questions.
MIT: Unsupervised Learning of Invariant Word Representations: Machine Learning of Speech Like Children Do
Université de Montréal RALI: Domain specific knowledge extraction from unstructured text with or without semantic databases
The project addresses the automatic learning of huge numbers of facts (like "Shakespeare is an author" or "Shakespeare wrote Romeo and Juliet") from unstructured text, and will try to improve the current state of the art in three directions.
University of Groningen (NL): Parsing Algorithms for Uncertain Input
The automated analysis of natural language is an important ingredient for future applications that require the ability to understand natural language. For carefully edited texts, current algorithms now obtain good results.
Harvard: Intelligent Agents to Support Health Care Coordination and Communication
Teamwork and effective mechanisms for care coordination are of increasing importance to health care delivery and patient safety and health. This proposal describes a project aimed at enabling intelligent agent systems to be collaborative partners of patients and their health care providers supporting both care coordination and improved communication of medical information to patients and their families.
University of Stuttgart: Structurally informed methods for improved sentiment analysis
The task of sentiment analysis is to automatically identify the opinions people express in natural language about some topic (such as a product or a work of art). For instance, the sentence "auto white balance is better than of the other canon cameras I have and they already do very-well" from a product review expresses a positive opinion about a specific camera.
University of Stuttgart: Wikipedia-Based Named Entity Identification
One of the central tasks of natural language understanding is to identify the real world entity that a given linguistic expression is referring to. Solving this problem has become more realistic with the advent of machine readable repositories such as Wikipedia which provide unique identifiers of a large number of real world entities.
University of Rochester: Automatic Acquisition of a Deep Semantic Lexicon
While researchers agree that deep understanding will require combining natural language processing with knowledge and reasoning systems, very little effort is currently being made to bring these areas together in a serious way. One main reason for this is a lack of extensive semantic lexical resources that connect words to knowledge bases.
University of Edinburgh: Automated Recognition of Concurrent Discourse Relations
This project focuses on the fact that discourse adverbials (instead, after all...) often express not only their own discourse relations but also the relations that would typically be expressed by discourse conjunctions (but, so...). The fact that these adverbials express two relations concurrently runs contrary to the conventional notion that a single discourse marker conveys a single relation. The ability to recognize and interpret concurrent relations is important for accuracy in tasks such as translation, summarization, and question answering. The project will create a corpus as a basis for future research, a tagging system, and a test of cross-domain performance.
UC Santa Cruz: Learning Generation Dictionaries for Dialogue Interaction
Recently there has been an explosion in applications for dialogue interaction ranging from direction-giving and tourist information to interactive story systems. Yet the natural language generation (NLG) component of most interactive dialogue systems remains largely handcrafted. This limitation greatly restricts the range of applications for dialogic interaction. The project proposes that a solution to this problem lies in new methods for developing language generation resources for new domains by automatically learning new ways to express system dialogue goals.
University of Constance: Tense and Aspect in Multilingual Semantic Construction
The project is motivated by the fact that the conventional approaches to tense and aspect, both theoretical and computational, are based primarily on the properties of European languages and in some cases on the special characteristics of English. Consideration of a broader range of typologically diverse languages shows that the morphosyntactic encodings of these notions and their semantic interpretations are much more varied, and are not well covered by current theories and implementations. The project will do a systematic study of these and related phenomena, based on the existing computational grammars for a collection of languages that have been developed over the last 20 years by participants in the Pargram project.
MUSAE Lab Montreal: Blind Room Acoustics Characterization for Improved Far-Field Voice-Based Human-Machine Interfaces
The project addresses the degradation of speech technology in noisy environments. This degradation is mainly due to two factors: ambient noise and room reverberation. In practice, advanced speech enhancement methods, such as microphone arrays and beamformers, are used to overcome this limitation. Existing algorithms, however, are unaware of the surrounding environment (e.g., ambient noise level and type, or reverberation levels) and thus achieve sub-optimal results. This project aims to fill this gap and will develop the building blocks needed to enable adaptive, environment-aware speech enhancement. More specifically, it is proposed to use an auditory-inspired speech signal representation that has been shown to decouple speech and room acoustics components.
RWTH Aachen University: Decipherment-Based Machine Translation
Conventional statistical machine translation (MT) systems require a large set of (source, target) sentence pairs for training. Such a bilingual corpus is not available for all language pairs (and not for all domains). On the other hand, a huge amount of monolingual data is available for many languages so that a high-quality language model can be learned. From this point of view, the training problem for statistical machine translation can be reformulated as a decipherment problem: we are given a (maybe very) large set of sentences in the source language, and we want to find the associated translations in the target language, whose language model is given. Due to the implicit word-to-word correspondences between the two languages, this formulation amounts to a combinatorial problem, in which both the (source word, target word)-pairs and their probabilities are unknown.
Progress and final reports for our grant projects are available upon request.