Cambridge University: Cambridge Language Sciences


Cambridge Language Sciences is a Strategic Research Initiative of the University of Cambridge. This multi-disciplinary research community includes over 150 Principal Investigators, many of whom are considered world experts in their fields.

Project: Learning Type-Driven Distributed Representations of Language

Stephen Clark

Combining Distributional and Compositional Semantic Models: Despite their widespread use, vector-based models are typically directed at representing words in isolation, and until recently methods for constructing representations of larger phrases or sentences had received little attention in the literature. Yet modelling the meanings of sentences and documents is critical in many applications. Earlier projects in this area developed a complete theory of how to build sentence vectors within a type-driven, tensor-based framework using the grammar formalism Combinatory Categorial Grammar (CCG). However, a number of open questions remain concerning the nature of the sentence space and how the tensor-based representations can be learned from data. This project will focus on those questions.
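
As a rough illustration of what type-driven, tensor-based composition can look like, the sketch below assumes nouns are vectors, an adjective (CCG category N/N) is a matrix, and a transitive verb (CCG category (S\NP)/NP) is an order-3 tensor that is contracted with its subject and object to give a sentence vector. The dimensions, the numbers, the words, and the helper names `apply_matrix` and `apply_verb` are all hypothetical and chosen only for the example. They are not taken from the project, and nothing here is learned from data.

```python
# Illustrative sketch only: toy dimensions and made-up numbers, not learned values.

def apply_matrix(matrix, vector):
    """Contract a matrix with a vector: result[i] = sum_j matrix[i][j] * vector[j]."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, vector)) for row in matrix]


def apply_verb(verb, subj, obj):
    """Contract an order-3 tensor verb[s][i][j] with subject (index i) and object (index j)."""
    return [
        sum(verb_s[i][j] * subj[i] * obj[j]
            for i in range(len(subj))
            for j in range(len(obj)))
        for verb_s in verb
    ]


# Hypothetical 2-dimensional noun space and 2-dimensional sentence space.
dogs = [1.0, 0.0]
cats = [0.0, 1.0]

# Adjective (category N/N) as a 2x2 matrix mapping noun vectors to noun vectors.
big = [[2.0, 0.0],
       [0.0, 1.0]]

# Transitive verb (category (S\NP)/NP) as a 2x2x2 tensor
# indexed [sentence dimension][subject dimension][object dimension].
chase = [
    [[0.0, 1.0], [0.0, 0.0]],
    [[0.0, 0.0], [1.0, 0.0]],
]

big_dogs = apply_matrix(big, dogs)                    # noun phrase vector
big_dogs_chase_cats = apply_verb(chase, big_dogs, cats)
cats_chase_big_dogs = apply_verb(chase, cats, big_dogs)

print(big_dogs_chase_cats)  # [2.0, 0.0]
print(cats_chase_big_dogs)  # [0.0, 2.0]
```

Because the verb tensor has separate indices for subject and object, swapping the two arguments yields a different sentence vector, which a simple sum of word vectors could not capture. How such tensors might be learned from data, and what the sentence space should represent, are the open questions described above.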

