STORM: an unsupervised connectionist model for language acquisition

McQueen, T, 2005. STORM: an unsupervised connectionist model for language acquisition. PhD, Nottingham Trent University.

Abstract

Language acquisition is one of the core problems in artificial intelligence. Current performance bottlenecks in natural language processing (NLP) systems result from a prerequisite for an incalculable amount of language- and domain-specific knowledge. Consequently, the creation of an automated language acquisition system would revolutionize the field of NLP. Connectionist models that learn by example (i.e. artificial neural networks) have been successfully applied to many areas of language acquisition. However, the most widely used class of these models, known as supervised connectionist models, has a number of major limitations, including an inability to represent variables and a limited ability to generalize from sparse data. Such limitations have prevented connectionist models from being applied to large-scale language acquisition.

This research considers the alternative and less widely used class of unsupervised connectionist models and investigates whether such models can capture the finite-state properties of language. A novel unsupervised connectionist model, STORM (Spatio Temporal Self-Organizing Recurrent Map), is proposed that uses a memory-rule-based approach to learn a regular grammar from a set of positive example sequences. STORM's learning algorithm uses a derivation of functional-equivalence theory that allows the model to learn via similarity of behavior, rather than just similarity of form. This novel functional generalization ability allows STORM to learn a perfect and stable representation of the Reber grammar from a sparse training set of just 30 sequences, as opposed to the 60,000 sequences required to train a supervised connectionist model. Unlike supervised models, once STORM has learnt the grammar it can generalize to test sequences of any length or depth of embedding.
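For readers unfamiliar with the benchmark, the sketch below shows what a set of positive example sequences from the Reber grammar might look like. It is illustrative only and is not STORM: the transition table is the form of the Reber grammar commonly drawn in the literature (an assumption, not taken from the thesis), and the names `REBER`, `accepts`, `generate` and `sample_training_set` are hypothetical.

```python
import random

# Finite-state transition table for the Reber grammar as it is commonly
# presented in the literature. Assumption: the thesis may draw or number
# the grammar differently; this table is not taken from it.
REBER = {
    0: {"B": 1},
    1: {"T": 2, "P": 3},
    2: {"S": 2, "X": 4},
    3: {"T": 3, "V": 5},
    4: {"X": 3, "S": 6},
    5: {"P": 4, "V": 6},
    6: {"E": 7},
}
ACCEPT = 7  # state reached after the closing "E"


def accepts(sequence):
    """Return True if the sequence is a complete, grammatical Reber string."""
    state = 0
    for symbol in sequence:
        state = REBER.get(state, {}).get(symbol)
        if state is None:  # no legal transition for this symbol
            return False
    return state == ACCEPT


def generate(rng):
    """Walk the grammar from the start state, choosing arcs uniformly."""
    state, symbols = 0, []
    while state != ACCEPT:
        symbol = rng.choice(sorted(REBER[state]))
        symbols.append(symbol)
        state = REBER[state][symbol]
    return "".join(symbols)


def sample_training_set(n, seed=0):
    """Draw n positive examples; 30 mirrors the sparse set size in the abstract."""
    rng = random.Random(seed)
    return [generate(rng) for _ in range(n)]


if __name__ == "__main__":
    for sequence in sample_training_set(30):
        print(sequence)
```

Because the grammar contains loops (repeated `S` or `T`, and the `X`/`P` arcs that re-enter earlier states), it generates strings of unbounded length, which is why generalizing from a small sample to sequences of any length is a meaningful test.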

Extensions to the model are proposed to show how STORM can learn context-free grammars. These extensions also solve the logical problem of language acquisition by recovering from overgeneralizations without the need for negative evidence.

Item Type: Thesis
Creators: McQueen, T.
Date: 2005
ISBN: 9781369314489
Identifiers: PQ10183172 (Type: Other)
Divisions: Schools > School of Science and Technology
Record created by: Linda Sullivan
Date Added: 18 Sep 2020 13:48
Last Modified: 27 Jul 2023 09:18
URI: https://irep.ntu.ac.uk/id/eprint/40822
