Optimizing sluggish state-based neural networks for effective time-series processing

Orojo, O.O., 2021. Optimizing sluggish state-based neural networks for effective time-series processing. PhD, Nottingham Trent University.

Oluwatamilore Orojo 2022.pdf - Published version


Abstract

The devastating personal and economic upheaval caused by the financial crisis of 2007/2008 and, more recently, the spread of Covid-19 from February 2020 to date (June 2021) strongly highlighted the need for effective time-series processing models that can provide useful insights and accurate forecasts in a timely manner to inform critical decision-making that affects lives and livelihoods. Thus, this research is focused on identifying an effective and suitable time-series approach that harnesses the advantages of current state-of-the-art forecasting models whilst mitigating their challenges. A critical review of existing state-of-the-art methods revealed that two key attributes are required for effective time-series processing: a robust yet flexible memory mechanism, and minimal computational complexity for modelling complex dynamic time-series. The Multi-recurrent Neural Network (MRN) was identified as the preferred model and subjected to critical examination and enhancement due to its unique and powerful sluggish state-based memory mechanism, which has largely gone unnoticed since its first introduction by Claudia Ulbricht from the University of Austria in 1994.
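The idea of a sluggish state-based memory can be illustrated with a minimal sketch. It is not taken from the thesis: the function names, the number of banks, and the graded ratio scheme (bank k takes k/n of the new hidden state and retains 1 - k/n of its own previous value) are assumptions made for illustration, and the hidden states are supplied by hand rather than produced by a trained network.

```python
def make_ratios(n_banks):
    """Hypothetical graded (layer_link, self_link) pairs, one per memory bank.

    Bank k copies k/n of the latest hidden state and keeps 1 - k/n of its
    own previous value, so low-k banks change slowly ("sluggishly").
    """
    return [(k / n_banks, 1 - k / n_banks) for k in range(1, n_banks + 1)]


def update_banks(banks, hidden, ratios):
    """Advance every memory bank by one time step."""
    return [
        [layer * h + self_ * c for h, c in zip(hidden, bank)]
        for bank, (layer, self_) in zip(banks, ratios)
    ]


ratios = make_ratios(4)
banks = [[0.0, 0.0] for _ in ratios]

# One impulse followed by two silent steps (illustrative hidden states).
for hidden in ([1.0, -1.0], [0.0, 0.0], [0.0, 0.0]):
    banks = update_banks(banks, hidden, ratios)

for (layer, self_), bank in zip(ratios, banks):
    print(f"layer-link={layer:.2f} self-link={self_:.2f} -> {bank}")
```

In this sketch the bank with a layer-link ratio of 1.0 simply mirrors the most recent hidden state, while the banks with larger self-link ratios still carry a decaying trace of the earlier impulse. That is the sense in which such a memory spans several time scales at once.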

This thesis subsequently makes the following four contributions to the research field: a) the MRN was applied to real-world temporal problems of varying complexity to which it had not previously been applied. It was then compared to current state-of-the-art forecasting methods, where it demonstrated superior performance, and was critically assessed to identify limitations and points of extension; b) the MRN's hidden layer was endowed with periodically attentive units to tackle two well-known issues affecting artificial neural networks: the vanishing gradient problem and catastrophic interference. This innovation encouraged the network to organise features according to different units of time, thereby reducing the information processing load placed on individual hidden units and alleviating catastrophic interference. In addition, the network was able to hold information for longer periods of time, as the unit partitions only responded at specific time intervals. This provided a means of mitigating the vanishing gradient problem, which in most instances led to better performance; c) the MRN was endowed with an innovative self-learning mechanism to reduce user input and identify architectural hyper-parameters. This extension enabled the MRN to inform and enhance its internal memory composition (and thus quality) by incorporating Ratio Control Units to learn the layer-link ratios. This technique provided a new outlook on algorithm development, pointing in particular to the ability of recurrent neural networks such as the MRN to innately learn the importance of historical context rather than relying on hyper-parameters manually set by the user; and d) a framework incorporating one of the proposed MRN innovations together with a one-shot pruning algorithm (based on the learnt ratio similarity) was proposed. The framework provided a means of obtaining 'good' models without the need to train numerous models to exhaustively explore the search space for the optimum memory bank configuration. It simply requires one large over-parameterised MRN to be trained. The pruning algorithm then automatically identifies the optimum memory bank configuration in a robust manner that minimises the coupling of memory banks whilst maximising, or at least retaining, strong generalisation ability.
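Pruning by ratio similarity can be sketched roughly as follows. This is an illustrative guess at the general idea, not the algorithm from the thesis: the function name `prune_similar_banks`, the tolerance `tol`, the greedy grouping rule, and the example ratio values are all hypothetical.

```python
def prune_similar_banks(ratios, tol=0.05):
    """Return the indices of memory banks to keep.

    Banks are visited in order of their learnt layer-link ratio. A bank is
    kept only if its ratio differs by more than `tol` from the ratio of the
    most recently kept bank, so each group of near-identical ratios is
    represented by a single bank.
    """
    kept = []
    for i in sorted(range(len(ratios)), key=lambda i: ratios[i]):
        if not kept or ratios[i] - ratios[kept[-1]] > tol:
            kept.append(i)
    return sorted(kept)


# Made-up ratios standing in for values learnt by one over-parameterised model.
learnt = [0.91, 0.52, 0.50, 0.12, 0.93, 0.49]
keep = prune_similar_banks(learnt)
print(keep)
```

With these made-up values, six banks collapse to three: one slow, one intermediate, and one fast. A single pass over the learnt ratios of one trained model replaces a search over many separately trained configurations, which is the motivation the abstract gives for a one-shot approach.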

Finally, a critical discussion of the proposed innovations is given, together with a number of insights and suggestions for further work. The key areas for improvement are: i) employing different activation functions for the ratio learning; ii) extending the pruning algorithm to prune based not only on ratio similarity but also on memory bank importance; iii) learning the self-link ratios rather than just the layer-link ratios; iv) developing deep MRNs for more complex temporal problems; and v) applying knowledge extraction techniques to understand the quality of the underlying state-based representations formed. The work of this thesis has been published in the IEEE Symposium Series on Computational Intelligence, the International Joint Conference on Neural Networks and the Applied Intelligence and Informatics Conference.

Item Type: Thesis
Creators: Orojo, O.O.
Date: June 2021
Rights: The copyright in this work is held by the author. You may copy up to 5% of this work for private study, or personal, non-commercial research. Any re-use of the information contained within this document should be fully referenced, quoting the author, title, university, degree level and pagination. Queries or requests for any other use, or if a more substantial copy is required, should be directed to the author.
Divisions: Schools > School of Science and Technology
Record created by: Linda Sullivan
Date Added: 24 Mar 2022 10:36
Last Modified: 24 Mar 2022 10:38
URI: https://irep.ntu.ac.uk/id/eprint/45956
