Peering inside the ‘black box’: Understanding and refining deep neural networks with cognitive science


School of Electronics, Electrical Engineering and Computer Science
& ECIT Global Research Institute

Proposed Project Title: 
Peering inside the ‘black box’: Understanding and refining deep neural networks with cognitive science

Principal Supervisor:   Dr Barry Devereux
Second Supervisor: Dr Jesus Martinez del Rincon
Third Supervisor: Dr Brian Murphy

Project Description:

Large-scale deep and recurrent neural networks are today the dominant modelling framework for a variety of cognitive-level machine learning problem domains. They achieve close to human-level performance on many tasks, including visual object recognition, speech recognition and machine translation. However, an infamous problem with large-scale neural network models is that they are “black boxes”: the functional organization of such networks – how they represent and compute task-relevant information – is often obscure to the researchers who trained them. This problem has motivated recent work investigating techniques for visualizing and understanding the kinds of information neural network models can represent, and how they process that information (e.g. Karpathy et al., 2015; Zeiler & Fergus, 2014; Strobelt et al., 2016). We can think of these approaches as a kind of “artificial cognitive neuroscience”: statistical techniques similar to those developed to analyse human brain imaging data (e.g. Kriegeskorte & Kievit, 2015) can be used to investigate how information is represented across distributed patterns of activation in an artificial neural network. After gaining insights into how neural networks build cognitive-level representations in this way, we can in turn use those insights to guide and refine our engineering of neural network architectures.
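To illustrate this “artificial cognitive neuroscience” approach, the sketch below applies representational similarity analysis (a standard technique from the brain-imaging literature) to network activations. All data here are simulated placeholders: in practice, the rows of `layer` would be a network layer's activation patterns for a set of stimuli, and `reference` would be a reference representation such as brain-imaging patterns for the same stimuli.

```python
import numpy as np

def rdm(activations):
    """Representational dissimilarity matrix: pairwise correlation
    distance (1 - Pearson r) between stimulus activation patterns
    (one row per stimulus)."""
    return 1.0 - np.corrcoef(activations)

def rsa_score(layer_acts, reference_acts):
    """Spearman correlation between the upper triangles of two RDMs,
    the standard second-order comparison in RSA."""
    a, b = rdm(layer_acts), rdm(reference_acts)
    iu = np.triu_indices_from(a, k=1)
    x, y = a[iu], b[iu]
    # Spearman correlation = Pearson correlation of the ranks
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    return np.corrcoef(rx, ry)[0, 1]

# Toy example: 20 simulated "stimuli". The model layer is a random
# linear transform of the reference representation, so its similarity
# structure should partially match the reference's.
rng = np.random.default_rng(0)
reference = rng.standard_normal((20, 50))           # e.g. fMRI patterns
layer = reference @ rng.standard_normal((50, 64))   # e.g. hidden-layer states
print(round(rsa_score(layer, reference), 3))
```

The same score can be computed layer by layer across a network to localize where particular kinds of information are encoded.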

This project will adopt language modelling with recurrent neural networks as the target problem domain, since there are well-developed linguistic and cognitive theories about language processing which we can relate to the functional properties of the network models. The project will be organised around two complementary objectives. Firstly, building on recent work (Linzen et al., 2016; Bernardy & Lappin, 2017), we will investigate how the semantic and grammatical knowledge of sentences is encoded in neural network language models, by relating the multivariate activation patterns in the layers of the network to the known linguistic properties of those sentences. In this way, we will build a scientific understanding of how neural networks can encode and process linguistically relevant information. Secondly, we will use publicly available human neuroimaging data collected during language comprehension tasks (Hanke et al., 2014) to refine neural network language models. Specifically, we will use the distributed patterns of information in the neuroimaging data to regularize the organization of activation patterns in the models' hidden layers (Wu et al., 2017), inducing a degree of functional correspondence between the artificial network and the brain data. By utilising both language corpus data and human neuroimaging data in training the language models in this way, our goal is to build more interpretable language models that show improved performance on a variety of real-world language tasks.
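A minimal sketch of the second objective's training signal is given below, under the assumption that the brain-based regularizer penalizes mismatch between the similarity structure of a hidden layer and that of the neuroimaging data. This is one illustrative formulation of similarity-based regularization, not the exact method of Wu et al. (2017); all names and data are hypothetical.

```python
import numpy as np

def similarity_matrix(x):
    """Cosine similarity between row patterns (one row per stimulus)."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    return x @ x.T

def brain_regulariser(hidden, brain, weight=0.1):
    """Penalty encouraging the hidden layer's similarity structure to
    match that of the neuroimaging data for the same stimuli."""
    diff = similarity_matrix(hidden) - similarity_matrix(brain)
    return weight * np.mean(diff ** 2)

def total_loss(lm_loss, hidden, brain, weight=0.1):
    """Joint objective: corpus language-modelling loss plus the
    brain-based penalty on the hidden representations."""
    return lm_loss + brain_regulariser(hidden, brain, weight)

# Toy example: 10 stimuli seen by both the model and the scanner.
rng = np.random.default_rng(1)
brain = rng.standard_normal((10, 200))   # simulated fMRI patterns
hidden = rng.standard_normal((10, 128))  # simulated hidden states
print(total_loss(2.5, hidden, brain))    # LM loss plus penalty
```

Minimizing this joint objective during training pulls the hidden layer's representational geometry towards that of the brain data while still fitting the language-modelling task.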


Bernardy, J. P., & Lappin, S. (2017). Using deep neural networks to learn syntactic agreement. Linguistic Issues in Language Technology (LiLT), 15.

Hanke, M., Baumgartner, F. J., Ibe, P., Kaule, F. R., Pollmann, S., Speck, O., ... & Stadler, J. (2014). A high-resolution 7-Tesla fMRI dataset from complex natural stimulation with an audio movie. Scientific Data, 1, 140003.

Karpathy, A., Johnson, J., & Fei-Fei, L. (2015). Visualizing and understanding recurrent networks. arXiv preprint arXiv:1506.02078.

Linzen, T., Dupoux, E., & Goldberg, Y. (2016). Assessing the ability of LSTMs to learn syntax-sensitive dependencies. arXiv preprint arXiv:1611.01368.

Strobelt, H., Gehrmann, S., Huber, B., Pfister, H., & Rush, A. M. (2016). Visual analysis of hidden state dynamics in recurrent neural networks. arXiv preprint arXiv:1606.07461.

Wu, C., Gales, M., Ragni, A., Karanasou, P., & Sim, K. C. (2017). Improving Interpretability and Regularisation in Deep Learning. IEEE/ACM Transactions on Audio, Speech, and Language Processing.

Contact details

Supervisor Name: Barry Devereux 
Tel: +44 (0)28 9097 1705
QUB Address: ECIT, Queen’s University, Queen’s Road, Belfast BT3 9DT