Large Scale Ensemble Knowledge Transfer in Deep Metric Learning for Face Recognition
Principal Supervisor: Prof. Neil M. Robertson
Second Supervisor: Dr Jesus Martinez del Rincon
+ Project Description
Knowledge distillation into a smaller neural network from the posterior of an ensemble of large networks holds a lot of promise for democratising the power of deep neural networks, allowing them to run on low-power devices such as smartphones without losing accuracy. As shown by Hinton et al. (2015), the posterior probability distribution of a neural network contains a lot of meta-information that is not present in hard targets. For classification, this knowledge can be exploited by raising the temperature of the softmax function, which makes the posteriors more diffuse. When these diffuse posteriors from multiple networks are averaged, the combined posterior can be used to train a single network that has the properties of the ensemble. In practice, that single network can be smaller than any of the original networks.
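The classification case described above can be illustrated with a minimal sketch. It assumes three teacher networks and a three-class problem; the logit values, the temperature of 4.0 and the function names (`softmax`, `ensemble_soft_targets`, `cross_entropy`) are hypothetical and chosen only for illustration, not taken from the project.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature; a higher temperature gives a more diffuse output."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_soft_targets(all_logits, temperature):
    """Average the temperature-softened posteriors of several teacher networks."""
    softened = [softmax(logits, temperature) for logits in all_logits]
    n = len(softened)
    return [sum(column) / n for column in zip(*softened)]

def cross_entropy(targets, predictions):
    """Cross-entropy between soft targets and the student's softened output."""
    return -sum(t * math.log(p) for t, p in zip(targets, predictions))

# Hypothetical logits from three teachers for a single training example.
teacher_logits = [[5.0, 2.0, 0.5], [4.0, 3.0, 0.0], [6.0, 1.0, 1.0]]
soft_targets = ensemble_soft_targets(teacher_logits, temperature=4.0)

# Hypothetical student logits; the student is trained to minimise this loss.
student_logits = [2.0, 1.0, 0.0]
loss = cross_entropy(soft_targets, softmax(student_logits, temperature=4.0))
```

The student is scored at the same temperature as the teachers, so both distributions are softened in the same way before they are compared.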
Although this is a well-defined problem for classification, it is unclear how to distil knowledge in metric learning. Even if all networks learn an embedding of the same dimension, each might find a different spanning basis for the space, or even map into different regions of that space. To overcome this, we propose to learn a unified metric space that can also be sampled to generate human faces. The motivation for this approach comes from Blanz and Vetter (1999), who show that human faces can be represented in a high-dimensional parametric space. This parametric space captures not only identity but also factors such as gender, ethnicity, facial additions (beards, hair changes, glasses) and expression.
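A toy sketch shows why embeddings cannot simply be averaged the way posteriors are. It assumes two hypothetical networks that learn the same 2-D geometry in bases that differ by a rotation of 180 degrees; the points and helper names (`rotate`, `distance`) are made up for illustration.

```python
import math

def rotate(point, angle):
    """Rotate a 2-D point about the origin by the given angle in radians."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

def distance(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Hypothetical embeddings of three faces produced by network A.
emb_a = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
# Network B encodes the same geometry in a basis rotated by 180 degrees.
emb_b = [rotate(p, math.pi) for p in emb_a]

# Each network on its own preserves the distance between faces 0 and 1.
d_a = distance(emb_a[0], emb_a[1])
d_b = distance(emb_b[0], emb_b[1])

# Naively averaging the two embeddings collapses the same pair of faces.
averaged = [((pa[0] + pb[0]) / 2, (pa[1] + pb[1]) / 2)
            for pa, pb in zip(emb_a, emb_b)]
d_avg = distance(averaged[0], averaged[1])
```

Both networks are equally good metrics in isolation, yet their average carries no information, which is why a shared target space is needed before any ensemble transfer.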
One of the main research problems to be solved in this project is finding a basis of this space that is discriminative for all of these properties. All the networks of an ensemble can then learn a mapping to this space. Once that mapping is learned, the research problem becomes how to transfer the knowledge from this space, using a fitted soft distribution over the outputs of the ensemble, into a small network that can be embedded on smartphones or smart cameras.
Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. "Distilling the knowledge in a neural network." arXiv preprint arXiv:1503.02531 (2015).
Blanz, Volker, and Thomas Vetter. "A morphable model for the synthesis of 3D faces." Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques. ACM Press/Addison-Wesley Publishing Co., 1999.
+ Project Milestones
The main objectives for the PhD will be:
1. Show that a smaller network can have the same accuracy as an ensemble of large networks.
2. Solve the metric learning problem, i.e. distil knowledge from an ensemble whose networks may not share an embedding basis.
3. Demonstrate resilience to pose and illumination changes for large-scale face recognition.
4. Integrate a tool for interacting with the network using natural language (e.g. “show me an older version of my face, smiling with glasses”).
In particular the outputs of the PhD will aim to:
1. Speed up standard face search by a significant factor.
2. Produce more accurate learning and recognition, together with additional information (pose, ethnicity, expression, age) that goes well beyond simple identity matching and enables intelligent search and reconstruction of faces.
3. Enable natural language search over all the learned parameters.
+ How to Apply
Applicants should apply electronically through the Queen’s online application portal at: https://dap.qub.ac.uk/portal/
+ Contact Details
Supervisor Name: Dr. Ivor Spence
Queen’s University Belfast