School of Electronics, Electrical Engineering and Computer Science
& ECIT Global Research Institute
Proposed Project Title: Mining Graphlet Statistics From Online Social Networks
Principal Supervisor: Hans Vandierendonck
Second Supervisor: Cheng Long
Graphlet statistics of social networks (e.g., counts of wedges, triangles and other k-node patterns) capture core knowledge about those networks and are of use to many parties, including marketing companies, government authorities and social scientists. For example, a UK government authority might want to know how many pairs of Facebook friends consist of one user from the UK and one from China, since that number would indicate the strength of the links between UK and Chinese citizens.
Mining graphlet statistics of real social networks is challenging because the networks are massive. The problem is made harder by the fact that real social networks are not fully accessible to users other than their owners; they can be reached only through rate-limited APIs. This project will employ sampling techniques (e.g., random walks) to collect a portion of a social network (e.g., Twitter) and then construct estimates of graphlet statistics from that sample using statistical techniques (e.g., the Horvitz-Thompson estimator).
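As a rough illustration of the sample-then-estimate idea, the sketch below runs a simple random walk over a toy friendship graph and estimates the fraction of friendships that link users from two different countries. It is a minimal sketch under stated assumptions, not the project's method: the graph, the country labels and the `neighbors` function (a stand-in for a rate-limited API call) are all hypothetical, and the graph is assumed to be undirected, connected and non-bipartite. Under those assumptions a long random walk traverses every edge equally often, so a plain sample mean suffices here; patterns sampled with unequal probabilities would need inverse-probability weights in the style of the Horvitz-Thompson estimator.

```python
import random

# Hypothetical toy network: an undirected friendship graph as adjacency lists.
GRAPH = {
    0: [1, 2, 5],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4, 5],
    4: [3, 5],
    5: [3, 4, 0],
}
# Hypothetical country label for each user.
COUNTRY = {0: "UK", 1: "UK", 2: "UK", 3: "CN", 4: "CN", 5: "CN"}


def neighbors(graph, node):
    """Stand-in for one rate-limited API call returning a user's friends."""
    return graph[node]


def estimate_cross_fraction(graph, label, start, steps, burn_in=100, seed=0):
    """Estimate the fraction of edges whose endpoints carry different labels.

    At stationarity a simple random walk on an undirected graph traverses
    each edge with equal probability, so the share of traversed edges that
    are cross-label estimates the share of all edges that are cross-label.
    """
    rng = random.Random(seed)
    current = start
    hits = 0
    for step in range(burn_in + steps):
        nxt = rng.choice(neighbors(graph, current))
        # Discard the first burn_in steps so the walk forgets its start node.
        if step >= burn_in and label[current] != label[nxt]:
            hits += 1
        current = nxt
    return hits / steps


if __name__ == "__main__":
    # The toy graph has 8 edges, 2 of them cross-country (true fraction 0.25).
    print(estimate_cross_fraction(GRAPH, COUNTRY, start=0, steps=20000))
```

Multiplying the estimated fraction by the total number of friendships, if that number is known or estimated separately, would give an estimate of the cross-country pair count.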
The key to any approach in this project is to use API calls economically, so that as many patterns (e.g., triangles, k-node patterns) as possible are sampled within the rate limits.
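One way to picture economical API use is to cache every friend list that has been fetched and to reuse it for pattern counting. The sketch below is an illustration under assumptions, not a prescribed design: `CachedApi`, the toy graph and the known edge count `num_edges` are hypothetical. It estimates the number of triangles from a random walk by counting the common friends of each traversed edge; since every triangle contains three edges and a stationary walk traverses edges uniformly, the triangle count is approximately the edge count times the mean number of common friends, divided by three.

```python
import random

# Hypothetical toy network: an undirected friendship graph as adjacency lists.
GRAPH = {
    0: [1, 2, 5],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4, 5],
    4: [3, 5],
    5: [3, 4, 0],
}


class CachedApi:
    """Stand-in for a rate-limited API that remembers fetched friend lists."""

    def __init__(self, graph):
        self._graph = graph
        self._cache = {}
        self.calls = 0  # number of simulated remote calls actually made

    def neighbors(self, node):
        if node not in self._cache:
            self.calls += 1  # only a cache miss costs an API call
            self._cache[node] = frozenset(self._graph[node])
        return self._cache[node]


def estimate_triangles(api, num_edges, start, steps, burn_in=100, seed=0):
    """Estimate the triangle count from edges traversed by a random walk.

    num_edges is assumed to be known (or estimated separately).
    """
    rng = random.Random(seed)
    current = start
    total_common = 0
    for step in range(burn_in + steps):
        nxt = rng.choice(sorted(api.neighbors(current)))
        if step >= burn_in:
            # Common friends of the two endpoints = triangles on this edge.
            total_common += len(api.neighbors(current) & api.neighbors(nxt))
        current = nxt
    return num_edges * (total_common / steps) / 3


if __name__ == "__main__":
    api = CachedApi(GRAPH)
    # The toy graph has 8 edges and exactly 2 triangles.
    print(estimate_triangles(api, num_edges=8, start=0, steps=20000))
    print("API calls used:", api.calls)
```

Because of the cache, the number of simulated API calls is bounded by the number of distinct users visited rather than by the number of walk steps.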
Supervisor Name: Hans Vandierendonck Tel: +44 (0)28 9097
QUB Address: Computer Science Building, 18 Malone Road, Belfast BT9 5BN Email: firstname.lastname@example.org