Big Data: Data Extraction, Search and Integration, Geo-Spatial Query Answering, Information Retrieval, and Mining

J Hong, X Cao, W Liu

Our research on data extraction, mining, search and integration addresses problems that arise in three main application areas: the World-Wide Web, social media, and enterprises (corporate Intranets). The overall goal is to develop algorithms and tools that extract structured data from unstructured and semi-structured data sources, discover knowledge from these sources, retrieve data from them efficiently, and integrate data from multiple sources, and then to apply these algorithms and tools to application challenges in those three areas.

Specific areas of interest include:

  • Deep web data extraction, crawling, search and integration
  • Text extraction, text and web mining
  • Conversational recommendation systems
  • Enterprise data extraction and semantic search
  • Social media mining, including sentiment analysis and opinion mining

This research has a variety of applications. The Web, social media and corporate Intranets contain massive amounts of data. Companies and organisations have a large number of pre-existing, autonomous and independently created data repositories. Scientists produce a large volume of scientific and experimental data. Computerisation of businesses, services, governments and commerce creates the big data problem. Data extraction, mining, search and integration lie at the heart of all these applications.

Recommender systems are a common way to promote products or services that may be of interest to a user, usually based on some profile of interests. The single-shot approach, which produces a ranked list of recommendations, is limited by design. It works well when a user’s needs are clear, but it is less suitable when a user’s needs are not well known, or where they are likely to evolve during the course of a session. In these scenarios it is more appropriate to engage the user in a recommendation dialogue so that incremental feedback can be used to refine recommendations. This type of conversational recommender system is much better suited to help users navigate more complex product spaces.
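
The difference between a single-shot recommendation and a recommendation dialogue can be illustrated with a minimal sketch. It assumes one particular feedback style, unit critiques such as "lower price", which the text above does not specify; the names Item, pick, apply_critique and converse, and the toy catalogue, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    # Hypothetical product record used only for this illustration.
    name: str
    price: float
    rating: float


def pick(candidates):
    """Single-shot step: return the top-rated item, breaking ties by lower price."""
    return max(candidates, key=lambda c: (c.rating, -c.price))


def apply_critique(candidates, shown, feature, direction):
    """Keep only the items that beat the shown item on the critiqued feature."""
    ref = getattr(shown, feature)
    if direction == "lower":
        return [c for c in candidates if getattr(c, feature) < ref]
    if direction == "higher":
        return [c for c in candidates if getattr(c, feature) > ref]
    raise ValueError(f"unknown direction: {direction}")


def converse(catalogue, critiques):
    """Recommend, take one critique, narrow the candidates, and repeat.

    Returns the items shown to the user, in order.
    """
    candidates = list(catalogue)
    current = pick(candidates)
    shown = [current]
    for feature, direction in critiques:
        narrowed = apply_critique(candidates, current, feature, direction)
        if not narrowed:
            break  # no item satisfies the critique; keep the last recommendation
        candidates = narrowed
        current = pick(candidates)
        shown.append(current)
    return shown


catalogue = [
    Item("A", 900.0, 4.8),
    Item("B", 600.0, 4.5),
    Item("C", 400.0, 4.1),
    Item("D", 350.0, 4.4),
]

# Assumed session: the user asks for something cheaper twice in a row.
session = converse(catalogue, [("price", "lower"), ("price", "lower")])
print([item.name for item in session])  # prints ['A', 'B', 'D']
```

A single call to pick corresponds to the single-shot approach. The loop in converse uses each piece of incremental feedback to shrink the candidate set before recommending again, so the number of dialogue cycles needed to reach an acceptable item is one natural measure of efficiency under these assumptions.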

We are interested in improving the efficiency of recommender systems.



Please click here to download the Big and small dataset in zip format.