Runtime Scheduling Techniques for High-Performance Analytics
Principal Supervisor: Dr. Hans Vandierendonck
Second Supervisor: Dr. Peter Kilpatrick
+ Project Description
Data analytics are an important class of applications that, due to the scale of the data sets operated on, require distributed execution in data centres. Numerous frameworks to support the execution of data analytics workloads have been proposed in recent years, including Hadoop, Spark, Storm and Hive. The goal of these frameworks is to enable high-performance execution while providing a programming environment that does not require expertise in high-performance computing. Several studies have demonstrated, however, that these frameworks fall short of the performance of hand-tuned implementations by orders of magnitude. This project will investigate the design and implementation of frameworks for data analytics, in particular the runtime system support and data management techniques for these frameworks.
The objectives of this project are:
- To investigate the performance and scalability of state-of-the-art data analytics frameworks on a range of applications across text-based and numeric algorithms as well as compute- and data-intensive algorithms.
- To develop a task-parallel programming language and its runtime system to enable efficient and scalable execution in data centres.
- To represent key data analytics algorithms in the task-parallel programming language.
- To study data locality and load balancing in the context of the task-parallel programming language and to develop novel scheduling techniques that optimize locality and load balancing.
+ How to Apply
Applicants should apply electronically through the Queen’s online application portal at: https://dap.qub.ac.uk/portal/
+ Contact Details
Supervisor Name: Dr. Hans Vandierendonck
Queen’s University Belfast
+44 (0)28 9097 4654