Memory-aware optimization of task data-flow parallel programs

DEL PhD Studentship commencing October 2012

Principal Supervisor: Dr. Hans Vandierendonck

Project Description:

The programming of parallel and distributed computers, ranging from embedded multi-cores to top-500 HPC systems, is situated at a relatively low level of abstraction. In practice, the programmer must not only identify and describe the parallelism that exists in the program, but must also re-arrange the parallelization and redefine data structures to optimize the program for a particular processor and system architecture.

The task data-flow execution model is a recent approach that aims to simplify parallel programming. It enables dynamic expression of parallelism, in the sense that the parallelism is extracted during execution, as opposed to models where parallelism is described statically in the program text. A common approach to task data-flow execution requires that the programmer specifies the memory footprint of each task in the program: which memory locations it will access, and whether each access is a read, a write, or both. The runtime system then infers the dependencies between tasks based on the order in which they are launched and on the overlap of their memory footprints.
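The dependency-inference step described above can be illustrated with a minimal sketch. The code below is not taken from any particular runtime; the class and method names are hypothetical, and real systems (e.g. those underlying OpenMP task dependences) track far more state. It shows only the core idea: each submitted task declares its read and write footprints, and the runtime derives read-after-write, write-after-read and write-after-write dependencies from footprint overlap and submission order.

```python
class Runtime:
    """Minimal sketch of footprint-based dependency inference (hypothetical API)."""

    def __init__(self):
        self.last_writer = {}  # memory location -> name of the last task writing it
        self.readers = {}      # memory location -> tasks reading it since the last write
        self.deps = {}         # task name -> set of tasks it must wait for

    def submit(self, name, reads=(), writes=()):
        """Register a task with its declared footprint; return its dependencies."""
        deps = set()
        for loc in reads:
            # Read-after-write: a reader must wait for the last writer of the location.
            if loc in self.last_writer:
                deps.add(self.last_writer[loc])
            self.readers.setdefault(loc, set()).add(name)
        for loc in writes:
            # Write-after-write: wait for the previous writer of the location.
            if loc in self.last_writer:
                deps.add(self.last_writer[loc])
            # Write-after-read: wait for all tasks that read since the last write.
            deps.update(self.readers.get(loc, ()))
            self.last_writer[loc] = name
            self.readers[loc] = set()
        deps.discard(name)
        self.deps[name] = deps
        return deps


rt = Runtime()
rt.submit("t1", writes=["a"])               # no predecessors
rt.submit("t2", reads=["a"], writes=["b"])  # waits for t1 (read-after-write on a)
rt.submit("t3", reads=["a"])                # waits for t1; may run in parallel with t2
rt.submit("t4", writes=["a"])               # waits for t1, t2 and t3 (overwrites a)
```

Note that t2 and t3 both depend only on t1 and are therefore free to execute concurrently: the parallelism emerges at run time from the footprints, exactly as the model intends.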

It has been demonstrated that the task data-flow model has benefits in several areas, such as simplifying the expression of complex patterns of parallelism and simplifying the programming of heterogeneous systems (e.g. systems utilizing GPUs). Furthermore, annotating the memory footprints of tasks has important potential for further advanced optimization of the runtime scheduler, especially with respect to the interaction between scheduling and the performance of the memory system.

The goal of this project is to investigate the use of the task data-flow model to automatically optimize performance, in particular by optimizing memory locality and memory layout. Potential optimization targets include the efficient use of cache memories, non-uniform memory architectures (NUMA), new technologies such as non-volatile memory, distributed memory systems, and partitioned global address space (PGAS) systems. The long-term aim of this project is to further simplify parallel programming, a problem of growing importance given the prevalence of multi-core processors.


Objectives:

  • Investigate and develop techniques for runtime optimization of memory locality in task data-flow execution.
  • Investigate and develop techniques for locality-aware scheduling of tasks in parallel programs.
  • Explore parallel applications and quantify the impact of the proposed techniques.

Academic Requirements:

A minimum 2.1 honours degree, or equivalent, in Electrical and Electronic Engineering or a relevant discipline is required.

General Information

This 3-year PhD studentship, funded by the Department for Employment and Learning (DEL), commences on 1 October 2012 and covers approved tuition fees plus a maintenance grant of approximately £13,590.

Applicants should apply electronically through the Queen's online application portal at:

The deadline for submission of applications is 28 March 2012.

Further information available at:

Contact details:

Supervisor Name: Dr. Hans Vandierendonck