A24 - A Programmer's Guide for Modern High-Performance Computing

From: December 12 to 16, 2016

High-performance computing has seen many advances in the last decade, in both architectures and programming models. On the architecture side, massive parallelism and various kinds of accelerators have appeared with the promise of TeraFLOPs of performance. More and more application fields are tempted by the promise of many-fold performance gains and are willing to design and implement HPC solutions for their use cases.

Despite the enthusiasm, programming these novel HPC architectures is hard work. Achieving the promised level of performance often requires suitable workloads, algorithm rewriting, multilayered parallelization, multi-grain concurrency, and aggressive optimization and tuning. None of these challenges is a good reason to give up, but together they are reason enough to learn more about programming these novel HPC architectures and to understand their strong points, their limitations, and ultimately their "down-to-Earth" performance.

This course, A Programmer's Guide for Modern High-Performance Computing Architectures, is designed to respond to the need to understand and program novel HPC architectures. Specifically, the course covers three types of HPC architectures (multi-core processors, GPU accelerators, and FPGA accelerators) and the programming models and techniques used for them. The intended outcomes of the course are an understanding of the issues and problems in using these types of processors, the ability to analyze and predict the performance to be expected from them for real applications, and practical knowledge in choosing the right machine and programming model for a given application.


The course is scheduled from December 12 to 16, 2016. It has 4 full days of combined theoretical lectures and hands-on lab work. The daily schedule is split into two parts: morning lectures from 10:00 to 13:00, and afternoon practical work from 14:00 to 17:00.


  • Mon. Dec. 12, 10:00-14:00: Science Park, room C3.163
  • Tue. Dec. 13, 10:00-17:00: Roeterseiland E Building, room E0.15
  • Wed. Dec. 14, 10:00-17:00: Roeterseiland E Building, room E0.09
  • Thu. Dec. 15, 10:00-13:00: Science Park, room A1.30
  • Thu. Dec. 15, 13:00-17:00: Science Park, room D1.111
  • Fri. Dec. 16, 10:00-14:00: Science Park, room B1.25



Day 1: Introduction to HPC: parallel & distributed computing (Ana Lucia Varbanescu, UvA) I will present the core concepts of parallel and distributed processing.

I will further discuss the common HPC architectures we use today, with examples from both the high-end and regular-user ends of the spectrum. I will also discuss the basic application classes and programming models, and introduce the first elements of performance measurement and analysis.

Day 2: GPU programming (Rob van Nieuwpoort, UvA, and Ben van Werkhoven, NLeSC) We will give a brief introduction to GPU hardware and explain why GPUs are so attractive. However, we will focus mostly on the programming of GPUs, interleaving theoretical material with hands-on work. We will use CUDA for the practical part, but will also explain the differences with OpenCL and OpenMP. A significant part of the day deals with memory-centric programming: explaining where the bottlenecks are in GPUs, and how to get performance improvements by using the memory system effectively. We will also give a couple of practical tips and present tools to analyze performance.

Day 3: Large-scale processing and big data (Alexandru Iosup, Tim Hegeman, TUDelft) Google, Facebook, and Amazon are all major tech companies that rely on scalable computer systems to survive. To cope with increasing computation demands and with a data deluge, they have already started to build complex hardware and software systems of systems (ecosystems). A similar trend is now seen in many other application domains, where a global community designs, implements, and accesses such systems as cloud services.

These services have to address requirements of high performance or high throughput, and their users may switch at any time among the hundreds of service providers and technologies. This lecture focuses on interesting new challenges in the design and operation of infrastructure (IaaS) and platform (PaaS) cloud services, in particular on supporting the dynamic workloads of demanding users, on ensuring various forms of scalability, and on efficient and fair operation. You will learn vital skills for IT here: if we succeed in addressing these challenges, we may not only enable the advent of big science and engineering, and the almost complete automation of many large-scale processes, but also reduce the ecological footprint of datacentres and the entire ICT industry.

Day 4: Multicore Programming Multicore processors have become ubiquitous, from small-scale computing devices, such as laptops and smartphones, to large-scale multi-socket server systems combining dozens or even hundreds of cores with a cache-coherent shared memory. During this day we explain the ins and outs of programming such systems with OpenMP compiler directives. OpenMP is an industry standard supported by practically all C/C++ and Fortran compilers these days and forms the most accessible and effective approach to programming shared memory parallel systems. We end the day with a brief look into a research-level alternative to programming not only shared-memory systems but a multitude of parallel architectures in an architecture-agnostic style: the functional array language Single Assignment C (SAC).

Day 5: Emerging HPC topics: technologies, programming models, applications, performance engineering. (Ana Lucia Varbanescu, UvA) We will end the course with a discussion of the things to come in HPC, from hardware programming to application-level design tools, and from performance measurement to performance modeling. This day will not have a hands-on session, but will include a couple of demos.



Ana Lucia Varbanescu
Universiteit van Amsterdam
Instituut voor Informatica
Science Park 904
1098 XH Amsterdam

Alexandru Iosup
Parallel and Distributed Systems Group
Mekelweg 4
2628CD Delft

Clemens Grelck
Universiteit van Amsterdam
Instituut voor Informatica
Science Park 904
1098 XH Amsterdam
+31 (0) 20 525 8683
+31 (0) 20 525 7490

