Sign up or log in to bookmark your favorites and sync them to your phone or calendar.

Tutorial [clear filter]
Thursday, October 29

2:00pm PDT

Building, Testing and Debugging a Simple out-of-tree LLVM Pass

This tutorial aims at providing solid ground to develop out-of-tree LLVM passes. It presents all the required building blocks, starting from scratch: cmake integration, llvm pass management, opt / clang integration. It presents the core IR concepts through two simple obfuscating passes: the SSA form, the CFG, PHI nodes, IRBuilder etc. We also take a quick tour on analysis integration through dominators. Finally, it showcases how to use cl and lit to parametrize and test the toy passes developed in the tutorial.


Serge Guelton


Adrien Guinet


Thursday October 29, 2015 2:00pm - 3:00pm PDT
Salon I & Salon II

3:00pm PDT

Creating an SPMD Vectorizer for OpenCL with LLVM

Processors such as CPUs or DSPs often feature SIMD instructions, but are not designed to efficiently support Single Program Multiple Data (SPMD) execution models such as OpenCL. The design of a compiler for such a target therefore needs some form of vectorization to generate the most optimal code for this kind of data-parallel execution model. This is because SPMD programs are most often written in scalar form with the implicit assumption that many instances of the program are executed in parallel. On CPU-like architectures, SIMD vector units can be leveraged for parallelism, such that each SIMD lane is loosely mapped to a program instance. 


This tutorial looks at how to create an SPMD vectorizer that targets CPU-like architectures for use with heterogeneous compute frameworks. OpenCL is used as an example but the concepts should translate to other frameworks such as CUDA, RenderScript or Vulkan Compute. While there are other possible approaches, we have chosen to present one that works at the LLVM IR level and that is essentially an IR pass that creates vectorized functions from the original scalar SPMD function. This allows targetting multiple architectures with very little architecture-specific code. 


We will start by briefly introducing the SPMD execution model, describing how it is used in OpenCL and giving an overview of what a SPMD vectorizer should do and how it differs from other kinds such as LLVM's loop vectorizer and SLP vectorizer. Then we will look at a possible vectorizer design, including the different vectorization stages (analysis, control-flow to data-flow, scalarization, packetization/instantiation and optimization/cleanup). Finally, we will look at some possible optimizations as well as other aspects that do not fit the 'stage-by-stage' presentation (e.g. vectorizing and scalarizing calls to builtin functions, SIMD width detection, interleaved memory access optimizations, SoA to AoS conversions, etc).


Pierre-Andre Saulais

Codeplay Software

Thursday October 29, 2015 3:00pm - 4:00pm PDT
Salon I & Salon II

4:00pm PDT

Polly - Optimistic Loop Nest Optimizations with Schedule Trees

Polly is an advanced LLVM loop nest optimizer that provides precise memory access analyses and implements on top of them advanced loop optimizations based on a memory-access focused program model.

In the first part of this tutorial we introduce the audience to  integer set based schedule trees as a way to model loop programs. We explain how we statically model program behavior on the granularity of individual dynamic computations and discuss different program analyses (memory accesses, data-dependences, computational complexity).

We then learn how to perform complex loop transformations using simple per-node operations on an abstract program schedule tree. Such transformations include most classical loop transformations, but also full/partial tile separation, outer-loop vectorization and other more complex transformations. At the end of the first part of this tutorial, the audience understands the general concepts used in Polly.

The second part of this tutorial is focused on Polly's new optimistic optimization infrastructure that enables non-statically provable transformations to be performed optimistically. Discussing optimization blocking issues such as exception handling code, infinite loops, integer wrapping or out-of-bound memory accesses we introduce the concept of optimistic assumptions. We then discuss how such assumptions can be described in general, how Polly can collect assumptions, how redundant assumptions are eliminated and how a (close to) minimal run-time check to verifying them are generated. At the end of the second part of this tutorial the audience will be able to create optimistic loop optimizations even for cases that lack sufficient static information.

avatar for Johannes Doerfert

Johannes Doerfert

Researcher/PhD Student, Saarland University

Tobias Grosser

ETH Zurich

Thursday October 29, 2015 4:00pm - 5:00pm PDT
Salon I & Salon II
Filter sessions
Apply filters to sessions.