Poster GTC 2016 – Using CLANG/LLVM Vectorization to Generate Mixed Precision Source Code

At Supercomputing 2015, NVIDIA announced the Jetson TX1. This platform is the first available to natively expose mixed-precision instructions. However, the instruction set requires that operations on 16-bit floating-point values be performed in pairs, using the half2 type, which packs two values into a single register.
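
To make the pairing constraint concrete, the following is a minimal host-side C++ sketch of the packing layout only. It is not the poster's method and does not use the CUDA half2 type itself: the names Packed2, pack2, low_lane, high_lane and pack_array are hypothetical, the lanes hold raw 16-bit patterns, and placing the first value in the low 16 bits and padding an odd tail with a caller-chosen value are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a half2 register: two 16-bit lanes in one 32-bit word.
using Packed2 = std::uint32_t;

// Pack two raw 16-bit patterns. Assumption: 'lo' fills bits 0..15, 'hi' bits 16..31.
inline Packed2 pack2(std::uint16_t lo, std::uint16_t hi) {
    return static_cast<Packed2>(lo) | (static_cast<Packed2>(hi) << 16);
}

// Extract the lane stored in the low 16 bits.
inline std::uint16_t low_lane(Packed2 p) {
    return static_cast<std::uint16_t>(p & 0xFFFFu);
}

// Extract the lane stored in the high 16 bits.
inline std::uint16_t high_lane(Packed2 p) {
    return static_cast<std::uint16_t>(p >> 16);
}

// Pair consecutive scalars so each packed word carries two of them.
// An odd-length input has its last element paired with 'pad'.
inline std::vector<Packed2> pack_array(const std::vector<std::uint16_t>& v,
                                       std::uint16_t pad = 0) {
    std::vector<Packed2> out;
    out.reserve((v.size() + 1) / 2);
    for (std::size_t i = 0; i < v.size(); i += 2) {
        const std::uint16_t hi = (i + 1 < v.size()) ? v[i + 1] : pad;
        out.push_back(pack2(v[i], hi));
    }
    return out;
}
```

For instance, 0x3C00 and 0x4000 are the IEEE 754 binary16 bit patterns for 1.0 and 2.0, so pack2(0x3C00, 0x4000) yields 0x40003C00. Scalar code that handles one 16-bit value at a time has to be regrouped in this fashion before paired instructions can operate on it.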

See it at GTC On-Demand (ID P6352).