Hybridizer HOWTO — Generics

Virtual functions come with a significant performance penalty. To overcome this, we map generics to templates. The generated source code can then be inlined, so the flexibility of objects is retained without sacrificing performance.

Expm1       GFLOPS   GCFLOPS   usage
Local          975       538     92%
Dispatch       478       263     45%
peak          1174       587
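
The contrast between the two dispatch styles can be sketched in plain host-side C++. This is only an illustration under stated assumptions, not Hybridizer output: the names IArray, VirtualArray, PlainArray, add_dispatch and add_template are hypothetical and do not appear in the original text.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dynamic dispatch: every element access goes through a virtual call,
// which the compiler generally cannot inline.
struct IArray {
    virtual ~IArray() = default;
    virtual double get(std::size_t i) const = 0;
    virtual void set(std::size_t i, double v) = 0;
};

struct VirtualArray : IArray {
    std::vector<double> data;
    VirtualArray(std::size_t n, double v) : data(n, v) {}
    double get(std::size_t i) const override { assert(i < data.size()); return data[i]; }
    void set(std::size_t i, double v) override { assert(i < data.size()); data[i] = v; }
};

void add_dispatch(IArray& a, const IArray& b, std::size_t n) {
    for (std::size_t k = 0; k < n; ++k)
        a.set(k, a.get(k) + b.get(k));
}

// Static dispatch: the concrete type is a template parameter, so get/set
// are resolved at compile time and are candidates for inlining.
struct PlainArray {
    std::vector<double> data;
    PlainArray(std::size_t n, double v) : data(n, v) {}
    double get(std::size_t i) const { assert(i < data.size()); return data[i]; }
    void set(std::size_t i, double v) { assert(i < data.size()); data[i] = v; }
};

template <typename T>
void add_template(T& a, const T& b, std::size_t n) {
    for (std::size_t k = 0; k < n; ++k)
        a.set(k, a.get(k) + b.get(k));
}
```

Both functions compute the same result; only the way the element accessors are resolved differs. Calling add_dispatch(va, vb, n) on two VirtualArray objects and add_template(pa, pb, n) on two PlainArray objects leaves the same values in the first operand.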

C++ templates do not require the concept to be expressed: the compiler determines whether a type is compliant when the template is instantiated. In .NET, the concept is expressed by constraints on the generic type. The following example illustrates this.
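
For the C++ side of this comparison, a minimal sketch of an implicit concept might look as follows. The names sum_first, Ramp and NotAnArray are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// No interface or base class is declared: the "concept" is implicit and
// consists of whatever sum_first uses on T (here, operator[] returning
// something convertible to double).
template <typename T>
double sum_first(const T& a, std::size_t n) {
    double s = 0.0;
    for (std::size_t k = 0; k < n; ++k)
        s += a[k];
    return s;
}

// Compliant: provides a const operator[].
struct Ramp {
    double operator[](std::size_t i) const { return static_cast<double>(i); }
};

// Not compliant: no operator[]. Instantiating sum_first<NotAnArray>
// would be rejected by the compiler at the point of use.
struct NotAnArray {};
```

Here std::vector<double> and Ramp both satisfy the implicit requirement without declaring anything, whereas the .NET version has to name the requirement as an interface constraint.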


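// The interface plays the role of the template concept.
[HybridTemplateConcept]
public interface IMyArray {
    double this[int index] { get; set; }
}

// Registers the MyAlgorithm<MyArray> specialization.
[HybridRegisterTemplate(Specialize=typeof(MyAlgorithm<MyArray>))]
public struct MyArray : IMyArray
{
    double[] _data;
    [Kernel] public double this[int index] {
        get { return _data[index]; }
        set { _data[index] = value; }
    }
}

// The constraint on T expresses the concept: T must be a struct implementing IMyArray.
public class MyAlgorithm<T> where T : struct, IMyArray
{
    T a, b;
    [Kernel] public void Add(int n) {
        // Each thread starts at its global index and strides by the total number of threads.
        for (int k = threadIdx.x + blockDim.x * blockIdx.x;
            k < n; k += blockDim.x * gridDim.x)
            a[k] += b[k];
    }
}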
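
As a rough idea of the template form such a generic class could map to, here is a hand-written C++ sketch that emulates the threads sequentially on the host. It is not what Hybridizer actually generates: ThreadCtx and launch are hypothetical helpers standing in for the CUDA built-ins and the kernel launch, and MyArray is backed by a std::vector for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for threadIdx.x, blockIdx.x, blockDim.x and gridDim.x.
struct ThreadCtx {
    int threadIdx, blockIdx, blockDim, gridDim;
};

struct MyArray {
    std::vector<double> data;
    double& operator[](int i) {
        assert(i >= 0 && static_cast<std::size_t>(i) < data.size());
        return data[i];
    }
};

// T is a template parameter, so a[k] and b[k] resolve at compile time.
template <typename T>
struct MyAlgorithm {
    T a, b;
    void Add(int n, const ThreadCtx& t) {
        // Same grid-stride loop as in the C# listing.
        for (int k = t.threadIdx + t.blockDim * t.blockIdx;
             k < n; k += t.blockDim * t.gridDim)
            a[k] += b[k];
    }
};

// Runs Add once per emulated thread, one after another.
template <typename T>
void launch(MyAlgorithm<T>& alg, int n, int gridDim, int blockDim) {
    for (int blk = 0; blk < gridDim; ++blk)
        for (int thr = 0; thr < blockDim; ++thr)
            alg.Add(n, ThreadCtx{thr, blk, blockDim, gridDim});
}
```

With 2 blocks of 2 threads and n = 5, the emulated threads start at indices 0, 1, 2 and 3 and stride by 4, so every index below n is updated exactly once.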

With this approach, performance returns to a level very close to what we obtain without any polymorphism: the Generics row below reaches 93% usage, against 92% for Local and 45% for Dispatch.

Expm1       GFLOPS   GCFLOPS   usage
Local          975       538     92%
Dispatch       478       263     45%
Generics       985       544     93%
peak          1174       587
