Hybridizer HOW TO — Intrinsics

It’s often useful to call intrinsics or built-in functions provided by CUDA. You might also already have a highly optimized CUDA header that you’d like to reuse from your C# application. Both can be done with the IntrinsicFunction attribute. When generating the source code, Hybridizer replaces the function call with the name given to IntrinsicFunction, and the body of the C# function is ignored.

Intrinsic functions

Consider this code sample:

    class IntrinsicFunction
    {
        [IntrinsicFunction("printf")]
        public static void printf(string format, double val)
        {
            Console.WriteLine(val);
        }

        [IntrinsicFunction("erf")]
        private static double Erf(double x)
        {  
            double ax = x > 0.0 ? x : -x;
            const double a1 = 0.254829592;
            const double a2 = -0.284496736;
            const double a3 = 1.421413741;
            const double a4 = -1.453152027;
            const double a5 = 1.061405429;
            const double p = 0.3275911;
            double t = 1.0 / (1.0 + p * ax); // the approximation is defined for |x|
            double y = 1.0 - (((((a5 * t + a4) * t) + a3) * t + a2) * t + a1) * t * Math.Exp(-ax * ax);

            return x > 0.0 ? y : -y;
        }

        [EntryPoint]
        public static void run()
        {
            printf("%.17lf\n", Erf(1.0));
        }

        public static void Run()
        {
            Console.WriteLine("IntrinsicFunction :: ");
            Console.WriteLine(":: C# :: ");
            run();

            HybRunner runner = HybRunner.Cuda("ConsoleApplication96_CUDA.vs2015.dll").SetDistrib(1, 1);
            dynamic wrapped = runner.Wrap(new IntrinsicFunction());

            Console.WriteLine(":: CUDA :: ");
            wrapped.run();
            // Synchronize after the launch so the device-side printf buffer is flushed.
            cuda.DeviceSynchronize();
        }
    }

The C# implementation of Erf is a simple polynomial approximation: it is unlikely to be bug-free or as accurate as a library implementation. It’s better to rely on the CUDA implementation.
To do that, we decorate the C# function with the IntrinsicFunction attribute. Hybridizer reads the name given to that attribute and replaces calls to Erf with calls to the native erf function from cmath.
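To make the accuracy argument concrete, here is a minimal host-side C++ sketch that compares the same polynomial approximation with std::erf from cmath. It is only an illustration: the names erf_approx and max_abs_error are hypothetical, and it runs on the CPU rather than through Hybridizer. The coefficients are those of the C# sample, which match the Abramowitz and Stegun formula 7.1.26 (quoted maximum absolute error of about 1.5e-7).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Polynomial approximation mirroring the body of the C# Erf sample.
// Note that t is computed from |x|, as the formula requires.
// (erf_approx is a hypothetical name used for this sketch only.)
double erf_approx(double x) {
    const double ax = std::fabs(x);
    const double a1 = 0.254829592;
    const double a2 = -0.284496736;
    const double a3 = 1.421413741;
    const double a4 = -1.453152027;
    const double a5 = 1.061405429;
    const double p = 0.3275911;
    const double t = 1.0 / (1.0 + p * ax);
    const double y = 1.0 - (((((a5 * t + a4) * t) + a3) * t + a2) * t + a1) * t * std::exp(-ax * ax);
    return x > 0.0 ? y : -y;
}

// Largest absolute difference between the approximation and std::erf
// on a uniform grid of steps + 1 points covering [lo, hi].
double max_abs_error(double lo, double hi, int steps) {
    assert(steps > 0);
    double worst = 0.0;
    for (int i = 0; i <= steps; ++i) {
        const double x = lo + (hi - lo) * i / steps;
        worst = std::max(worst, std::fabs(erf_approx(x) - std::erf(x)));
    }
    return worst;
}
```

Calling max_abs_error(-3.0, 3.0, 6000) gives the worst-case gap on that grid. The approximation is good to several decimal places, but it cannot reproduce all 17 digits printed by the sample, which is why delegating to the native erf is preferable.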

Intrinsic types

Sometimes we already have a good native CUDA implementation of a function. In that case, we don’t want Hybridizer to process our C# symbol; instead, we want it to use the one we provide in a custom header.

    [IntrinsicInclude("myheader.cuh")]
    class IntrinsicType
    {
        [IntrinsicFunction("myfunction")]
        private static double myfunction(double x)
        {
            return 42.0;
        }

        [IntrinsicFunction("printf")]
        public static void printf(string format, double val)
        {
            Console.WriteLine(val);
        }

        [EntryPoint]
        public static void run()
        {
            printf("%.17lf\n", myfunction(3.0));
        }

        public static void Run()
        {
            Console.WriteLine("IntrinsicType :: ");
            Console.WriteLine(":: C# :: ");
            run();

            HybRunner runner = HybRunner.Cuda("ConsoleApplication96_CUDA.vs2015.dll").SetDistrib(1, 1);
            dynamic wrapped = runner.Wrap(new IntrinsicType());

            Console.WriteLine(":: CUDA :: ");
            wrapped.run();
            // Synchronize after the launch so the device-side printf buffer is flushed.
            cuda.DeviceSynchronize();
        }
    }

The IntrinsicInclude attribute on the class tells Hybridizer to include this header in the generated file.
We therefore write a custom header, myheader.cuh:
#pragma once
__device__ inline double myfunction(double x) {
	return x * x + 2.0;
}

The C# version of myfunction won’t be processed and myheader.cuh will be included.
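The two code paths therefore give different values for the same call: the C# run prints 42, while the native myfunction(3.0) evaluates to 3 * 3 + 2 = 11. The sketch below is a hedged, host-compilable variant of the header that lets you check the native value without a GPU. The MYHEADER_HD macro and the run_native_sketch function are assumptions made for this illustration; they are not part of the original header or of Hybridizer.

```cpp
#include <cassert>

// Assumption: a small macro so the same function compiles both under nvcc
// (as a host and device function) and under a plain C++ compiler.
// The original header simply uses __device__.
#ifdef __CUDACC__
#define MYHEADER_HD __host__ __device__
#else
#define MYHEADER_HD
#endif

// Same body as myfunction in myheader.cuh.
MYHEADER_HD inline double myfunction(double x) {
    return x * x + 2.0;
}

// Hypothetical host-side stand-in for what the generated entry point
// computes before passing the value to printf.
inline double run_native_sketch() {
    const double result = myfunction(3.0);
    assert(result == 11.0); // 3 * 3 + 2, exact in double precision
    return result;
}
```

Seeing 11 rather than 42 in the CUDA output is a quick way to confirm that the header implementation, not the C# body, was used.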
With Hybridizer Software Suite, no further modification is needed. However, Hybridizer Essentials relies on nvrtc, which doesn’t include headers automatically. In that case, you need to give Hybridizer the path to the headers, using the “Additional JITTER Headers” option.
[Figure: Additional Jitter headers]

Results

You can download the above code from this archive. Running it shows that the C# code has been properly replaced by native functions.

[Figure: intrinsics calls]

