Tuesday, December 10, 2013

A Short Introduction to Template Metaprogramming

Template metaprogramming is a technique that allows you to do a lot of work at compile-time rather than run-time. It's a good way to generate code that is both flexible and fast. In this post, I'm going to walk through a simple example.

Edit: The examples in this post are written in C++.

So lets say that you're implementing a vector class. It might look something like this:

template <typename T, unsigned int n>
class Vector
{
    public:
        Vector<T, n> operator+ (const Vector<T, n>& v) const;

        T data[n];

};

You want to implement vector addition. You first thought would probably be to do something like this:

template <typename T, unsigned int n>
Vector<T, n> Vector<T, n>::operator+ (const Vector<T, n>& v) const
{
    Vector<T, n> temp;

    for (int i = 0; i < n; i++)

        temp.data[i] = data[i] + v.data[i];

    return temp;

}

This gets the job done, but it isn't optimal. We have to check the variable i every iteration to make sure it's still less than n. In an ideal world, we'd do this:

template <typename T, unsigned int n>
Vector<T, n> Vector<T, n>::operator+ (const Vector<T, n>& v) const
{
    Vector<T, n> temp;

    temp.data[0] = data[0] + v.data[0];

    temp.data[1] = data[1] + v.data[1];
    temp.data[2] = data[2] + v.data[2];
    ...
    temp.data[n - 1] = data[n - 1] + v.data[n - 1];

    return temp;

}

Unfortunately, C++ is not cool enough to have a straight forward way of doing that, so we have to get creative. That's where templates come in! We can make the variable i a template parameter and rewrite this function as a recursive function.

template <typename T, unsigned int n, unsigned int i>
void AddVectors(Vector<T, n>& result, const Vector<T, n>& l, const Vector<T, n>& r)
{
    AddVectors<T, n, i - 1>(result, l, r);
    result.data[i] = l.data[i] + r.data[i];
}

template <typename T, unsigned int n>

void AddVectors<T, n, 0>(Vector<T, n>& result, const Vector<T, n>& l, const Vector<T, n>& r)
{
    result.data[0] = l.data[0] + r.data[0];
}

template <typename T, unsigned int n>

Vector<T, n> Vector<T, n>::operator+ (const Vector<T, n>& v) const
{
    Vector<T, n> temp;
    AddVectors<T, n, n - 1>(temp, *this, v);
    return temp;
}

There are still a couple of problems with this though. First, we're still not gaining performance; we've just traded condition checking for function calling. Second, C++ doesn't let you partially specialize functions. To solve the first problem, we use the inline keyword. This gets rid of the function calls and puts the code direction into the calling function. To get around the second, we can put the function in a template class like this:

template <typename T, unsigned int n, unsigned int i>
struct AddVectors
{
    static inline void Func(Vector<T, n>& result, const Vector<T, n>& l, const Vector<T, n>& r)
    {
        AddVectors<T, n, i - 1>::Func(result, l, r);
        result.data[i] = l.data[i] + r.data[i];
    }
};

template <typename T, unsigned int n>

struct AddVectors<T, n, 0>
{
    static inline void Func(Vector<T, n>& result, const Vector<T, n>& l, const Vector<T, n>& r)
    {
        result.data[0] = l.data[0] + r.data[0];
    }
};

template <typename T, unsigned int n>

Vector<T, n> Vector<T, n>::operator+ (const Vector<T, n>& v) const
{
    Vector<T, n> temp;
    AddVectors<T, n, n - 1>::Func(temp, *this, v);
    return temp;

}

There! Now you have a function that adds two vectors without a loop and without extra function calls! This is just a simple example, but the same technique can be used on larger, more complicated problems. However, I should mention a few caveats before you start going crazy using it everywhere.

First of all, compilers are pretty good at optimizing. In the simple example above, the compiler would probably get rid of the loop anyway. Metaprogramming is best saved for recurring problems that come up frequently in your program (vector math, for example). 
Second, the inline keyword isn't guaranteed to work. The C++ standard leaves it up to the compiler to decide what should or should not be inlined. The inline keyword is really just a request from the programmer. There are usually compiler-specific keywords to force the compiler to inline a function, but using them requires extra maintenance if you ever want to support a different compiler. Lastly, if it wasn't already apparent, code like this is much longer and much harder to follow than normal code. It can also significantly increase compile times. If you're working with other people, the performance gains might not be worth the confusion.

All that said, there are definitely real and useful packages out there that use template metaprogramming. Both boost and Eigen rely on it heavily. I have a lot more to say on the topic, but I'll save it for later posts.

No comments:

Post a Comment