Problem

With each new C++ standard we get stronger and stronger compile-time computation capabilities, with constexpr at the forefront. One thing I find very interesting to bring into compile time is string parsing, which I believe has many potential applications. Here are some examples of what I use it for.

  • Compile time string hashing - StringHash hashed = "identifier"_hash;
  • GLSL style vector swizzling - glm::vec2 v = SWIZZLE(xz, someVec3); - uses a macro for syntactic sugar, but the xz part is actually a string parsed at compile time
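As a rough illustration of the first bullet, here is a minimal sketch of what a compile-time hashing literal could look like. The StringHash struct and the fnv1a helper are stand-ins I made up for this example (the real StringHash type is not shown in this post), and FNV-1a is used only because it is short.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the StringHash type mentioned in the bullet above.
struct StringHash
{
    std::uint32_t value;
};

// 32-bit FNV-1a, picked only for brevity; any constexpr-friendly hash would do.
constexpr std::uint32_t fnv1a(const char* text, std::size_t length)
{
    std::uint32_t hash = 2166136261u;                // FNV offset basis
    for(std::size_t i = 0; i < length; ++i)
    {
        hash ^= static_cast<std::uint8_t>(text[i]);  // mix in one byte
        hash *= 16777619u;                           // FNV prime
    }
    return hash;
}

// User-defined literal, so that "identifier"_hash yields a StringHash.
constexpr StringHash operator""_hash(const char* text, std::size_t length)
{
    return StringHash{fnv1a(text, length)};
}

// Initialising a constexpr variable forces the hash to be computed at compile time.
constexpr StringHash hashed = "identifier"_hash;
static_assert(hashed.value == "identifier"_hash.value);
```

Note that this works without any tricks because the hash only changes a value, not a type, and a bad input never needs to produce a compile error.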

With C++17 it is relatively easy to do compile-time string parsing with a standard constexpr function that operates on a char array, and sometimes this is fine. However, there is an annoying problem: if you want the string parsing to always resolve as a constant expression (i.e. it only ever happens at compile time, or it affects which types are returned) and you want a malformed string to produce a compiler error, then this simple approach will not work. In this post I will explain why. I’ll also show another attempt that failed, and at the end I will present a solution.

Scroll down directly to the solution section if you’re not interested in the lead-up.

For the purpose of this post, let’s assume we want to write a compile time string parsing function that takes a string which encodes a type and a single digit number. The input "f:5" would return 5 as a float and "i:5" would return 5 as an int. If it’s formatted incorrectly we want a compile time error.

First attempt - What is wrong with a constexpr function?

The most naive way is to do something like the following.

template <size_t Length>
constexpr auto parseNumber(const char (&text)[Length])
{
    if(text[0] == 'i')
    {
        //we know we have an int. convert from second position
        return strToInt(text + 2);  //assume that we have made constexpr helper functions for parsing ints from strings
    }
    else if(text[0] == 'f')
    {
        //we know we have a float. convert from second position
        return strToFloat(text + 2); //assume that we have made constexpr helper functions for parsing floats from strings
    }
    else
    {
        //error???
        return 0;
    }
}

But this gives an error straight away, since the plain if statements make the function deduce two different return types. We would need if constexpr here, but that is not possible because the text parameter is not a constant expression: function arguments never are, even inside a constexpr function. Furthermore, how would we raise a compile error in the final branch? We can’t use static_assert on text for the same reason that we can’t use if constexpr. Hm.
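To see the limitation in isolation, here is a small sketch of a plain constexpr function that validates its input. The names parseDigit and atRunTime are made up for this example. The function can run at compile time, but nothing forces it to, and a malformed string only becomes a compile error when the call happens to be evaluated in a constant expression.

```cpp
#include <stdexcept>

// Hypothetical single-digit parser written as an ordinary constexpr function.
constexpr int parseDigit(const char* text)
{
    // 'text' is not a constant expression in here, so
    // static_assert(text[0] >= '0') would not compile.
    if(text[0] < '0' || text[0] > '9')
        throw std::invalid_argument("not a digit"); // compile error only during constant evaluation
    return text[0] - '0';
}

// Initialising a constexpr variable forces evaluation at compile time.
constexpr int atCompileTime = parseDigit("7");
static_assert(atCompileTime == 7);

// constexpr int broken = parseDigit("x"); // would not compile: the throw is reached

int atRunTime(const char* text)
{
    // Same function, but here a malformed string throws at runtime instead.
    return parseDigit(text);
}
```

So the check exists, but whether it fires at compile time depends on the caller, and the return type still cannot depend on the string.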

Clearly we need a way to get the string into the function as a constant expression.

Template arguments to the rescue?

Function arguments are never constant expressions but template arguments are. Can we somehow use this to pass in a string?

template <const char* text>
void parseText()
{
    
}

Well this compiles! Let’s try to use it:

parseText<"test">();

error: '"test"' is not a valid template argument for type 'const char*' because string literals can never be used in this context

Damn, so string literals are simply forbidden as template arguments.
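For completeness: the restriction is on the literal itself, not on const char* template parameters in general. As far as I know, a named constexpr array with static storage duration is accepted, as in the sketch below (firstChar and testText are made-up names). It does not really help us though, since the string has to be declared separately instead of being written at the call site.

```cpp
template <const char* text>
constexpr char firstChar()
{
    // 'text' is a template argument, so it can be used in constant expressions.
    static_assert(text[0] != '\0', "text must not be empty");
    return text[0];
}

// A named array with static storage duration is a valid template argument...
static constexpr char testText[] = "test";

static_assert(firstChar<testText>() == 't');

// ...but the literal still cannot be passed directly:
// firstChar<"test">(); // error: string literals can never be used in this context
```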

What if we use another type to wrap the string, like a ConstantString type which is constexpr, and then pass it as a template<ConstantString text>? Nope, as of C++17, class types are not allowed as non-type template parameters (C++20 relaxes this for suitable literal class types)… This seems like another dead end.

Solution

I first thought this was the end of it and that we can’t do better as of C++17, but then I found out that there is actually a way to pass a string literal as a constexpr value into a function! It’s not the prettiest way, but it works.

(I did not invent this technique, I found it online)

The trick is to utilise a constexpr lambda. This is a C++17 feature: a lambda can be marked with the constexpr specifier, which makes its call operator work like a standard constexpr function. Additionally, any lambda is implicitly constexpr unless it uses something that makes it non-constexpr. How does this help? Consider the following.

template <typename StringHolder>
constexpr void parseText(StringHolder holder)
{
    constexpr std::string_view text = holder();

    static_assert(text[0] == 'i' || text[0] == 'f');
}

void f()
{
    parseText([](){return "i:5";});
}

In this example, parseText takes a function argument of templated type: a functor that returns the text as a constant expression. At the call site inside of f() we pass a lambda which simply returns the string we want to pass. Since the lambda is constexpr and captures nothing, calling it does not read any state from the holder parameter itself, so the call is a constant expression even though holder is an ordinary function argument. We can therefore store the result in the constexpr std::string_view text variable and from there on operate on it as a constant expression, which puts full compile-time parsing in our hands. It unlocks usage of if constexpr, varying the return type of the function, constructing new types based on template params and so on, all based on the data contained in the string. Quite powerful.
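To make the "constructing new types" part a bit more concrete, here is a minimal sketch that turns the characters of the string into template arguments of std::integer_sequence. The helper names toCharSequence and toCharSequenceImpl are made up for this example, and it assumes the same lambda-holder trick as above.

```cpp
#include <cstddef>
#include <string_view>
#include <type_traits>
#include <utility>

// Hypothetical helper: expands each character of holder() into a template argument.
template <typename StringHolder, std::size_t... Indices>
constexpr auto toCharSequenceImpl(StringHolder holder, std::index_sequence<Indices...>)
{
    constexpr std::string_view text = holder();
    return std::integer_sequence<char, text[Indices]...>{};
}

// Hypothetical entry point: builds one index per character and forwards the holder.
template <typename StringHolder>
constexpr auto toCharSequence(StringHolder holder)
{
    constexpr std::string_view text = holder();
    return toCharSequenceImpl(holder, std::make_index_sequence<text.size()>{});
}

void charSequenceUsage()
{
    auto sequence = toCharSequence([](){return "i:5";});

    // The content of the string is now part of the type.
    static_assert(std::is_same_v<decltype(sequence),
                                 std::integer_sequence<char, 'i', ':', '5'>>);
}
```

Once the characters live in a type like this, they can be passed around as ordinary template arguments.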

The biggest drawback with this approach is that the syntax for calling the parseText function is not very pretty… It can be improved with a small helper macro, which is certainly not perfect, but as far as I know it’s the best we can do as of C++17.

#define PARSE_TEXT(text) \
parseText([](){return text;})

void f()
{
    PARSE_TEXT("i:5");
}

So with this approach, our intended function could look something like the following.

#include <type_traits>
#include <string_view>

template <typename StringHolder>
constexpr auto parseNumber(StringHolder holder)
{
    constexpr std::string_view text = holder();

    //we only support strings of length 3
    static_assert(text.size() == 3, "invalid length of input");

    constexpr char typeChar = text[0];
    constexpr char numberChar = text[2];

    //type char must be i or f
    static_assert(typeChar == 'i' || typeChar == 'f', "must start with 'i' or 'f'");
    //must be a colon as second character
    static_assert(text[1] == ':', "lacks proper ':' delimiter");

    //number char must be one of the characters '0' to '9'
    static_assert(numberChar >= '0' && numberChar <= '9', "number part is not a valid number");

    int resultingNumber = numberChar - '0';
    if constexpr(typeChar == 'i')
        return resultingNumber;
    else
        return static_cast<float>(resultingNumber);
}

#define PARSE_NUMBER(text) \
parseNumber([](){return text;})

void usage()
{
    auto intResult = PARSE_NUMBER("i:2");
    auto floatResult = PARSE_NUMBER("f:5");

    static_assert(std::is_same_v<decltype(intResult), int>);
    static_assert(std::is_same_v<decltype(floatResult), float>);
}

Thanks to the static_asserts, passing a malformed string now produces a compile error inside of the parseNumber function.
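The function above only accepts a single digit. As a sketch of how the same pattern might extend to longer numbers, the parsing loop can live in an ordinary constexpr helper whose result is then checked with static_assert. ParsedDigits, parseDigits and parseLongNumber are names made up for this example, and the format is still assumed to be a type char, a colon and then only digits.

```cpp
#include <string_view>
#include <type_traits>

// Hypothetical result type: the parsed value plus a flag telling whether parsing succeeded.
struct ParsedDigits
{
    int value;
    bool valid;
};

// Ordinary constexpr helper: loops are fine here, static_assert on 'text' is not.
constexpr ParsedDigits parseDigits(std::string_view text)
{
    if(text.empty())
        return {0, false};

    int value = 0;
    for(char c : text)
    {
        if(c < '0' || c > '9')
            return {0, false};
        value = value * 10 + (c - '0');
    }
    return {value, true};
}

template <typename StringHolder>
constexpr auto parseLongNumber(StringHolder holder)
{
    constexpr std::string_view text = holder();

    static_assert(text.size() >= 3, "input is too short");
    static_assert(text[0] == 'i' || text[0] == 'f', "must start with 'i' or 'f'");
    static_assert(text[1] == ':', "lacks proper ':' delimiter");

    //the helper runs during constant evaluation, so its result can feed a static_assert
    constexpr ParsedDigits digits = parseDigits(text.substr(2));
    static_assert(digits.valid, "number part is not a valid number");

    if constexpr(text[0] == 'i')
        return digits.value;
    else
        return static_cast<float>(digits.value);
}

void longNumberUsage()
{
    auto intResult = parseLongNumber([](){return "i:123";});
    auto floatResult = parseLongNumber([](){return "f:42";});

    static_assert(std::is_same_v<decltype(intResult), int>);
    static_assert(std::is_same_v<decltype(floatResult), float>);
}
```

The split keeps the helper readable as normal code, while the holder function stays responsible for turning failures into compile errors.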

If you have managed to solve this in another way, or if you have remarks on the way it is solved here or similar, please let me know in the comments below. Thanks for reading!