Type Deduction and My Reference Mistakes

I had read Scott Meyers’ three previous books in the Effective series [1][2][3] before I began to read his new Effective Modern C++ [4] at Safari Books Online. I always expect to learn from Scott, but I was surprised at how quickly it happened this time. After reading only a few items about type deduction, I found that I had implemented apply and compose (pipeline) incorrectly in my first WordPress blog [5]. What a shame! Still, I was fortunate to find the problem so soon, and I would like to record my mistakes and solutions here.

(A side note: As WordPress does not allow uploading C++ source files, I have put all the test code in a zip file. I will quote the relevant source code in the blog directly for an easy read, but you are welcome to download and try the test code yourself!)

First, my implementation of apply had the wrong return type, as shown in the following program (test1.cpp):

#include <iostream>
#include <utility>  // std::forward

using namespace std;

#define PRINT_AND_TEST(x)    \
    cout << #x << ": ";      \
    test(x);                 \
    cout << endl;

template <typename Tp>
auto apply(Tp&& data)
{
    return forward<Tp>(data);
}

template <typename Tp, typename Fn, typename... Fargs>
auto apply(Tp&& data, Fn fn, Fargs... args)
{
    return apply(fn(forward<Tp>(data)), args...);
}

struct Obj {
    int value;
    explicit Obj(int n) : value(n)
    {
    }
    Obj(const Obj& rhs) : value(rhs.value)
    {
    }
    Obj(Obj&& rhs) : value(rhs.value)
    {
        rhs.value = -1;
    }
    ~Obj()
    {
    }
};

void test(Obj& x)
{
    cout << "Obj&:" << x.value;
}

void test(Obj&& x)
{
    cout << "Obj&&:" << x.value;
}

void test(const Obj& x)
{
    cout << "const Obj&:" << x.value;
}

int main()
{
    Obj obj(0);
    Obj& nref = obj;
    const Obj& cref = obj;

    PRINT_AND_TEST(obj);
    PRINT_AND_TEST(nref);
    PRINT_AND_TEST(cref);
    PRINT_AND_TEST(Obj(1));
    cout << endl;

    PRINT_AND_TEST(apply(obj));
    PRINT_AND_TEST(apply(nref));
    PRINT_AND_TEST(apply(cref));
    PRINT_AND_TEST(apply(Obj(1)));
    cout << endl;
}

It gives the following output:

obj: Obj&:0
nref: Obj&:0
cref: const Obj&:0
Obj(1): Obj&&:1

apply(obj): Obj&&:0
apply(nref): Obj&&:0
apply(cref): Obj&&:0
apply(Obj(1)): Obj&&:1

Apparently I did not get the types right. The reason was my careless use of auto as the return type, without realizing that a plain auto return type is always deduced as a non-reference type. C++14 provides a special decltype(auto) syntax, which preserves reference-ness, and it seems to work here. The ‘fixed’ code is as follows (test2.cpp):

template <typename Tp>
decltype(auto) apply(Tp&& data)
{
    return forward<Tp>(data);
}

template <typename Tp, typename Fn, typename... Fargs>
decltype(auto) apply(Tp&& data, Fn fn, Fargs... args)
{
    return apply(fn(forward<Tp>(data)), args...);
}

With that change, the program outputs the expected result:

obj: Obj&:0
nref: Obj&:0
cref: const Obj&:0
Obj(1): Obj&&:1

apply(obj): Obj&:0
apply(nref): Obj&:0
apply(cref): const Obj&:0
apply(Obj(1)): Obj&&:1
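
The difference between the two return type forms can also be checked at compile time. The following is only a minimal sketch and not part of the test files: by_auto, by_decltype_auto, and the two helper functions are names made up for this illustration, and Obj is reduced to a plain struct.

```cpp
#include <type_traits>
#include <utility>

struct Obj {
    int value;
};

// Plain auto: the return type is deduced like a by-value parameter,
// so references and top-level const are dropped.
template <typename Tp>
auto by_auto(Tp&& data)
{
    return std::forward<Tp>(data);
}

// decltype(auto): the return type is the decltype of the returned
// expression, so the reference category is kept.
template <typename Tp>
decltype(auto) by_decltype_auto(Tp&& data)
{
    return std::forward<Tp>(data);
}

// With plain auto, every call yields a fresh Obj.
static_assert(std::is_same<decltype(by_auto(std::declval<Obj&>())),
                           Obj>::value, "lvalue: copy");
static_assert(std::is_same<decltype(by_auto(std::declval<const Obj&>())),
                           Obj>::value, "const lvalue: copy");
static_assert(std::is_same<decltype(by_auto(std::declval<Obj>())),
                           Obj>::value, "rvalue: new object");

// With decltype(auto), the reference type is passed through.
static_assert(std::is_same<decltype(by_decltype_auto(std::declval<Obj&>())),
                           Obj&>::value, "lvalue: Obj&");
static_assert(std::is_same<decltype(by_decltype_auto(std::declval<const Obj&>())),
                           const Obj&>::value, "const lvalue: const Obj&");
static_assert(std::is_same<decltype(by_decltype_auto(std::declval<Obj>())),
                           Obj&&>::value, "rvalue: Obj&&");

// Writing through the decltype(auto) result changes the original object.
bool decltype_auto_aliases_argument()
{
    Obj o{1};
    by_decltype_auto(o).value = 5;
    return o.value == 5;
}

// Writing to the plain auto result leaves the original object alone.
bool auto_returns_a_copy()
{
    Obj o{1};
    Obj copy = by_auto(o);
    copy.value = 5;
    return o.value == 1;
}
```

Note the last static_assert: for an rvalue argument, decltype(auto) produces Obj&&, which matters for what follows.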

Wait—is the code really correct?

Actually a little more testing reveals a bigger problem, which the original code did not exhibit. Here is the additional test code (test3.cpp):

…
Obj clone(Obj x)
{
    cout << "clone(Obj):" << x.value << " => ";
    return x;
}

int main()
{
    …
    PRINT_AND_TEST(clone(obj));
    PRINT_AND_TEST(clone(nref));
    PRINT_AND_TEST(clone(cref));
    PRINT_AND_TEST(clone(Obj(2)));
    cout << endl;

    PRINT_AND_TEST(apply(obj, clone));
    PRINT_AND_TEST(apply(nref, clone));
    PRINT_AND_TEST(apply(cref, clone));
    PRINT_AND_TEST(apply(Obj(2), clone));
    cout << endl;
}

And its horrendous output:

…
clone(obj): clone(Obj):0 => Obj&&:0
clone(nref): clone(Obj):0 => Obj&&:0
clone(cref): clone(Obj):0 => Obj&&:0
clone(Obj(2)): clone(Obj):2 => Obj&&:2

apply(obj, clone): clone(Obj):0 => Obj&&:1875662080
apply(nref, clone): clone(Obj):0 => Obj&&:1875662080
apply(cref, clone): clone(Obj):0 => Obj&&:1875662080
apply(Obj(2), clone): clone(Obj):2 => Obj&&:1875662080

Let us go back and analyse the case. Since all four calls have the same problem, we’ll just check the first one.

The call apply(obj, clone) makes the template parameter Tp be deduced to Obj&, as obj is an lvalue. This instantiation of apply(Obj&, Fn) contains the following function body, after all arguments are passed in:

    return apply(clone(forward<Obj&>(obj)));

Please notice that clone returns a temporary object, whose lifetime ends at the end of the return statement. As clone returns an rvalue, Tp for the next apply call is deduced to Obj. The function is instantiated as follows (check out the definition of std::forward if you are not familiar with it [6]):

Obj&& apply(Obj&& data)
{
    return forward<Obj>(data);
}

Although the cloned object is destroyed when apply(Obj&, Fn) returns, an rvalue reference to it is still returned, and that reference dangles. Oops!
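
The problem can be seen at the type level without running into undefined behaviour. This is a rough sketch under my own assumptions: make_obj stands in for clone, forward_back and outer are hypothetical reductions of the two apply overloads, and none of these names appear in the test files.

```cpp
#include <type_traits>
#include <utility>

struct Obj {
    int value;
};

// Hypothetical stand-in for clone: returns a temporary Obj
Obj make_obj(int n)
{
    return Obj{n};
}

// Same shape as the single-argument apply in test2.cpp
template <typename Tp>
decltype(auto) forward_back(Tp&& data)
{
    return std::forward<Tp>(data);
}

// Same shape as the two-argument apply, reduced to a single call
template <typename Fn>
decltype(auto) outer(Fn fn)
{
    // fn(7) is a temporary that is destroyed at the end of this
    // return statement, yet a reference to it is what gets returned.
    return forward_back(fn(7));
}

// An rvalue argument makes the inner function return Obj&& ...
static_assert(std::is_same<decltype(forward_back(make_obj(1))),
                           Obj&&>::value, "inner returns Obj&&");
// ... and the outer function passes that reference type on unchanged.
static_assert(std::is_same<decltype(outer(&make_obj)),
                           Obj&&>::value, "outer returns Obj&& too");

// Reading through the reference is fine only while the temporary is
// still alive, that is, within the same full-expression.
int read_in_same_expression()
{
    return forward_back(make_obj(7)).value;
}
```

Calling outer and then using its result is deliberately left out, as that would read a destroyed object, which is exactly what produced the garbage values in the output above.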

Having seen the reason, I only needed to make sure that an object type, not a reference, is returned when this apply is called with an rvalue. I experimented with the enable_if template [7], but it turns out that the fix is simpler than I expected (test4.cpp):

template <typename Tp>
Tp apply(Tp&& data)
{
    return forward<Tp>(data);
}

template <typename Tp, typename Fn, typename... Fargs>
decltype(auto) apply(Tp&& data, Fn fn, Fargs... args)
{
    return apply(fn(forward<Tp>(data)), args...);
}

I just had to change the first decltype(auto) to Tp to take advantage of the type deduction rules for ‘universal references’. While these are explained most clearly in Item 1 of Scott Meyers’ Effective Modern C++, his online article already states them clearly [8]:

During type deduction for a template parameter that is a universal reference, lvalues and rvalues of the same type are deduced to have slightly different types. In particular, lvalues of type T are deduced to be of type T& (i.e., lvalue reference to T), while rvalues of type T are deduced to be simply of type T. (Note that while lvalues are deduced to be lvalue references, rvalues are not deduced to be rvalue references!)

Therefore:

  • If an lvalue is passed to apply, Tp (and the return type) is deduced to an lvalue reference (say, Obj&). This is what we want, as it avoids copying.
  • If an rvalue is passed to apply, Tp (and the return type) is deduced to the object type (say, Obj). Now the returned object is move-constructed, which is what we would like to see.
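
The two cases can be sketched as follows. This is an illustration under my own assumptions rather than code from the test files: deduced_form, pass_through, and the form_for_* helpers are made-up names, and Obj is again a plain struct.

```cpp
#include <string>
#include <type_traits>
#include <utility>

struct Obj {
    int value;
};

// Reports how Tp was deduced for a universal reference parameter.
template <typename Tp>
std::string deduced_form(Tp&&)
{
    if (!std::is_lvalue_reference<Tp>::value)
        return "T";  // rvalue argument: plain object type
    return std::is_const<std::remove_reference_t<Tp>>::value
               ? "const T&"   // const lvalue argument
               : "T&";        // non-const lvalue argument
}

// Same shape as the fixed single-argument apply: returns Tp itself.
template <typename Tp>
Tp pass_through(Tp&& data)
{
    return std::forward<Tp>(data);
}

// lvalue in: reference out, so nothing is copied
static_assert(std::is_same<decltype(pass_through(std::declval<Obj&>())),
                           Obj&>::value, "lvalue gives Obj&");
static_assert(std::is_same<decltype(pass_through(std::declval<const Obj&>())),
                           const Obj&>::value, "const lvalue gives const Obj&");
// rvalue in: object out, so the result owns its value
static_assert(std::is_same<decltype(pass_through(std::declval<Obj>())),
                           Obj>::value, "rvalue gives Obj");

std::string form_for_lvalue()
{
    Obj o{0};
    return deduced_form(o);
}

std::string form_for_const_lvalue()
{
    const Obj o{0};
    return deduced_form(o);
}

std::string form_for_rvalue()
{
    return deduced_form(Obj{0});
}
```

Because Tp is never deduced to an rvalue reference, using Tp as the return type gives either an lvalue reference or a real object, never a reference to a temporary.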

I then also checked compose. At first, I tested with these lines added to the end of main (after adding the definition of compose, of course; see test5.cpp):

    auto const op1 = compose<Obj&&>(apply<Obj&&>);
    PRINT_AND_TEST(op1(Obj(3)));
    auto const op2 = compose<const Obj&>(apply<const Obj&>);
    PRINT_AND_TEST(op2(Obj(3)));

Both output lines contain ‘test(Obj&&):3’. The second line is obviously wrong, and it is exactly like the apply case, so the solution is similar too. I should not have relied on plain auto return type deduction; using decltype(auto) fixes it. The changed compose looks like this (test6.cpp):

template <typename Tp>
auto compose()
{
    return apply<Tp>;
}

template <typename Tp, typename Fn, typename... Fargs>
auto compose(Fn fn, Fargs... args)
{
    return [=](Tp&& x) -> decltype(auto)
    {
        return fn(compose<Tp>(args...)(forward<Tp>(x)));
    };
}

However, code like the following does not even compile (test7.cpp):

    auto const op_nr = compose<Obj>(clone);
    test(op_nr(obj));

Clang reports errors:

test7.cpp:106:10: error: no matching function for call to object of type 'const (lambda at test.cpp:26:12)'
    test(op_nr(obj));
         ^~~~~
test7.cpp:31:12: note: candidate function not viable: no known conversion from 'Obj' to 'Obj &&' for 1st argument
    return [=](Tp&& x) -> decltype(auto)
           ^
1 error generated.

!@#$%^…
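
The reason is that Tp is fixed when compose is instantiated, so the lambda parameter Tp&& is an ordinary reference and not a universal reference. The sketch below checks which arguments such a lambda accepts. It is only an illustration: make_op is a hypothetical, much reduced stand-in for compose, and accepts and op_accepts are a hand-written detection helper, not anything from the test files.

```cpp
#include <type_traits>
#include <utility>

struct Obj {
    int value;
};

// Hypothetical stand-in for compose: Tp is chosen by the caller, so
// Tp&& inside the lambda is not subject to type deduction.
template <typename Tp>
auto make_op()
{
    return [](Tp&& x) { return x.value; };
}

// Detection helper: true when an Fn object can be called with an Arg.
template <typename Fn, typename Arg, typename = void>
struct accepts : std::false_type {};

template <typename Fn, typename Arg>
struct accepts<Fn, Arg,
               decltype(void(std::declval<Fn>()(std::declval<Arg>())))>
    : std::true_type {};

// Arg = Obj& models an lvalue argument, Arg = Obj models an rvalue.
template <typename Tp, typename Arg>
constexpr bool op_accepts()
{
    return accepts<decltype(make_op<Tp>()), Arg>::value;
}

// Tp = Obj: the parameter is Obj&&, so lvalues are rejected.
static_assert(op_accepts<Obj, Obj>(), "rvalue accepted");
static_assert(!op_accepts<Obj, Obj&>(), "lvalue rejected");

// Tp = Obj&&: reference collapsing gives Obj&& again.
static_assert(op_accepts<Obj&&, Obj>(), "rvalue accepted");
static_assert(!op_accepts<Obj&&, Obj&>(), "lvalue rejected");

// Tp = const Obj&: the parameter is const Obj&, which binds to both.
static_assert(op_accepts<const Obj&, Obj>(), "rvalue accepted");
static_assert(op_accepts<const Obj&, Obj&>(), "lvalue accepted");
```

The middle case matches the Clang message above: there is no conversion from an lvalue Obj to Obj&&.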

Actually I can ‘fix’ the problem like this:

    auto const op_klvr = compose<const Obj&>(clone);
    test(op_klvr(obj));

Or this:

    auto const op_rvr  = compose<Obj&&>(clone);
    test(op_rvr(Obj(3)));

The two forms above would actually work quite well, if one knew the argument type and was careful. However, there would be a difference if one was careless, as the revised test program shows by tracking the objects’ lifetimes (test8.cpp). Only the relevant changes are shown below:

…
#define PRINT_AND_TEST(x)           \
    cout << " " << #x << ":\n  ";   \
    test(x);                        \
    cout << endl;
…
struct Obj {
    int value;
    explicit Obj(int n) : value(n)
    {
        cout << "Obj(){" << value << "} ";
    }
    Obj(const Obj& rhs) : value(rhs.value)
    {
        cout << "Obj(const Obj&){" << value << "} ";
    }
    Obj(Obj&& rhs) : value(rhs.value)
    {
        rhs.value = -1;
        cout << "Obj(Obj&&){" << value << "} ";
    }
    ~Obj()
    {
        cout << "~Obj(){" << value << "} ";
    }
};

void test(Obj& x)
{
    cout << "=> Obj&:" << x.value << "\n  ";
}

void test(Obj&& x)
{
    cout << "=> Obj&&:" << x.value << "\n  ";
}

void test(const Obj& x)
{
    cout << "=> const Obj&:" << x.value << "\n  ";
}

Obj clone(Obj x)
{
    cout << "=> clone(Obj):" << x.value << "\n  ";
    return x;
}

int main()
{
    Obj obj(0);
    Obj& nref = obj;
    const Obj& cref = obj;
    cout << endl;
    …
    auto const op_klvr = compose<const Obj&>(clone);
    auto const op_rvr  = compose<Obj&&>(clone);
    auto const op_nr   = compose<Obj>(clone);
    PRINT_AND_TEST(op_klvr(obj));
    PRINT_AND_TEST(op_klvr(Obj(3)));
    PRINT_AND_TEST(op_rvr(Obj(3)));
    PRINT_AND_TEST(op_nr(Obj(3)));
    //PRINT_AND_TEST(op_nr(obj));
}

The commented-out line does not compile yet. The rest works fine, and the program generates the following output (edited):

Obj(){0}
 obj:
  => Obj&:0
 …

 clone(obj):
  Obj(const Obj&){0} => clone(Obj):0
  Obj(Obj&&){0} => Obj&&:0
  ~Obj(){0} ~Obj(){-1}
 clone(nref):
  Obj(const Obj&){0} => clone(Obj):0
  Obj(Obj&&){0} => Obj&&:0
  ~Obj(){0} ~Obj(){-1}
 clone(cref):
  Obj(const Obj&){0} => clone(Obj):0
  Obj(Obj&&){0} => Obj&&:0
  ~Obj(){0} ~Obj(){-1}
 clone(Obj(2)):
  Obj(){2} => clone(Obj):2
  Obj(Obj&&){2} => Obj&&:2
  ~Obj(){2} ~Obj(){-1}

 apply(obj, clone):
  Obj(const Obj&){0} => clone(Obj):0
  Obj(Obj&&){0} Obj(Obj&&){0} ~Obj(){-1} ~Obj(){-1} => Obj&&:0
  ~Obj(){0}
 …
 apply(Obj(2), clone):
  Obj(){2} Obj(Obj&&){2} => clone(Obj):2
  Obj(Obj&&){2} Obj(Obj&&){2} ~Obj(){-1} ~Obj(){-1} => Obj&&:2
  ~Obj(){2} ~Obj(){-1}

 op_klvr(obj):
  Obj(const Obj&){0} => clone(Obj):0
  Obj(Obj&&){0} ~Obj(){-1} => Obj&&:0
  ~Obj(){0}
 op_klvr(Obj(3)):
  Obj(){3} Obj(const Obj&){3} => clone(Obj):3
  Obj(Obj&&){3} ~Obj(){-1} => Obj&&:3
  ~Obj(){3} ~Obj(){3}
 op_rvr(Obj(3)):
  Obj(){3} Obj(Obj&&){3} => clone(Obj):3
  Obj(Obj&&){3} ~Obj(){-1} => Obj&&:3
  ~Obj(){3} ~Obj(){-1}
 op_nr(Obj(3)):
  Obj(){3} Obj(Obj&&){3} => clone(Obj):3
  Obj(Obj&&){3} ~Obj(){-1} => Obj&&:3
  ~Obj(){3} ~Obj(){-1}
~Obj(){0}

We can see that apply(Obj(2), clone) generates one more move-construction than clone(Obj(2)), and this is expected from our implementation. We can also see the differences in the last group, where the use of op_klvr, op_rvr, or op_nr can affect whether copy-construction or move-construction is used. I would like to make op_nr work on an lvalue too (so the last code line can be uncommented).
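
The copy-versus-move difference in the last group can be reproduced in isolation. The following is a small sketch with made-up names (Counted, sink, via_const_ref, via_rvalue_ref, Tally), where sink plays the role of a function such as clone that takes its argument by value.

```cpp
#include <utility>

// Counts how often it is copy- or move-constructed.
struct Counted {
    static int copies;
    static int moves;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept { ++moves; }
};

int Counted::copies = 0;
int Counted::moves = 0;

// Takes its argument by value, like clone in the article
void sink(Counted)
{
}

// Like a pipeline built with a const lvalue reference parameter:
// the by-value parameter of sink must be copy-constructed.
void via_const_ref(const Counted& x)
{
    sink(x);
}

// Like a pipeline built with an rvalue reference parameter:
// the by-value parameter of sink can be move-constructed.
void via_rvalue_ref(Counted&& x)
{
    sink(std::move(x));
}

struct Tally {
    int copies;
    int moves;
};

// Passes a temporary through the const reference route.
Tally tally_const_ref()
{
    Counted::copies = Counted::moves = 0;
    via_const_ref(Counted());
    return {Counted::copies, Counted::moves};
}

// Passes a temporary through the rvalue reference route.
Tally tally_rvalue_ref()
{
    Counted::copies = Counted::moves = 0;
    via_rvalue_ref(Counted());
    return {Counted::copies, Counted::moves};
}
```

Even though a temporary is passed in both cases, the const reference route forces one copy, while the rvalue reference route needs only one move.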

I have finally implemented a version of compose with tag dispatching [9], treating reference types and non-reference types differently. The strategy is as follows:

  • If the template argument is a reference type, the old logic still applies.
  • If the template argument is not a reference type, a temporary object will be constructed in the pass-by-value parameter (with either the copy constructor or the move constructor), and an rvalue reference to it will be used to invoke the reference-branch logic. An extra move operation may result, so it is still better to use compose with an rvalue reference template argument (as with compose<Obj&&> above) if it is known that the argument will be an rvalue. (Alas, a lambda is only a function, not a type-deducing function template. Update: Damn, Scott hit me right in the face, again, with his nice introduction of C++14 generic lambdas in Effective Modern C++. They provide a far nicer solution. I won’t change the content here, but the part about compose is now largely obsolete. Check out ‘Generic Lambdas and the compose Function’ for an update. I hate love you, Scott!)

The extra move overhead makes this solution less attractive, but it only adds to the available choices. Also, it would be a little awkward if one were forced to spell out a reference type as the template argument of compose every time. So my final compose is here (test9.cpp):

template <typename Tp>
auto compose_ref()
{
    return apply<Tp>;
}

template <typename Tp, typename Fn, typename... Fargs>
auto compose_ref(Fn fn, Fargs... args)
{
    return [=](Tp&& x) -> decltype(auto)
    {
        return fn(compose_ref<Tp>(args...)(forward<Tp>(x)));
    };
}

template <typename Tp, typename... Fargs>
auto compose_impl(false_type, Fargs... args)
{
    return [=](Tp x) -> decltype(auto)
    {
        return compose_ref<Tp&&>(args...)(move(x));
    };
}

template <typename Tp, typename... Fargs>
auto compose_impl(true_type, Fargs... args)
{
    return compose_ref<Tp>(args...);
}

template <typename Tp, typename... Fargs>
auto compose(Fargs... args)
{
    return compose_impl<Tp>(
        typename is_reference<Tp>::type(), args...);
}
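
The dispatch on is_reference<Tp>::type used above may be easier to see in isolation. Here is a minimal sketch with hypothetical names (describe and describe_impl), which only reports the branch taken instead of building a pipeline.

```cpp
#include <string>
#include <type_traits>

// Chosen when the tag is true_type, i.e. Tp is a reference type
template <typename Tp>
std::string describe_impl(std::true_type)
{
    return "reference branch";
}

// Chosen when the tag is false_type, i.e. Tp is an object type
template <typename Tp>
std::string describe_impl(std::false_type)
{
    return "value branch";
}

// is_reference<Tp>::type is either true_type or false_type, so
// ordinary overload resolution picks the branch at compile time.
template <typename Tp>
std::string describe()
{
    return describe_impl<Tp>(typename std::is_reference<Tp>::type());
}
```

The tag object carries no data. Only its type matters, so the selection costs nothing at run time.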

Lessons learnt:

  • C++ programmers should always read Scott’s books (at least 99.99% should).
  • Although type deduction is very helpful, one needs to understand its rules and what auto actually means in each case; otherwise it is easy to make (terrible) mistakes.

  1. Scott Meyers: Effective C++. Addison-Wesley, 3rd edition, 2005. 
  2. Scott Meyers: More Effective C++. Addison-Wesley, 1996. 
  3. Scott Meyers: Effective STL. Addison-Wesley, 2001. 
  4. Scott Meyers: Effective Modern C++. O’Reilly Media, 2014. 
  5. Yongwei Wu: Study Notes: Functional Programming with C++
  6. Thomas Becker: Rvalue References Explained, p. 8
  7. Cppreference.com: std::enable_if
  8. Scott Meyers: Universal References in C++11
  9. David Abrahams and Douglas Gregor: Generic Programming in C++: Techniques, section ‘Tag Dispatching’
