Python yield and C++ Coroutines

Back in 2008, an old friend challenged me with a programming puzzle, when we both attended a wedding. He later gave a solution in Python. Comparing with my C++ solution, the Python one has about half the code lines. I was not smart enough to begin learning Python then, but instead put an end to my little interest in Python with a presentation in C++ Conference China 2009. I compared the basic constructs, and concluded they were equivalent, not realizing that Python had more to offer than that trivial programming exercise showed.

Fast forwarding to today (2016), I am really writing some (small) programs in Python. I have begun to appreciate Python more and more. For many of the tasks, the performance loss in executing the code is ignorable, but the productivity boost is huge. I have also realized that there are constructs in Python that are not easily reproducible in other languages. Generator/yield is one of them.

The other day, I was writing a small program to sort hosts based on dot-reversed order so as to group the host names in a more reasonable order (regarding ‘www.wordpress.com’ as ‘com.wordpress.www’). I quickly came up with a solution in Python. The code is very short and I can show it right here:

def backsort(lines):
    result = {}
    for line in lines:
        result['.'.join(reversed(line.split('.')))] = line
    return map(lambda item: item[1],
               sorted(result.items()))

Of course, we can implement a similar function in C++11. We will immediately find that there are no standard implementations for split and join (see Appendix below for my implementation). Regardless, we can write some code like:

template <typename C>
vector<string> backsort(C&& lines)
{
    map<string, string> rmap;
    for (auto& line : lines) {
        auto split_line = split(line, '.');
        reverse(split_line.begin(), split_line.end());
        rmap[join(split_line, '.')] = line;
    }
    vector<string> result(rmap.size());
    transform(rmap.begin(), rmap.end(), result.begin(),
              [](const pair<string, string>& pr)
              {
                  return pr.second;
              });
    return result;
}

Even though it has twice the non-trivial lines of code and is a function template, there is immediately something Python can do readily but C++ cannot. I can give the Python file handle (like os.stdin) directly to backsort, and the for line will iterate through the file content.1 This is because the Python file object implements the iterator protocol over lines of text, but the C++ istream does not do anything similar.

Let us forget this C++ detail, and focus on the problem. My Python code accepts an iterator, and ‘backsorts’ all the input lines. Can we make it process multiple files (like the cat command line), without changing the backsort function?

Of course it can be done. There is a traditional way, and there is a smart way. The traditional way is write a class that implements the iterator protocol (which can be readily modelled by C++):

class cat:
    def __init__(self, files):
        self.files = files
        self.cur_file = None

    def __iter__(self):
        return self

    def next(self):
        while True:
            if self.cur_file:
                line = self.cur_file.readline()
                if line:
                    return line.rstrip('\n')
                self.cur_file.close()
            if self.files:
                self.cur_file = open(self.files[0])
                self.files = self.files[1:]
            else:
                raise StopIteration()

We can then cat files by the following lines:

if __name__ == '__main__':
    if sys.argv[1:]:
        for line in cat(sys.argv[1:]):
            print(line)

Using yield, we can reduce the 18 lines of code of cat to only 5:

def cat(files):
    for fn in files:
        with open(fn) as f:
            for line in f:
                yield line.rstrip('\n')

There is no more bookkeeping of the current file and the unprocessed files, and everything is wrapped in simple loops. Isn’t that amazing? I actually learnt about the concept before (in C#), but never used it in real code—perhaps because I was too much framed by existing code, using callbacks, observer pattern, and the like.—Those ‘patterns’ now look ugly, when compared to the simplicity of generators.

Here comes the real challenge for C++ developers: Can we do the same in C++? Can we do something better than inelegant callbacks? 2


My investigations so far indicate the following: No C++ standards (up to C++14) support such constructs, and there is no portable way to implement them as a library.

Are we doomed? No. Apart from standardization efforts regarding coroutines (which is the ancient name for a superset of generators, dated from 1958) in C++,3 there have been at least five cross-platform implementations for C++:

  • The unofficial Boost.Coroutine by Giovanni P. Deretta (2006), compatible with Windows, Linux, and maybe a few Unix variants (tested not working on OS X); apparently abandoned.4
  • The official Boost.Coroutine by Oliver Kowalke (2013), compatible with ARM, MIPS, PPC, SPARC, x86, and x86-64 architectures.
  • The official Boost.Coroutine2 by Oliver Kowalke (2015), compatible with the same hardware architectures but only C++ compilers/code conformant to the C++14 standard.
  • Mordor by Mozy (2010), compatible with Windows, Linux, and OS X, but seemingly no longer maintained.
  • CO2 by Jamboree (2015), supporting stackless coroutines only, using preprocessor tricks, and requiring C++14.

As Boost.Coroutine2 looks modern, is well-maintained, and is very much comparable to the Python constructs, I will use it in the rest of this article.5 It hides all the platform details with the help of Boost.Context. Now I can write code simply as follows for cat:

typedef boost::coroutines2::coroutine<const string&>
    coro_t;

void cat(coro_t::push_type& yield,
         int argc, char* argv[])
{
    for (int i = 1; i < argc; ++i) {
        ifstream ifs(argv[i]);
        for (;;) {
            string line;
            if (getline(ifs, line)) {
                yield(line);
            } else {
                break;
            }
        }
    }
}

int main(int argc, char* argv[])
{
    using namespace std::placeholders;
    for (auto& line : coro_t::pull_type(
             boost::coroutines2::fixedsize_stack(),
             bind(cat, _1, argc, argv))) {
        cout << line << endl;
    }
}

Is this simple and straightforward? The only thing that is not quite intuitive is the detail that the constructor of pull_type expects the second argument to be a function object that takes a push_type& as the only argument. That is why we need to use bind to generate it—a lambda expression being the other alternative.

I definitely believe being able to write coroutines is a big step forward to make C++ more expressive. I can foresee many tasks simplified, like recursive parsing. I believe this will prove very helpful in the C++ weaponry. I only wish we could see it standardized soon.

Appendix

The complete backsort code in Python:

#!/usr/bin/env python
#coding: utf-8

import sys

def cat(files):
    for fn in files:
        with open(fn) as f:
            for line in f:
                yield line.rstrip('\n')

def backsort(lines):
    result = {}
    for line in lines:
        result['.'.join(reversed(line.split('.')))] = line
    return map(lambda item: item[1],
               sorted(result.items()))

def main():
    if sys.argv[1:]:
        result = backsort(cat(sys.argv[1:]))
    else:
        result = backsort(map(
                lambda line: line.rstrip('\n'), sys.stdin))
    for line in result:
        print(line)

if __name__ == '__main__':
    main()

The complete backsort code in C++:

#include <assert.h>         // assert
#include <algorithm>        // std::reverse/transform
#include <fstream>          // std::ifstream
#include <functional>       // std::bind
#include <iostream>         // std::cin/cout
#include <map>              // std::map
#include <string>           // std::string
#include <vector>           // std::vector
#include <boost/coroutine2/all.hpp>

using namespace std;

typedef boost::coroutines2::coroutine<const string&>
    coro_t;

void cat(coro_t::push_type& yield,
         int argc, char* argv[])
{
    for (int i = 1; i < argc; ++i) {
        ifstream ifs(argv[i]);
        for (;;) {
            string line;
            if (getline(ifs, line)) {
                yield(line);
            } else {
                break;
            }
        }
    }
}

vector<string> split(const string& str, char delim)
{
    vector<string> result;
    string::size_type last_pos = 0;
    string::size_type pos = str.find(delim);
    while (pos != string::npos) {
        result.push_back(
            str.substr(last_pos, pos - last_pos));
        last_pos = pos + 1;
        pos = str.find(delim, last_pos);
        if (pos == string::npos) {
            result.push_back(str.substr(last_pos));
        }
    }
    return result;
}

template <typename C>
string join(const C& str_list, char delim)
{
    string result;
    for (auto& item : str_list) {
        result += item;
        result += delim;
    }
    if (result.size() != 0) {
        result.resize(result.size() - 1);
    }
    return result;
}

template <typename C>
vector<string> backsort(C&& lines)
{
    map<string, string> rmap;
    for (auto& line : lines) {
        auto split_line = split(line, '.');
        reverse(split_line.begin(), split_line.end());
        rmap[join(split_line, '.')] = line;
    }
    vector<string> result(rmap.size());
    transform(rmap.begin(), rmap.end(), result.begin(),
              [](const pair<string, string>& pr)
              {
                  return pr.second;
              });
    return result;
}

class istream_line_reader {
public:
    class iterator { // implements InputIterator
    public:
        typedef const string& reference;
        typedef string value_type;

        iterator() : stream_(nullptr) {}
        explicit iterator(istream& is) : stream_(&is)
        {
            ++*this;
        }

        reference operator*()
        {
            assert(stream_ != nullptr);
            return line_;
        }
        value_type* operator->()
        {
            assert(stream_ != nullptr);
            return &line_;
        }
        iterator& operator++()
        {
            getline(*stream_, line_);
            if (!*stream_) {
                stream_ = nullptr;
            }
            return *this;
        }
        iterator operator++(int)
        {
            iterator temp(*this);
            ++*this;
            return temp;
        }

        bool operator==(const iterator& rhs) const
        {
            return stream_ == rhs.stream_;
        }
        bool operator!=(const iterator& rhs) const
        {
            return !operator==(rhs);
        }

    private:
        istream* stream_;
        string line_;
    };

    explicit istream_line_reader(istream& is)
        : stream_(is)
    {
    }
    iterator begin() const
    {
        return iterator(stream_);
    }
    iterator end() const
    {
        return iterator();
    }

private:
    istream& stream_;
};

int main(int argc, char* argv[])
{
    using namespace std::placeholders;
    vector<string> result;
    if (argc > 1) {
        result = backsort(coro_t::pull_type(
            boost::coroutines2::fixedsize_stack(),
            bind(cat, _1, argc, argv)));
    } else {
        result = backsort(istream_line_reader(cin));
    }
    for (auto& item : result) {
       cout << item << endl;
    }
}

The istream_line_reader class is not really necessary, and we can simplify it with coroutines. I am including it here only to show what we have to write ‘normally’ (if we cannot use coroutines). Even if we remove it entirely, the C++ version will still have about three times as many non-trivial lines of code as the Python equivalent. It is enough proof to me that I should move away from C++ a little bit. . . .


  1. There is one gotcha: the ‘\n’ character will be part of the string. It will be handled in my solution. 
  2. Generally speaking, callbacks or similar techniques are what C++ programmers tend to use in similar circumstances, if the ‘producer’ part is complicated (otherwise the iterator pattern may be more suitable). Unfortunately, we cannot then combine the use of two simple functions like cat and backsort simultaneously. If we used callbacks, backsort would need to be modified and fragmented. 
  3. P0057 is one such effort, which is experimentally implemented in Visual Studio 2015
  4. According to the acknowledgement pages of next two Boost projects, Giovanni Deretta contributed to them. So his work was not in vain. 
  5. This said, CO2 is also well-maintained, and is more efficient if only a stackless coroutine is needed. See Jamboree’s answer on StackOverflow. The difference looks small to me, and preprocessor tricks are not my favourites, so I will not spend more time on CO2 for now. 
Advertisements

A Small Experiment of System Scripting in Python

My main laptop is still on Mac OS X Lion (10.7). I know I am guilty of exposing my laptop to potential security risks,1 but some of my paid applications do not work on newer OS X versions without an upgrade. I am an austere person and do not want to pay the money yet. In addition, I am also a little bit nostalgic about the skeuomorphic design, though I know some day I will have to use a Mac that has the latest macOS version in order to use new applications. Anyway, I am just procrastinating now, until some sexy new laptop from Apple makes me take out my wallet, or my old laptop goes crazy.

Sorry for this verbose beginning. What I really want to whine about is that Homebrew has stopped supporting my obsolete version of OS X, and I am relying more and more on MacPorts.2 I even had to rebuild most of my ‘ports’ (the term for packages in MacPorts) because the ‘standard’ way of building ports on Lion does not use libc++, while it is necessary for some ports.3 Unlike Homebrew, MacPorts does not show whether a dependency of a port is already installed or not. Worse, MacPorts packages often have heavy dependencies. For example, the command-line tool mkvtoolnix currently has 20 (recursive) dependencies in Homebrew, but 60 dependencies in MacPorts. My default compiler is clang-3.7, which has 46 dependencies. That pretty much makes the ‘port rdeps’ command useless.

A Google search showed this port command could be helpful:

port echo rdepof:PORT_NAME and not installed

However, more investigation showed there were several problems:

  • One cannot specify variants (like ‘+openmp’).
  • An option (like ‘configure.compiler=macports-clang-3.7’) can affect dependencies, but options do not have the intended effect in the ‘port echo’ command.
  • The recursion is not ‘cut’ when a port is already installed, which can result in unnecessary ports.

This problem had fretted me for some time, before I finally decided to take some action. Naturally, the ultimate solution is write some code. I normally use Bash or Perl for such scripting tasks, but, as I have become more and more interested in Python recently, I decided also to give Python a try to see how it handles such tasks.

I first wrote a Bash version for comparison purposes. It was not recursive, though (too cumbersome for Bash):

#!/bin/bash
function escape {
  printf "%s" "$1" | sed 's/[.*\[]/\\&/g'
}

INSTALLED=`port installed \
         | sed -n 's/^  \([A-Za-z_][^ ]*\).*/-e ^\1$/p'`
INSTALLED_ESC=`escape "$INSTALLED"`
port deps "$@" | sed -n 's/.*Dependencies:[[:space:]]*//p' \
               | sed $'s/, /\\\n/g' \
               | sort \
               | uniq \
               | grep -v $INSTALLED_ESC

Let me explain the code quickly (assuming you are familiar with the basic use of Bash and common Unix tools). ‘port installed’ returns the installed ports, and every line beginning with two spaces are port names followed by other information (like version). I retrieve the port names, and wrap each of them with ‘-e ^…$’. Since they will be used for grep, special characters need to be escaped (practically only ‘.’). I then invoke ‘port deps’ with the command-line arguments, look for lines containing ‘Dependencies:’, get everything after it, split at the commas to get the depended ports, sort the ports, remove duplicates, and filter out all installed ports from the result.

It basically works, and the code is succinct. It is also far from elegant, and quite error-prone. A Bash function feels like a hack. The quotation rules are tricky (when invoking escape, $INSTALLED must be quoted; but when invoking grep, $INSTALLED_ESC must not be quoted). Escaping can easily get problematic when used inside quotation marks. And so on. . . . It is difficult to imagine people can write Bash scripts without some trial and error, even though only a few lines are written.

I knew some Python, but I am not very familiar with it. So I was basically writing while Googling. I got the first version, sort of an equivalent of the Bash script, in about two hours:

#!/usr/bin/env python
#coding: utf-8

import re
import sys
import subprocess

# Gets command output as a list of lines
def popen_readlines(cmd):
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    p.wait()
    if p.returncode != 0:
        raise subprocess.CalledProcessError(p.returncode, \
                                            cmd)
    else:
        return map(lambda line: line.rstrip('\n'), \
                   p.stdout.readlines())

# Gets the port name from a line like
# "  gcc6 @6.1.0_0 (active)"
def get_port_name(port_line):
    return re.sub(r'^  (\S+).*', r'\1', port_line)

# Gets installed ports as a set
def get_installed():
    installed_ports_lines = \
            popen_readlines(['port', 'installed'])[1:]
    installed_ports = \
            set(map(get_port_name, installed_ports_lines))
    return installed_ports

# Gets dependencies for the given port list (which may
# contain options etc.), as a list, excluding items in
# ignored_ports
def get_deps(ports, ignored_ports):
    deps_raw = popen_readlines(['port', 'deps'] + ports)
    uninstalled_ports = []
    for line in deps_raw:
        if re.search(r'Dependencies:', line):
            deps = re.sub(r'.*Dependencies:\s*', '', \
                          line).split(', ')
            uninstalled_ports += \
                [x for x in deps if x not in ignored_ports]
            ignored_ports |= set(deps)
    return uninstalled_ports

def main():
    if sys.argv[1:]:
        installed_ports = get_installed()
        uninstalled_ports = get_deps(sys.argv[1:], \
                                     installed_ports)
        for port in uninstalled_ports:
            print port

if __name__ == '__main__':
    main()

A few things immediately came to notice:

  • The code is apparently more verbose than Bash or Perl, but arguably also clearer and more readable.
  • Strings are ubiquitous in Bash, but lists are ubiquitous in Python. Python allowed backticks (`…`) for piping, but they are deprecated now in favour of the subprocess routines, which accept the command line as a list.
  • The set is a built-in type and is a breeze to use.
  • I/O is not as easy as in Perl (thinking of <> and chomp now), but can be easily simplified with helper functions, as composability is very good.
  • List comprehension and map are very helpful to keep the code concise.

It is not all. The real fun was that it was easy to convert the code to work recursively on all depended ports. I only needed to add/change seven lines of code, at the beginning and end of get_deps:

def get_deps(ports, ignored_ports):
    # New code to end the recursion
    if ports == []:
        return []

    # This part is not changed
    deps_raw = popen_readlines(['port', 'deps'] + ports)
    uninstalled_ports = []
    for line in deps_raw:
        if re.search(r'Dependencies:', line):
            deps = re.sub(r'.*Dependencies:\s*', '', \
                          line).split(', ')
            uninstalled_ports += \
                [x for x in deps if x not in ignored_ports]
            ignored_ports |= set(deps)

    # New code to call recursively and collect the result
    results = []
    for port in uninstalled_ports:
        results.append(port)
        results += get_deps([port], ignored_ports)
    return results

The output did not show any indentation yet, and I found another problem later. The improved final code looks as follows:

#!/usr/bin/env python
#coding: utf-8

import re
import sys
import subprocess

# Gets command output as a list of lines
def popen_readlines(cmd):
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    p.wait()
    if p.returncode != 0:
        raise subprocess.CalledProcessError(p.returncode, \
                                            cmd)
    else:
        return map(lambda line: line.rstrip('\n'), \
                   p.stdout.readlines())

# Gets the port name from a line like
# "  gcc6 @6.1.0_0 (active)"
def get_port_name(port_line):
    return re.sub(r'^  (\S+).*', r'\1', port_line)

# Gets installed ports as a set
def get_installed():
    installed_ports_lines = \
            popen_readlines(['port', 'installed'])[1:]
    installed_ports = \
            set(map(get_port_name, installed_ports_lines))
    return installed_ports

# Gets port names from items that may contain version
# specifications, variants, or options
def get_ports(ports_and_specs):
    requested_ports = set()
    for item in ports_and_specs:
        if not (re.search(r'^[-+@]', item) or \
                re.search(r'=', item)):
            requested_ports.add(item)
    return requested_ports

# Gets dependencies for the given port list (which may
# contain options etc.), as a list of tuples (combining
# with level), excluding items in ignored_ports
def get_deps(ports, ignored_ports, level):
    if ports == []:
        return []

    deps_raw = popen_readlines(['port', 'deps'] + ports)
    uninstalled_ports = []
    for line in deps_raw:
        if re.search(r'Dependencies:', line):
            deps = re.sub(r'.*Dependencies:\s*', '', \
                          line).split(', ')
            uninstalled_ports += \
                [x for x in deps if x not in ignored_ports]
            ignored_ports |= set(deps)

    port_level_pairs = []
    for port in uninstalled_ports:
        port_level_pairs += [(port, level)]
        port_level_pairs += get_deps([port], \
                                     ignored_ports, \
                                     level + 1)
    return port_level_pairs

def main():
    if sys.argv[1:]:
        ports_and_specs = sys.argv[1:]
        ignored_ports = get_installed() | \
                        get_ports(ports_and_specs)
        uninstalled_ports = get_deps(ports_and_specs, \
                                     ignored_ports, 0)
        for (port, level) in uninstalled_ports:
            print ' ' * (level * 2) + port

if __name__ == '__main__':
    main()

I would say I am very happy, even excited, with the experiment results. No wonder Python has been a great success, despite being verbose and having a slightly weird syntax :-). I guess I would do more Python in the future.

By the way, the code in this article is in Python 2. Python 3 is stricter and even more verbose: I do not see the benefits of using it for system scripting (yet).


  1. Not really. My MacBook Pro has the firewall turned on, it is behind the home router nearly at all times, and I do not visit strange web sites—not with Safari at least. 
  2. Honestly, it is not the fault of Homebrew, or even Apple. However, I do miss the support lifecycle that Microsoft provided for Windows XP. 
  3. For more details, Using libc++ on older system explains the why and how.