Upgrading to Boost 1.61 in MacPorts

The Boost version in MacPorts was still 1.59.0, a year old now. When I wrote about Boost.Coroutine2, I found I had to install the latest Boost version, 1.61.0. So I had two sets of Boost libraries on my hard drive, which made things . . . er . . . a little bit complicated. After I built Microsoft’s cpprestsdk last night (I managed to make it find and use the MacPorts Boost libraries), I felt a stronger urge to change the situation. So this morning I subscribed to the MacPorts mailing list and asked about the outdated version. With help from Mr Michael Dickens and Google, I have a working port of Boost 1.61.0 now. This article documents the procedure.

The first thing one needs to do is check out the port files from the MacPorts Subversion repository. In my case, the boost files are under devel/boost, so I checked out only the boost directory into ~/Programming/MacPorts/devel.

One then needs to tell MacPorts to look for ports in that directory. There are two steps involved:

  1. Add the URL of the local ports directory (e.g. ‘file:///Users/yongwei/Programming/MacPorts’ in my case) to /opt/local/etc/macports/sources.conf, above the default rsync URL.
  2. Run the portindex command under that directory. It needs to be rerun every time a Portfile is changed.

Now MacPorts should find ports first in my local ports directory and then the system default. And I could begin patching the files.
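The first step above can be sketched in Python as a small text transformation. This is only an illustration: the function name add_local_source is made up, the rsync URL in the sample is a placeholder, and the sketch assumes the default source is the line ending in ‘[default]’.

```python
def add_local_source(conf_text, local_url):
    """Return conf_text with local_url inserted above the default source.

    Assumption: the default source is the line ending in "[default]".
    If no such line is found, the URL is appended at the end instead.
    """
    lines = conf_text.splitlines()
    if local_url in lines:
        return conf_text  # already present, nothing to do
    for i, line in enumerate(lines):
        if line.strip().endswith('[default]'):
            lines.insert(i, local_url)
            break
    else:
        lines.append(local_url)
    return '\n'.join(lines) + '\n'


# Placeholder content, not a real sources.conf
sample = (
    '# sample sources.conf\n'
    'rsync://example.invalid/ports.tar [default]\n'
)
print(add_local_source(sample,
                       'file:///Users/yongwei/Programming/MacPorts'))
```

The function only returns the new text. Writing it back to /opt/local/etc/macports/sources.conf (which needs root privileges) and running portindex are left to the reader.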

It turned out that people had tried to update boost to 1.60 half a year earlier, but they found there were failing ports and the ABI was incompatible with 1.59. The patch was still useful to me, as I now had a good example. I simply applied the patch, ran portindex again, and went ahead with port upgrade boost.

The upgrade went quite smoothly, though mkvtoolnix, the only installed port on my laptop that depended on boost, failed to run afterwards. I had to port uninstall it and then port install it again (rebuilding it).

After I had gained some confidence, I began to change the port files. I first changed Portfile, which contained the version information and file checksums. Updating them was trivial. When I could see the new version 1.61.0 from port info boost, I kicked off the build with port upgrade boost again.
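For readers who have not updated a checksum before, the new value can be computed with a few lines of Python. The sketch below assumes the Portfile expects a SHA-256 digest among its checksums; the article does not say which checksum types are used, and the helper name sha256_of_file is made up.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        # Read in chunks so that large tarballs need not fit in memory
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a throwaway file; for a real update one would pass
# the path of the downloaded source archive instead.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b'abc')
os.close(fd)
print(sha256_of_file(demo_path))
os.remove(demo_path)
```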

Then came the more painful process of fixing the patch files under devel/boost/files (the ‘patch’ I mentioned a moment ago actually contained patches for these patch files). Most of these MacPorts-specific patch files could be applied without any problems, but one of them failed. It was actually due to trivial code changes in Boost, but I still had to check all the rejections, manually apply the changes, and generate a new patch file. After that, everything went on smoothly.
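Generating a fresh patch file after merging the rejected hunks by hand is normally a job for diff -u. The same idea can be sketched with Python's difflib. The file name and contents below are hypothetical, and this is not the tool the article used.

```python
import difflib


def make_patch(old_text, new_text, path):
    """Return a unified diff that turns old_text into new_text."""
    diff = difflib.unified_diff(
        old_text.splitlines(True),   # keep line endings
        new_text.splitlines(True),
        fromfile=path + '.orig',
        tofile=path,
    )
    return ''.join(diff)


# Hypothetical file contents, before and after a manual fix
old = 'line one\nold setting\nline three\n'
new = 'line one\nnew setting\nline three\n'
print(make_patch(old, new, 'hypothetical/file.jam'))
```

An empty result means the two versions are identical, which is a quick way to check that a patch is no longer needed.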

Against all my hopes, I found that I had to rebuild mkvtoolnix yet again. So the ABI instability is really an issue, and I understand now why boost was stuck at the old version for such a long time. However, I considered my task completed once I had uploaded the updated patch to the MacPorts ticket. At least I have a new working port of boost for myself now. And you can have it too.

A Small Experiment of System Scripting in Python

My main laptop is still on Mac OS X Lion (10.7). I know I am guilty of exposing my laptop to potential security risks,[1] but some of my paid applications do not work on newer OS X versions without an upgrade. I am an austere person and do not want to pay the money yet. In addition, I am also a little bit nostalgic about the skeuomorphic design, though I know some day I will have to use a Mac that has the latest macOS version in order to use new applications. Anyway, I am just procrastinating now, until some sexy new laptop from Apple makes me take out my wallet, or my old laptop goes crazy.

Sorry for this verbose beginning. What I really want to whine about is that Homebrew has stopped supporting my obsolete version of OS X, and I am relying more and more on MacPorts.[2] I even had to rebuild most of my ‘ports’ (the term for packages in MacPorts) because the ‘standard’ way of building ports on Lion does not use libc++, while it is necessary for some ports.[3] Unlike Homebrew, MacPorts does not show whether a dependency of a port is already installed or not. Worse, MacPorts packages often have heavy dependencies. For example, the command-line tool mkvtoolnix currently has 20 (recursive) dependencies in Homebrew, but 60 dependencies in MacPorts. My default compiler is clang-3.7, which has 46 dependencies. That pretty much makes the ‘port rdeps’ command useless.

A Google search showed this port command could be helpful:

port echo rdepof:PORT_NAME and not installed

However, more investigation showed there were several problems:

  • One cannot specify variants (like ‘+openmp’).
  • An option (like ‘configure.compiler=macports-clang-3.7’) can affect dependencies, but options do not have the intended effect in the ‘port echo’ command.
  • The recursion is not ‘cut’ when a port is already installed, which can result in unnecessary ports.

This problem had bothered me for some time before I finally decided to take action. Naturally, the ultimate solution was to write some code. I normally use Bash or Perl for such scripting tasks but, as I have become more and more interested in Python recently, I decided to give Python a try as well, to see how it handles such tasks.

I first wrote a Bash version for comparison purposes. It was not recursive, though (too cumbersome for Bash):

#!/bin/bash
function escape {
  printf "%s" "$1" | sed 's/[.*\[]/\\&/g'
}

INSTALLED=`port installed \
         | sed -n 's/^  \([A-Za-z_][^ ]*\).*/-e ^\1$/p'`
INSTALLED_ESC=`escape "$INSTALLED"`
port deps "$@" | sed -n 's/.*Dependencies:[[:space:]]*//p' \
               | sed $'s/, /\\\n/g' \
               | sort \
               | uniq \
               | grep -v $INSTALLED_ESC

Let me explain the code quickly (assuming you are familiar with the basic use of Bash and common Unix tools). ‘port installed’ returns the installed ports, and every line beginning with two spaces contains a port name followed by other information (like the version). I retrieve the port names and wrap each of them in ‘-e ^…$’. Since they will be used as grep patterns, special characters need to be escaped (practically only ‘.’). I then invoke ‘port deps’ with the command-line arguments, look for lines containing ‘Dependencies:’, get everything after it, split at the commas to get the depended ports, sort the ports, remove duplicates, and filter out all installed ports from the result.

It basically works, and the code is succinct. It is also far from elegant, and quite error-prone. A Bash function feels like a hack. The quotation rules are tricky (when invoking escape, $INSTALLED must be quoted; but when invoking grep, $INSTALLED_ESC must not be quoted). Escaping can easily get problematic when used inside quotation marks. And so on. . . . It is difficult to imagine people can write Bash scripts without some trial and error, even though only a few lines are written.

I knew some Python, but I was not very familiar with it. So I was basically writing while Googling. I got the first version, roughly equivalent to the Bash script, in about two hours:

#!/usr/bin/env python
#coding: utf-8

import re
import sys
import subprocess

# Gets command output as a list of lines
def popen_readlines(cmd):
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    # communicate() drains the pipe before waiting; calling
    # wait() first can deadlock once the pipe buffer fills up
    output = p.communicate()[0]
    if p.returncode != 0:
        raise subprocess.CalledProcessError(p.returncode, \
                                            cmd)
    else:
        return output.splitlines()

# Gets the port name from a line like
# "  gcc6 @6.1.0_0 (active)"
def get_port_name(port_line):
    return re.sub(r'^  (\S+).*', r'\1', port_line)

# Gets installed ports as a set
def get_installed():
    installed_ports_lines = \
            popen_readlines(['port', 'installed'])[1:]
    installed_ports = \
            set(map(get_port_name, installed_ports_lines))
    return installed_ports

# Gets dependencies for the given port list (which may
# contain options etc.), as a list, excluding items in
# ignored_ports
def get_deps(ports, ignored_ports):
    deps_raw = popen_readlines(['port', 'deps'] + ports)
    uninstalled_ports = []
    for line in deps_raw:
        if re.search(r'Dependencies:', line):
            deps = re.sub(r'.*Dependencies:\s*', '', \
                          line).split(', ')
            uninstalled_ports += \
                [x for x in deps if x not in ignored_ports]
            ignored_ports |= set(deps)
    return uninstalled_ports

def main():
    if sys.argv[1:]:
        installed_ports = get_installed()
        uninstalled_ports = get_deps(sys.argv[1:], \
                                     installed_ports)
        for port in uninstalled_ports:
            print port

if __name__ == '__main__':
    main()

A few things immediately came to notice:

  • The code is apparently more verbose than Bash or Perl, but arguably also clearer and more readable.
  • Strings are ubiquitous in Bash, but lists are ubiquitous in Python. Python once offered os.popen and the commands module for piping, but they are deprecated now in favour of the subprocess routines, which accept the command line as a list.
  • The set is a built-in type and is a breeze to use.
  • I/O is not as easy as in Perl (thinking of <> and chomp now), but can be easily simplified with helper functions, as composability is very good.
  • List comprehension and map are very helpful to keep the code concise.
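The points about sets and list comprehensions can be tried without MacPorts by feeding the parsing code some sample text. In the sketch below the sample output is made up to match the format described above, and the function names parse_installed and parse_deps are hypothetical. Unlike the script, it does not remove duplicates.

```python
import re

# Illustrative sample text; the port names and versions are made up.
INSTALLED_SAMPLE = """The following ports are currently installed:
  boost @1.59.0_2 (active)
  zlib @1.2.8_0 (active)
"""
DEPS_SAMPLE = """Full Name: exampleport @1.0_0
Build Dependencies:   pkgconfig
Library Dependencies: boost, zlib, libfoo
"""


def parse_installed(text):
    """Return the set of port names from 'port installed' style text."""
    lines = text.splitlines()[1:]  # skip the heading line
    return set(re.sub(r'^  (\S+).*', r'\1', line) for line in lines)


def parse_deps(text, ignored):
    """Return the dependencies listed in text that are not in ignored."""
    deps = []
    for line in text.splitlines():
        if 'Dependencies:' in line:
            names = re.sub(r'.*Dependencies:\s*', '', line).split(', ')
            deps += [x for x in names if x not in ignored]
    return deps


installed = parse_installed(INSTALLED_SAMPLE)
print(parse_deps(DEPS_SAMPLE, installed))
```

With this sample, only pkgconfig and libfoo remain, because boost and zlib are in the installed set.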

That was not all. The real fun was that it was easy to convert the code to work recursively on all dependencies. I only needed to add or change seven lines of code, at the beginning and end of get_deps:

def get_deps(ports, ignored_ports):
    # New code to end the recursion
    if ports == []:
        return []

    # This part is not changed
    deps_raw = popen_readlines(['port', 'deps'] + ports)
    uninstalled_ports = []
    for line in deps_raw:
        if re.search(r'Dependencies:', line):
            deps = re.sub(r'.*Dependencies:\s*', '', \
                          line).split(', ')
            uninstalled_ports += \
                [x for x in deps if x not in ignored_ports]
            ignored_ports |= set(deps)

    # New code to call recursively and collect the result
    results = []
    for port in uninstalled_ports:
        results.append(port)
        results += get_deps([port], ignored_ports)
    return results
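To see how the shared ignored_ports set cuts the recursion, the same function can be run against a made-up dependency table instead of the real ‘port deps’ output. Everything in FAKE_DEPS is hypothetical. The structure of get_deps follows the listing above.

```python
# Hypothetical dependency table standing in for 'port deps' output
FAKE_DEPS = {
    'app': ['libA', 'libB'],
    'libA': ['libC'],
    'libB': ['libC', 'libD'],
    'libC': [],
    'libD': [],
}


def get_deps(ports, ignored_ports):
    if not ports:
        return []

    uninstalled_ports = []
    for port in ports:
        deps = FAKE_DEPS.get(port, [])
        uninstalled_ports += [x for x in deps if x not in ignored_ports]
        # Everything seen so far is ignored from now on, at any depth
        ignored_ports |= set(deps)

    results = []
    for port in uninstalled_ports:
        results.append(port)
        results += get_deps([port], ignored_ports)
    return results


# Pretend that libD is already installed
print(get_deps(['app'], set(['libD'])))
```

Here libC is reported once, under libA, and is skipped when libB is visited. libD never appears because it starts out in the ignored set. Note that the set passed in is modified by the call.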

The output did not show any indentation yet, and I found another problem later. The improved final code looks as follows:

#!/usr/bin/env python
#coding: utf-8

import re
import sys
import subprocess

# Gets command output as a list of lines
def popen_readlines(cmd):
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    # communicate() drains the pipe before waiting; calling
    # wait() first can deadlock once the pipe buffer fills up
    output = p.communicate()[0]
    if p.returncode != 0:
        raise subprocess.CalledProcessError(p.returncode, \
                                            cmd)
    else:
        return output.splitlines()

# Gets the port name from a line like
# "  gcc6 @6.1.0_0 (active)"
def get_port_name(port_line):
    return re.sub(r'^  (\S+).*', r'\1', port_line)

# Gets installed ports as a set
def get_installed():
    installed_ports_lines = \
            popen_readlines(['port', 'installed'])[1:]
    installed_ports = \
            set(map(get_port_name, installed_ports_lines))
    return installed_ports

# Gets port names from items that may contain version
# specifications, variants, or options
def get_ports(ports_and_specs):
    requested_ports = set()
    for item in ports_and_specs:
        if not (re.search(r'^[-+@]', item) or \
                re.search(r'=', item)):
            requested_ports.add(item)
    return requested_ports

# Gets dependencies for the given port list (which may
# contain options etc.), as a list of tuples (combining
# with level), excluding items in ignored_ports
def get_deps(ports, ignored_ports, level):
    if ports == []:
        return []

    deps_raw = popen_readlines(['port', 'deps'] + ports)
    uninstalled_ports = []
    for line in deps_raw:
        if re.search(r'Dependencies:', line):
            deps = re.sub(r'.*Dependencies:\s*', '', \
                          line).split(', ')
            uninstalled_ports += \
                [x for x in deps if x not in ignored_ports]
            ignored_ports |= set(deps)

    port_level_pairs = []
    for port in uninstalled_ports:
        port_level_pairs += [(port, level)]
        port_level_pairs += get_deps([port], \
                                     ignored_ports, \
                                     level + 1)
    return port_level_pairs

def main():
    if sys.argv[1:]:
        ports_and_specs = sys.argv[1:]
        ignored_ports = get_installed() | \
                        get_ports(ports_and_specs)
        uninstalled_ports = get_deps(ports_and_specs, \
                                     ignored_ports, 0)
        for (port, level) in uninstalled_ports:
            print ' ' * (level * 2) + port

if __name__ == '__main__':
    main()
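The two additions in the final version, filtering the command-line arguments and indenting by level, can be checked in isolation. In this sketch format_tree is a made-up helper that mirrors the print loop in main, and the argument list and (port, level) pairs are invented for illustration.

```python
import re


def get_ports(ports_and_specs):
    """Keep only port names, dropping versions, variants and options."""
    return set(item for item in ports_and_specs
               if not (re.search(r'^[-+@]', item) or '=' in item))


def format_tree(port_level_pairs):
    """Render (port, level) pairs with two spaces per level."""
    return [' ' * (level * 2) + port
            for (port, level) in port_level_pairs]


args = ['exampleport', '+openmp',
        'configure.compiler=macports-clang-3.7']
print(sorted(get_ports(args)))
for line in format_tree([('libA', 0), ('libC', 1), ('libB', 0)]):
    print(line)
```

Adding the requested port names to the ignored set presumably keeps a requested port from showing up in its own dependency list.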

I would say I am very happy, even excited, with the experiment results. No wonder Python has been a great success, despite being verbose and having a slightly weird syntax :-). I guess I will write more Python in the future.

By the way, the code in this article is in Python 2. Python 3 is stricter and even more verbose: I do not see the benefits of using it for system scripting (yet).
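For readers who do want to run the script under Python 3, the helper that needs the most care is popen_readlines, because subprocess output is bytes there unless text mode is requested (the print statements also need parentheses). A rough equivalent, offered as a sketch rather than a tested port of the whole script, might look like this:

```python
import subprocess
import sys


def popen_readlines(cmd):
    """Run cmd and return its standard output as a list of text lines.

    check_output raises CalledProcessError on a non-zero exit status;
    universal_newlines=True makes it return text rather than bytes.
    """
    output = subprocess.check_output(cmd, universal_newlines=True)
    return output.splitlines()


# Demonstration with the Python interpreter itself, so that the
# sketch does not depend on MacPorts being installed.
print(popen_readlines([sys.executable, '-c',
                       'print("a"); print("b")']))
```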


  1. Not really. My MacBook Pro has the firewall turned on, it is behind the home router at nearly all times, and I do not visit strange web sites (not with Safari at least).
  2. Honestly, it is not the fault of Homebrew, or even Apple. However, I do miss the support lifecycle that Microsoft provided for Windows XP. 
  3. For more details, ‘Using libc++ on older system’ explains the why and how.