Why boost doesn’t boost my productivity

by tanjeff

I’ve been working with the boost libraries for several years now. My experience is that boost doesn’t boost a developer’s productivity. I recently started using Qt and found that development became faster and my code got shorter and more concise. However, I guess that using C++ in the first place isn’t contemporary any more. For example, Java seems to have better tooling and debugging support and more complete libraries (at least, Java developers told me so). And even Java is already getting competition, e.g. from Scala. Scripting languages like Python are also suitable for application development, and developers can be more productive with them than with C++.

But this post concerns boost. So let me explain why I consider the boost libraries to constrain developer productivity.

The C++03 standard template library provides some important functionality like containers, strings and I/O streams. However, it also lacks a lot of features. For example, the std::string class doesn’t have a trim() function. Also, the STL has no support for threads, smart pointers, networking or date/time calculation. This is where the boost libraries jump in. The boost API follows the design principles of the STL API and has a similar feel. By the way: this is the reason why the boost problems also apply to the STL (or vice versa?). The STL, combined with boost, makes up a more-or-less complete standard library. Some of the boost libraries even made it into the C++11 standard, so the STL is more complete now, but the functionality of STL+boost didn’t change much, as far as I know.
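To make the trim() gap concrete, here is a minimal sketch of what has to be written by hand in plain C++03. The names trim() and trim_example() are made up for this illustration; they are not part of the STL or boost.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the STL or boost): returns a copy of `s`
// without leading and trailing whitespace.
std::string trim(const std::string& s)
{
    const char* const whitespace = " \t\n\r\f\v";
    const std::string::size_type first = s.find_first_not_of(whitespace);
    if (first == std::string::npos) {
        return std::string();  // empty or whitespace-only input
    }
    const std::string::size_type last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}

// Small usage example.
void trim_example()
{
    assert(trim("  hello world \n") == "hello world");
    assert(trim("   ").empty());
}
```

It is only a few lines, but every project that sticks to the plain STL ends up carrying a helper like this around.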

It turns out that the boost APIs tend to make easy things hard. For example, I once had to split a string at semicolons: “first;second;third”. I easily found the split() function provided by boost. The split() reference documentation states:

template<typename SequenceSequenceT,
         typename RangeT,
         typename PredicateT>
SequenceSequenceT &
split(SequenceSequenceT & Result, RangeT & Input, PredicateT Pred,
      token_compress_mode_type eCompress = token_compress_off);

The documentation says that one can provide a std::string as Input and a std::vector<> as Result. The Pred parameter must be a function returning true for elements which are considered separators. However, the required function prototype is not given. Furthermore, it is not stated that boost already provides such a function: is_any_of() (and if you look at its documentation you will notice that it takes a “RangeT” as parameter, and that it remains unclear what the heck you can actually feed into it). There is also no example, nor a link to an example, although the documentation actually has an example (which also suggests that you can give a “const char*” to is_any_of()). This example in turn has no link to the reference documentation for split() or is_any_of(). And by the way: what does “token_compress_off” mean? Badly organized documentation is typical for boost and makes the libraries hard to learn. For comparison, here is how QString’s split() method is documented (Qt version 4.8):

QStringList QString::split (const QString & sep,
                SplitBehavior behavior = KeepEmptyParts,
                Qt::CaseSensitivity cs = Qt::CaseSensitive ) const

To me, this looks pretty simple, and it is indeed simple to use and works out of the box. To be honest, I must admit that boost::split() is more versatile. For example, you can store the result into a container of your choice, not only std::vector<>, while QString::split() always returns a QStringList. However, it is easy to convert the QStringList into a QSet, QVector or std::list container.
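For the semicolon example, the boost call boils down to boost::split(parts, input, boost::is_any_of(";")). A convenience function that takes the separators as a plain string, similar in spirit to QString::split(), could look like the following sketch. It uses only the standard library, and the names split_any_of() and split_example() are hypothetical, chosen for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: splits `input` at every character contained in
// `separators`. Empty parts are kept, so ";a;" yields "", "a" and "".
std::vector<std::string> split_any_of(const std::string& input,
                                      const std::string& separators)
{
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type pos =
            input.find_first_of(separators, start);
        if (pos == std::string::npos) {
            parts.push_back(input.substr(start));  // last (or only) part
            return parts;
        }
        parts.push_back(input.substr(start, pos - start));
        start = pos + 1;  // continue right after the separator
    }
}

// Small usage example with the string from the text.
void split_example()
{
    const std::vector<std::string> parts =
        split_any_of("first;second;third", ";");
    assert(parts.size() == 3);
    assert(parts[0] == "first");
    assert(parts[2] == "third");
}
```

The sketch gives up the generality of boost::split() (any result container, any predicate) in exchange for a signature that can be understood at a glance.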

Next, here is an example demonstrating how to generate a random number using boost:

boost::random::mt19937 rng;
boost::random::uniform_int_distribution<> six(1,6);
int x = six(rng);

Only three lines of code, but I hardly understand any of them. Why do I need to create two objects to generate a single random number? What does the class name mt19937 mean? It turns out that mt19937 is a random number generation algorithm (the Mersenne Twister), and there are more of them (e.g. taus88 or mt11213b). Good for simulation specialists, but to simulate a die, it’s too complicated, so I’d prefer to go with the plain old C rand() function. Badly designed, over-generalized APIs are typical for boost. The boost::split() function is also an example of this: why is it not possible to give the separators as a string instead of using a predicate? At least an overloaded function could have been provided.
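If the generator and the distribution are to be kept, both objects can at least be hidden behind a single call. The following is a minimal sketch using the C++11 <random> header, which mirrors the boost classes; it assumes a C++11 compiler, and the name roll_die() is hypothetical. Unlike the three-line snippet, where the default-constructed generator starts from the same fixed seed on every run, the sketch seeds the generator once from std::random_device.

```cpp
#include <cassert>
#include <random>

// Hypothetical convenience wrapper: hides generator and distribution
// behind one call and returns a number from 1 to 6 (inclusive).
int roll_die()
{
    // One generator per thread, seeded once from the system's random device.
    static thread_local std::mt19937 rng(std::random_device{}());
    std::uniform_int_distribution<int> six(1, 6);
    return six(rng);
}

// Small usage example: every roll stays within the range of a die.
void roll_example()
{
    for (int i = 0; i < 100; ++i) {
        const int x = roll_die();
        assert(x >= 1 && x <= 6);
    }
}
```

The caller then writes int x = roll_die(); and never sees the two objects.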

Another experience was the network code I wrote for agentXcpp using boost::asio. I wanted a read() method with a timeout, but boost::asio doesn’t have one. I implemented a quite complicated machinery, with an asynchronous timer and an asynchronous read operation, waited until one of them completed, then looked up which one it was. But it was not possible to stop the timer without firing its callback, so the callback had to handle erroneous invocations. And in case of a timeout, the read operation could not be stopped either. All these fiddly details ate up a lot of time, and the resulting code was complicated and hard to understand: easy things are hard to do with boost. By switching to QtNetwork, the code size halved, and the code got a lot more understandable. And it took me only two days to write it, even though I had never programmed with Qt before. Boost didn’t boost me here, while Qt did.

In another project I used boost::variant in conjunction with boost::serialization. I created a boost variant which could hold a long, double, std::string, or even a std::list of itself, i.e. it was a recursive variant. This was not too hard to implement. Then, I created the methods serialize() and deserialize() to convert such variants into strings and back, using boost::serialization. Boost::serialization supports boost::variant out of the box, even with recursion and all, so the task was easy to do, and worked like a charm. At first. Then, problems emerged: someone stored the double value ‘NaN’ into such a variant and serialized it, which worked, then deserialized it again, which failed. We had some trouble finding the error, because the problem showed up in a different place than where the programming error was located. We finally found the error, and I had to add code to serialize() to handle ‘NaN’ values. Next, someone got a program crash with an unreadable exception message. Long, unreadable template error messages are also characteristic of boost. Again, it took us time to find the error. The problem was that the variant stuff was compiled with -ansi (using g++), while the code using it was compiled without -ansi. This is normally no problem, but boost::variant couldn’t handle it. And the cryptic error message didn’t help at all. After my boost::asio vs. QtNetwork experience I suppose that using Qt instead of boost would have saved us a lot of time (at least QVariant combined with QDataStream happily handles ‘NaN’ without problems; I tried it).

As I mentioned earlier, the STL is likewise hard to use. Want to find out whether an element is within a std::map? Use find() and compare to the end() iterator. In Qt you could use the QMap::contains() method, which is shorter and more concise. Want to remove b’s elements from a, where a and b are std::sets? You may use std::set_difference(), which takes five iterators and needs a third container to place the result into. Or you write your own code, which is even shorter than using std::set_difference(). Example with std::set_difference():

set<string> a;
set<string> b;

// Remove b's elements from a, using std::set_difference:
set<string> result;
set_difference(a.begin(), a.end(),
               b.begin(), b.end(),
               inserter(result, result.begin()));
a = result;

And here is a self-made version:

set<string> a;
set<string> b;

// Remove b's elements from a, self-made:
for(set<string>::iterator i = b.begin(); i != b.end(); ++i) {
    a.erase(*i);
}

Note that the first form even needs an inserter iterator, because std::set_difference() writes its result through an output iterator and the elements of a std::set cannot be overwritten through its iterators. For std::vector the inserter would not be needed, but then you have to ensure that the vector is big enough, using resize(), since it is impossible to add elements to a container using only a plain iterator. Maybe it’s even easier to use an inserter iterator here, too. Using iterators for std::set_difference() is again an over-generalization, which makes life really hard in this case. For completeness, here is how to do it with Qt (using Qt types, though):

QSet<QString> a;
QSet<QString> b;

// Remove b's elements from a, Qt style:
a.subtract(b);

This version is short and directly expresses what the code does. You don’t need to learn about iterator categories or how to obtain an inserter. You can simply do your job. And if you want the result to go to a third container, you could copy ‘a’ and then call subtract() on the copy.
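The same kind of interface can be approximated for the STL containers with two thin wrappers. This is only a sketch: the free functions contains() and subtract() (and wrappers_example()) are hypothetical names modelled on QMap::contains() and QSet::subtract(); they are not part of the STL or boost.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical helper: true if `key` is present in the map.
// The key parameter is written as key_type so that e.g. a string literal
// converts to a std::string key.
template <typename Key, typename Value>
bool contains(const std::map<Key, Value>& m,
              const typename std::map<Key, Value>::key_type& key)
{
    return m.find(key) != m.end();
}

// Hypothetical helper: removes all of b's elements from a.
template <typename T>
void subtract(std::set<T>& a, const std::set<T>& b)
{
    if (&a == &b) {  // subtracting a set from itself leaves nothing
        a.clear();
        return;
    }
    for (typename std::set<T>::const_iterator i = b.begin();
         i != b.end(); ++i) {
        a.erase(*i);
    }
}

// Small usage example.
void wrappers_example()
{
    std::map<std::string, int> ages;
    ages["alice"] = 30;
    assert(contains(ages, "alice"));
    assert(!contains(ages, "bob"));

    std::set<std::string> a;
    std::set<std::string> b;
    a.insert("x");
    a.insert("y");
    b.insert("y");
    subtract(a, b);
    assert(a.size() == 1 && a.count("x") == 1);
}
```

With these in place, the call sites read subtract(a, b) and contains(m, key), which is close to the Qt versions, but the wrappers still have to be written and maintained by hand.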

To sum it up: boost is poorly documented, provides over-generalized, hard-to-use APIs and produces unreadable, misleading error messages, often caused by templates. The same is also true for the STL.

Note that all given examples arose in real projects. I didn’t invent weird code snippets to blame boost or the STL. In my experience, boost and the STL are simply inefficient in terms of developer productivity. I think that one can do better using Qt. Using C++11 also contributes some simplifications, e.g. when iterating over containers (although Qt can help here, too, with its foreach macro). However, for new projects I personally would consider choosing another language. They exist for good reasons.

So one question remains: Why is agentXcpp actually written in C++? Well, when I started the project, I didn’t have much experience with C++, nor with the STL or boost. I learned while working on the project. And as a consequence, I’m currently banning boost from agentXcpp, and I’m going to ban the STL, too. And who knows, maybe I will even ban C++ some day ;-)
