Quantcast
Channel: Object Mentor Blog: Category Dean's Deprecations
Viewing all articles
Browse latest Browse all 46

Observations on TDD in C++ (long)

$
0
0

I spent all of June mentoring teams on TDD in C++ with some Java. While C++ was my language of choice through most of the 90’s, I think far too many teams are using it today when there are better options for their particular needs.

During the month, I took notes on all the ways that C++ development is less productive than development in languages like Java, particular if you try to practice TDD. I’m not trying to start a language flame war. There are times when C++ is the appropriate tool, as we’ll see.

Most of the points below have been discussed before, but it is useful to list them in one place and to highlight a few particular observations.

Based on my observations last month, as well as previously experience, I’ve come to the conclusion that TDD in C++ is about an order of magnitude slower than TDD in Java. Mostly, this is due to poor or non-existent tool support for automated refactorings, no error detection as you type, and the requirement to compile and link an executable test.

So, here is my list of impediments that I encountered last month. I’ll mostly use Java as the comparison language, but the arguments are more or less the same for C# and the popular dynamic languages, like Ruby, Python, and Smalltalk. Note that the dynamic languages tend to have less complete tool support, but they make up for it in other ways (off-topic for this blog).

Getting Started

There is more setup effort involved in configuring your build environment to use your chosen unit testing framework (e.g., CppUnit) and to create small executables, one each for a single or a few tests. Creating many small tests, rather than one big test (e.g., a variant of the actual application). This is important to minimize the TDD cycle.

Fortunately, this setup is a one-time “charge”. The harder part, if you have legacy code, is refactoring it to break hard dependencies so you can write unit tests. This is true for legacy code in any language, of course.

Complex Syntax

C++ has a very complex syntax. This makes it hard to parse, limiting the capabilities of automated tools and slowing build times (more below).

The syntax also makes it harder to program in the language and not just for novices. Even for experts, the visual noise of pointer and reference syntax obscures the story the code is trying to tell. That is, C++ code is inherently less clean than code in most other languages in widespread use.

Also, the need for the developer to remember whether each variable is a pointer, a reference, or a “value”, and how to manage its life-cycle, requires mental effort that could be applied to the logic of the code instead.

Obsolete Tool Support

No editor or IDE supports non-trivial, automated refactorings. (Some do simple refactorings like “rename”.) This means you have to resort to tedious, slow, and error-prone manual refactorings. Extract Method is made worse by the fact that you usually have to edit two files, an implementation and a header file.

There are no widely-used tools that provide on-the-fly parsing and error indications. This alone increases the time between typing an error and learning about it by an order of magnitude. Since a build is usually required, you tend to type a lot between builds, thereby learning about many errors at once. Working through them takes time. (There may be some commercial tools with limited support for on-the-fly parsing, but they are not widely used.)

Similarly, none of the common development tools support incremental loading of object code that could be used for faster unit testing and hence a faster TDD cycle. Most teams just build executables. Even when they structure the build process to generate small, focused executables for unit tests, the TDD cycle times remain much longer than for Java.

Finally, while there is at least one mocking framework available for C++, it is much harder to use than comparable frameworks in newer languages.

Manual Memory Management

We all know that manual memory management leads to time spent finding and fixing memory errors and leaks. Avoiding these problems in the first place also consumes a lot of thought and design effort. In Java, you just spend far less time thinking about “who owns this object and is therefore responsible for managing its life-cycle”.

Dependency Management

Intelligent handling of include directives is entirely up to the developer. We have all used the following “guard” idiom:

    #ifndef MY_CLASS_H
    #define MY_CLASS_H
    ...
    #endif

Unfortunately, this isn’t good enough. The file will still get opened and read in its entirety every time it is included. You could also put the guard directives around the include statement:

    #ifndef MY_CLASS_H
    #include "myclass.h"
    #endif

This is tedious and few people do it, but it does avoid the wasted file I/O.

Finally, too few people simply declare a required class with no body:

    class MyClass;

This is sufficient when one header references another class as a pointer or reference. In our experience with clients, we have often seen build times improve significantly when teams cleaned up their header file usage and dependencies, in general. Still, why is all this necessary in the 21st century?

This problem is made worse by the unfortunate inclusion of private and protected declarations in the same header file included by clients of the class. This creates phantom dependencies from the clients to class details that they can’t access directly.

Other Debugging Issues

Limited or non-existent context information when an exception is thrown makes the origin of the exception harder to find. To fill the gap, you tend to spend more time adding this information manually through logging statements in catch blocks, etc.

The std::exception class doesn’t appear to have a std::string or const char* argument in a constructor for a message. You could just throw a string, but that precludes using an exception class with a meaningful name.

Compiler error messages are hard to read and often misleading. In part this is due to the complexity of the syntax and the parsing problem mentioned previously. Errors involving template usage are particular hard to debug.

Reflection and Metaprogramming

Many of the productivity gains from using dynamic languages and (to a lesser extent) Java and C# are due to their reflection and metaprogramming facilities. C++ relies more on template metaprogramming, rather than APIs or other built-in language features that are easier to use and more full-featured. Preprocessor hacks are also used frequently. Better reflection and metaprogramming support would permit more robust proxy or aspect solutions to be used. (However, to be fair, sometimes a preprocessor hack has the virtue of being “the simplest thing that could possibly work.”)

Library Issues

Speaking of std::string and char*, it is hard to avoid writing two versions of methods, one which takes const std::string& arguments and one which takes const char* arguments. It doesn’t matter that one method can usually delegate to the other one; this is wasted effort.

Discussion

So, C++ makes it hard for me to work the way that I want to work today, which is test-driven, creating clean code that works. That’s why I rarely choose it for a project.

However, to be fair, there are legitimate reasons for almost all of the perceived “deficiencies” listed above. C++ emphasizes performance and backwards-compatibility with C over all other considerations. However, they come at the expense of other interests, like effective TDD.

It is a good thing that we have languages that were designed with performance as the top design goal, because there are circumstances where performance is the number one requirement. However, most teams that use C++ as their primary language are making an optimal choice for, say, 10% of their code, but which is suboptimal the other 90%. Your numbers will vary; I picked 10% vs. 90% based on the fact that performance bottlenecks are usually localized and they should be found by actual measurements, not guesses!

Workarounds

If it’s true that TDD is an order of magnitude slower for C++ then what do we do? No doubt really good C++ developers have optimized their processes as best as they can, but in the end, you will just have to live with longer TDD cycles. Instead of write just enough test to fail, make it pass, refactor, it will be more like write a complete test, write the implementation, build it, fix the compilation errors, run it, fix the logic errors to make the test pass, and then refactor.

A Real Resolution?

You could consider switching to the D language, which is link compatible with C and appears to avoid many of the problems described above.

There is another way out of the dilemma of needing optimal performance some of the time and optimal productivity the rest of the time; use more than one language. I’ll discuss this idea in my next blog.


Viewing all articles
Browse latest Browse all 46

Trending Articles