This week in Questions Five Ways I've assembled a group of testers and tool builders to talk about code analysis and code testing. Kevin Rutherford (@kevinrutherford) is the co-author of the upcoming Ruby Refactoring Workbook and the creator of reek. Andy Lester (@petdance) is a longtime proponent of testing in the Perl community. Kay Johansen (@utahkay) is an agile testing guru. Russ Olsen (@russolsen) is the author of Design Patterns in Ruby. I hope you enjoy the discussion, but please take a moment to add your thoughts in the comments.
What is the right interplay between testing code (unit tests and the like) and static code analysis (lint, Reek, and their ilk)?
Kevin Tools such as Reek, lint, and flay are intended to be helpers for the "refactor" step of TDD. Not everyone is good at finding the smells in their code, and I certainly find that I detect fewer smells in code that I'm familiar with; so having a tool to help my sense of smell can be quite beneficial.
But there are a couple of clear problems with using tools here, and so I'll start off our discussion by trying to phrase them clearly enough for group discussion...
Firstly, tools such as Reek attempt to make a subjective concept (code maintainability) into an objective one. In any codebase there will be code smells we're happy to tolerate — at least in the short term — and yet the tool will continue to nag us about them. (This is why Reek provides a config mechanism that allows certain code elements to be ignored.) On the plus side, Reek sees smells that I miss; and fixing them usually improves my code. But on the minus side, Reek can be more pedantic than I would be, and sometimes I have to spend time doing refactoring just to keep Reek happy. So, is it worth the effort to keep code quality tools quiet?
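As an illustration of the config mechanism Kevin mentions: the exact file format has varied across Reek releases, so treat the schema below (taken from later versions, which read a `.reek.yml` file) as an assumption rather than the precise syntax available at the time of this discussion.

```yaml
# Hypothetical .reek.yml -- detector names are real Reek smells,
# but the schema shown is from later Reek releases.
detectors:
  IrresponsibleModule:
    enabled: false             # silence this detector project-wide
  LongParameterList:
    max_params: 4              # relax the default threshold
    exclude:
      - "OptionParser#Switch#initialize"   # tolerate one known offender
```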
Andy Not everyone is familiar with these tools; in the Perl world we use perlcritic. Reek and perlcritic are very similar, and yet so very different. Here's what perlcritic tells me about some code (PBP stands for Perl Best Practices):
    alester:~/ack : make critic
    perlcritic -1 -q -profile perlcriticrc ack-base Ack.pm Repository.pm
    Ack.pm: Use character classes for literal metachars instead of escapes at line 919, column 15. See page 247 of PBP. (Severity: 1)
    Ack.pm: Null statement (stray semicolon) at line 328, column 14. (no explanation). (Severity: 3)
    Ack.pm: Subroutine with high complexity score (31) at line 183, column 1. Consider refactoring. (Severity: 3)
    Repository.pm: Subroutine does not end with "return" at line 21, column 1. See page 197 of PBP. (Severity: 4)
    Repository.pm: Ambiguously named subroutine "close" at line 45, column 1. See page 48 of PBP. (Severity: 3)
    Repository.pm: Subroutine name is a homonym for builtin function at line 45, column 1. See page 177 of PBP. (Severity: 4)
Kevin Good point Andy. Reek's output looks like this:
    "samples/optparse.rb" -- 116 warnings:
    OptionParser has at least 59 methods (Large Class)
    OptionParser#CompletingHash#match/block/block is nested (Nested Iterators)
    OptionParser#Completion::complete calls candidates.size multiple times (Duplication)
    OptionParser#Completion::complete calls k.id2name multiple times (Duplication)
    OptionParser#Completion::complete has approx 23 statements (Long Method)
    OptionParser#Completion::complete has the variable name 'k' (Uncommunicative Name)
    OptionParser#Completion::complete has the variable name 'v' (Uncommunicative Name)
    OptionParser#Completion::complete refers to candidates more than self (Feature Envy)
    OptionParser#Switch#RequiredArgument#parse is controlled by argument arg (Control Couple)
    OptionParser#Switch#initialize has 7 parameters (Long Parameter List)
    ...
Pat Let's go back to Kevin's question; it's a tough one. I like seeing all of the warnings from a tool like Reek. It makes me question how or why I've written something. On the other hand, having seen a warning once and having made a decision about it, continuing to see it becomes annoying. I hadn't known about configuring Reek to ignore code elements; thanks for pointing it out. Maybe these tools would be less annoying if they kept some metadata and separated new warnings from the ones you've already seen.
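Pat's "seen it before" idea can be sketched directly. Assuming warnings are stable one-line strings (as in the Reek output above), a baseline recorded once would let a wrapper report only what is new; the helper name and sample warnings here are illustrative, not part of any real tool.

```ruby
require "set"

# Given the warnings from the current run and a previously recorded
# baseline, return only the warnings that are new since the baseline.
def new_warnings(current, baseline)
  known = baseline.to_set
  current.reject { |warning| known.include?(warning) }
end

baseline = [
  "OptionParser#Switch#initialize has 7 parameters (Long Parameter List)"
]
current = baseline + [
  "OptionParser#Completion::complete has the variable name 'k' (Uncommunicative Name)"
]

# Prints only the Uncommunicative Name warning; the baselined
# Long Parameter List warning is suppressed.
puts new_warnings(current, baseline)
```

In practice the baseline would be written to a file on the first run and re-read on later runs, much like the per-project config mechanisms these tools already have.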
Andy It can be harder to manage the results of static analysis because they're rarely binary, unlike unit tests, which are pass/fail. With static analysis results, I like to track trends over time, since I rarely have a clean run of Perl::Critic or splint.
Static analysis tools are also hard to manage because their results can often lead to arguments, if only with yourself! Which automated exhortations do we need to follow, and which can we ignore? Is it OK to have this long loop in just this one case? And when you decide to ignore a warning, do you annotate the source code? Source code annotations are subject to the same bit rot as source code itself.
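As a concrete example of the kind of annotation Andy means: later Reek releases (an assumption relative to this discussion's vintage) accept a comment directive that suppresses one detector for the construct below it, keeping the decision next to the code it concerns. The method and its arguments are purely illustrative.

```ruby
# The :reek: directive below tells Reek to skip the Long Parameter
# List check for this one method -- the suppression lives in the
# source, where it can rot just as Andy warns.

# :reek:LongParameterList
def connect(host, port, user, password, timeout)
  "#{user}@#{host}:#{port} (timeout: #{timeout}s)"
end

puts connect("db.example.com", 5432, "app", "secret", 30)
```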
All that said, static analysis is very different with C, where lint, splint and gcc warnings may tell you about actual bugs where memory may get corrupted, not just stylistic improvements.
Russ The thing that you have to keep in mind is that the code analysis tools are optional helpers: they look into your code and tell you things that might be wrong. They might help, they might not, and you can take them or leave them. Tests, on the other hand, are life and death: if you don't have tests, or good tests, or if your code isn't passing the tests you do have, then you have no idea whether your system works. This is true whether you are using Java or C#, but it is particularly true when you are working with dynamically typed languages like Ruby.
Having said that, I think that one good thing that static analysis of code can do that is a real help is to enforce a uniform style on your code, so that everyone is coding more or less the same way. In the long term that has real value.
There is also an organizational aspect to this issue. I've worked for a lot of big, bureaucratic organizations, and the mindset that develops in those kinds of situations is that anything that is not required should be forbidden. So you tend to get mandates: your code shall have test coverage; you shall run this or that code analysis tool. Now it's hard to argue with requirements for test coverage, but mandating the use of this or that secondary tool almost always becomes counterproductive. Of course, the fault here lies not in our tools but in ourselves - or at least in our managers - but it is a factor anyway.
Andy Static analysis is the fun part that lets you refactor. It's the testing code that lets you refactor with confidence.
Kay I find that code written "test-driven" comes out pretty clean. However, I don't always live up to my discipline goals, and of course there's always legacy code to deal with. So I find static analysis tools very helpful in identifying "cleanup" areas.
Kevin My second problem concerns process. I want to remember to run Reek frequently, because whenever I let it slide for a while I do find that my code deteriorates. And so I create a test that runs Reek and fails if there are any smells. But now the existence of this test pushes Reek into the 'Green' step of TDD, and that breaks the TDD micro-process. Using the tool within a test pushes refactoring upstream; no longer can I "do the simplest thing" to get to green, because there's a failing test requiring me to refactor right now. What to do?
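A minimal sketch of the kind of test Kevin describes, assuming Reek's plain-text report format shown earlier (one `"file" -- N warnings` header per file); the helper name and sample strings are illustrative, and in a real suite the report would come from shelling out to `reek`.

```ruby
# Count the smells in a Reek plain-text report by summing the
# per-file "-- N warnings" headers.
def smell_count(report)
  report.scan(/--\s+(\d+)\s+warnings?/).flatten.map(&:to_i).sum
end

clean  = %("lib/clean.rb" -- 0 warnings\n)
smelly = %("samples/optparse.rb" -- 116 warnings:\n)

puts smell_count(clean)   # 0
puts smell_count(smelly)  # 116

# Wrapped in a test, the gate Kevin describes becomes something like:
#   report = `reek lib/`
#   flunk "Code smells found:\n#{report}" unless smell_count(report).zero?
```

It is exactly this gate that turns a refactoring aid into a failing test, which is the process problem Kevin raises.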
Pat It almost seems like we need a different tool here. Testing belongs on the Red/Green border, while analysis belongs on the Green/Refactor border. (Maybe we can talk about Cucumber and similar tools being the Refactor/Red border in another post.) What would happen if there were an umbrella tool that kept us in the specify-test-analyze groove?
Andy What do you mean when you talk about the Red/Green border?
Pat Well, when I think of the Red/Green/Refactor cycle, I see testing as the process that helps us move from "Red" to "Green" by telling us what code to write and confirming that it makes our existing tests pass. Cucumber is similar in that it lets us know which tests to write, so that we can move from refactoring to a red state before we start writing new code.
Kay I prefer to run the static analysis tools on the continuous integration server, freeing my coding time up to focus on the test-code-refactor microcycle. Although to keep me on track, I'll often run a code coverage check on my new code before I check in.
I think consistent style and formatting is important too. I do a lot of Java programming, where IntelliJ does a good job of keeping formatting consistent across the project.