Wednesday, May 27, 2009

Questions Five Ways - Static Code Analysis and Testing

This week in Questions Five Ways I've assembled a group of testers and tool builders to talk about code analysis and code testing. Kevin Rutherford (@kevinrutherford) is the co-author of the upcoming Ruby Refactoring Workbook and the creator of reek. Andy Lester (@petdance) is a longtime proponent of testing in the Perl community. Kay Johansen (@utahkay) is an agile testing guru. Russ Olsen (@russolsen) is the author of Design Patterns in Ruby. I hope you enjoy the discussion, but please take a moment to add your thoughts in the comments.

What is the right interplay between testing code (unit test and the like) and static code analysis (lint, Reek, and their ilk)?

Kevin Tools such as Reek[1], lint, flay are intended to be helpers for the "refactor" step of TDD. Not everyone is good at finding the smells in their code, and I certainly find that I detect fewer smells in code that I'm familiar with; so having a tool to help my sense of smell can be quite beneficial.

But there are a couple of clear problems with using tools here, and so I'll start off our discussion by trying to phrase them clearly enough for group discussion...

Firstly, tools such as Reek attempt to make a subjective concept (code maintainability) into an objective one. In any codebase there will be code smells we're happy to tolerate — at least in the short term — and yet the tool will continue to nag us about them. (This is why Reek provides a config mechanism that allow certain code elements to be ignored.) On the plus side, Reek sees smells that I miss; and fixing them usually improves my code. But on the minus side, Reek can be more pedantic than I would be, and sometimes I have to spend time doing refactoring just to keep Reek happy. So, is it worth the effort to keep code quality tools quiet?

Andy Not everyone is familiar with these tools, in the Perl world we use perlcritic. Reek and perlcritic are very similar, and yet so very different. Here's what perlcritic tells me about some code (PBP stands for Perl Best Practices):

alester:~/ack : make critic
perlcritic -1 -q -profile perlcriticrc ack-base Use character classes for literal metachars instead of escapes at line 919, column 15.  See page 247 of PBP.  (Severity: 1) Null statement (stray semicolon) at line 328, column 14.  (no explanation).  (Severity: 3) Subroutine with high complexity score (31) at line 183, column 1.  Consider refactoring.  (Severity: 3) Subroutine does not end with "return" at line 21, column 1.  See page 197 of PBP.  (Severity: 4) Ambiguously named subroutine "close" at line 45, column 1.  See page 48 of PBP.  (Severity: 3) Subroutine name is a homonym for builtin function at line 45, column 1.  See page 177 of PBP.  (Severity: 4)

Kevin Good point Andy. Reek's output looks like this:

"samples/optparse.rb" -- 116 warnings:
OptionParser has at least 59 methods (Large Class)
OptionParser#CompletingHash#match/block/block is nested (Nested Iterators)
OptionParser#Completion::complete calls candidates.size multiple times (Duplication)
OptionParser#Completion::complete calls k.id2name multiple times (Duplication)
OptionParser#Completion::complete has approx 23 statements (Long Method)
OptionParser#Completion::complete has the variable name 'k' (Uncommunicative Name)
OptionParser#Completion::complete has the variable name 'v' (Uncommunicative Name)
OptionParser#Completion::complete refers to candidates more than self (Feature Envy)
OptionParser#Switch#RequiredArgument#parse is controlled by argument arg (Control Couple)
OptionParser#Switch#initialize has 7 parameters (Long Parameter List)

Pat Let's go back to Kevin's question, it's a tough one. I like seeing all of the warnings from a tool like Reek. It makes me question how or why I've written something. On the other hand, having seen it once and having made a decision about it, continuing to see it becomes annoying. I hadn't known about configuring it to ignore code elements, thanks for pointing it out. Maybe it would be less annoying if tools like this that kept some metadata and separated new warnings from the ones you've seen before.

Andy It can be harder to manage the results of static analysis, because they're rarely binary, unlike unit tests which are pass/fail. With static analysis results, I like to track trends over time, since I rarely ever have a clean run of Perl::Critic or splint.

Static analysis tools are also hard to manage becasue their results can often lead to arguments, if only with yourself! Which automated exhortations do we need to follow, and which can we ignore? Is it OK to have this long loop in just this one case? And when you decide to ignore a setting, do you annotate the source code? Source code annotations are subject to the same bit rot as source code itself.

All that said, static analysis is very different with C, where lint, splint and gcc warnings may tell you about actual bugs where memory may get corrupted, not just stylistic improvements.

Russ The thing that you have to keep in mind is that the code analysis tools are optional helpers: They look into your code and tell you things that might be wrong. They might help, they might not and you can take them or leave them. Tests, on the other hand, are life and death: If you don't have tests or good tests or if your code isn't passing the tests that you do have, then you have no idea whether your system works or not. This is true no matter if you are using Java or C#, but it is particularly true when you are working with dynamically typed languages like Ruby.

Having said that, I think that one good thing that static analysis of code can do that is a real help is to enforce a uniform style on your code, so that everyone is coding more or less the same way. In the long term that has real value.

There is also an organizational aspect to this issue. I've worked for a lot of big, bureaucratic organizations, and the mindset that develops in those kind of situations is that anything that is not required should be forbidden. So you tend to get mandates: Your code shall have test coverage; You shall run this or that code analysis tool. Now it's hard to argue with requirements for test coverage, but mandating the use of this or that secondary tool almost always becomes counter productive. Of course, the fault here lies not in our tools but in ourselves - or at least in our managers - but it is a factor anyway.

Andy Static analysis is the fun part that lets you refactor. It's the testing code that lets you refactor with confidence.

Kay I find that code written "test-driven" comes out pretty clean. However, I don't always live up to my discipline goals, and of course there's always legacy code to deal with. So I find static analysis tools very helpful in identifying "cleanup" areas.

Kevin My second problem concerns process. I want to remember to run Reek frequently, because whenever I let it slide for a while I do find that my code deteriorates. And so I create a test that runs Reek and fails if there are any smells. But now the existence of this test pushes Reek into the 'Green' step of TDD, and that breaks the TDD micro-process. Using the tool within a test pushes refactoring upstream; no longer can I "do the simplest thing" to get to green, because there's a failing test requiring me to refactor right now. What to do?

Pat It almost seems like we need a different tool here. Testing belongs on the Red/Green border, while analysis belongs on the Green/Refactor border. (Maybe we can talk about Cucumber[2] and similar tools being the Refactor/Red border in other post.) What would happen if there were an umbrella tool that kept us in the specify-test-analylize groove?

Andy What do you mean when you talk about the Red/Green border?

Pat Well, when I think of Red/Green/Refactor cycle, I see testing as the process that helps us move from "Red" to "Green" by letting us know what code to write and that it makes our existing tests pass. Cucumber is similar in that it lets us know which tests to write so that we can move from refactoring to to a red state before we start writing new code.

Kay I prefer to run the static analysis tools on the continuous integration server, freeing my coding time up to focus on the test-code-refactor microcycle. Although to keep me on track, I'll often run a code coverage on my new code before I check in.

I think consistent style and formatting is important too. I do a lot of Java programming, where IntelliJ does a good job of keeping formatting consistent across the project.

1 You can read more about reek in my interview with Kevin.

2 Ben Mabey gave a great presentation on cucumber at the 2009 MountainWest RubyConf.

Click here to Tweet this article


matt harrison said...

Interesting comments. I think both refactoring and testing benefit from an understanding of code complexity, but no one mentioned it. It sort of ties everything together since it allows you to know when to refactor something as well as gives an indication of testing completeness. Sadly the coverage tools to provide branch/path coverage don't seem to exist in the dynamic language world. (I'm more concerned with Python, but last time I checked they were lacking in perl/ruby).

Anyway, I gave talk at pycon a while back discussing this. Perl/ruby folks don't be afraid, though it was done at a Python conference the ideas apply everywhere.

Re: where linting tools should reside. I've got emacs running pyflakes (a linting tool) using flymake-mode and outlining issues. I'm not sure a pass/fail test is the best place to put this because as you mentioned some of it is subjective.

gnupate said...

Hi Matt,
Thanks for the comment, and the link.

I'd love to hear more about pyflakes and flymake-mode if you're interested in sharing.

marcoow said...

Interesting discussion. I played with code analysis for quite some time now (and for the most part with c# where static analysis is completely different of course since it's a static language) and build excellent in the last couple of weeks. It combines concepts and checks of roodi, reek and flog and also adds Rails specific checks. I think especially those Rails specific checks can be really helpful since they don't check general language patterns but concrete framework specific rules.

I would be interested about hearing what you think about it!


gnupate said...

excellent sounds really interesting. How modular is it? Could it be used with non-rails frameworks too?

marcoow said...

@pate: Excellent does not directly relate to Rails at all. It just performs the Rails related checks if the Code looks like Rails Code (class derives from ActiveRecord::Base or file is named like a Rails partial). In general, Excellent checks any Ruby Code you might have. Check the documentation (especially that for SexpContext and Warnings::Base) and the source for help on how to build your own checks.


chorny said...

matt harrison: Perl's Devel::Cover exists from 2001 and is very popular. See for example these presentations (don't know anything about their quality): Testing Code and Assuring Quality, Test Driven Development Tutorial and other presentations in Perl programming group on Slideshare.

gnupate said...

I haven't been too involved in the Perl world since about 2000. I'm always interested in learning about the Devel tools that exist there though. So may of them bring great insights that can be pulled into other languages. Thanks for the pointers.

gnupate said...

cool. It will be interesting to see what kind of community grows around it, and what kinds checks are written.

Have you considered building a check repository/library?

marcoow said...

@pate: Excellent includes about 15 checks by now and I'm definitely planning on extending that number. I absolutely would appreciate suggestions for new checks to write and include in the gem!

Kevin Rutherford said...

Since this discussion I've (with the help of Marty Andrews) begun a spike implementation of the CRAP metric for Ruby code; watch for progress in the coming weeks.

gnupate said...

Very cool. Do you have a mailing list/google group set up? It seems like that would be the ideal way to get discussion flowing about excellent.

Nice, between ABC and Cyclometric complexity checking Ruby's starting to get a good toolkit built.

marcoow said...

I just created the excellent group. I hope lots of people will join and come up with interesting suggestions/ ideas and criticue.

marcoow said...

Looks like the link got lost in my last post: