As part of a continuous integration cycle most people consider running unit and integration tests. Some even consider running automated acceptance tests. Fewer still focus on code quality tests. To ensure code is maintainable requires a certain amount of effort as the code changes. I think this is what the refactor stage of the TDD red, green, refactor cycle alludes to. As well as refactoring code to remove duplication, there are other considerations to be made with regards to maintainability. We use six indicators to give a finger in the air estimate of the maintainability of a code base. The indicators we use are as follows:
Unit Test Coverage High test coverage is a good indicator of whether a TDD approach is being followed, and if not an optimistic percentage of the chance of a bug being caught. Said another way, If a bug is introduced into the code the chance of it being caught is at best the percentage of code covered by tests. This very much depends on the quality of the tests, but if you only have fifty percent coverage and introduce a bug, it’s a coin toss whether it’s detected. If the tests are poor the real figure is much lower than fifty percent.
Percentage of large methods Fairly obvious this one, but large methods are harder to maintain because they contain more code. There is more scope for error, less accuracy for identifying the cause of any error [any unit test covers more code] and a greater chance that the method is breaking the single responsibility principle giving it more than one reason to change. What you consider a large method is up to you, but we have been using ten lines of code as our measure.
Class Cohesion For a class to be cohesive all methods should use all fields. We use the lack of cohesion of methods henderson-sellers formula to measure this one. If a class isn’t cohesive it’s an indicator that unrelated functionality could be split into it’s own class. In other words it has more than one reason to change and is therefore breaking the single responsibility principle.
Package Cohesion For a package or assembly to be cohesive the classes inside the package should be strongly related. This is a measure of the average number of type relationships within a package. Low cohesion suggests that the types can be split into seperate packages.
Class Coupling This is a measure of the number of types that depend on a particular type a.k.a. afferent coupling. If a high number of types depend on the class in question, making changes to it will be hard without breaking lots of client code. There are a number of reasons why this might occur. Responsibility for one aspect may be split among multiple classes, but more likely you don’t have a losely coupled design.
Package Coupling This is a measure of the number of types outside this package or assembly that depend upon types within this package. One possible reason for high coupling is a packaging problem – things that change together should stay together. Another reason is that the packages in question have many responsibilities.
I’d love to hear feedback on the way you measure the maintability of code.
Using relative complexity in your estimation is quite an effective estimating technique. To do this you would normally associate a complexity score to each feature or story. The scale a lot of people used is based on the fibonnaci sequence where next n=previous (previous (n)) + previous (n). You start with 0 and 1 to obtain the following sequnce 1,2,3,5,8,13 etc. You assign points based on the relative complexity where a 2 point story would be roughly twice as complicated as a 1 point story. The power of this approach comes from the comparison – we seem to be naturally better at this than assigning absolute values.
If you take this approach it is very useful to measure how long each story took to implement. With this information in hand you can draw a picture similar to the following
On the top side of the diagram you have the complexity points, scaled appropriately. On the bottom side of the diagram you have implementation time again scaled appropriately. If you plot all of the features on the bottom in the correct position with regards to the time to implement them and draw a straight line back to the complexity point estimate [I know my picture above doesn’t use straight lines, I’ll update the post with a better example] you’ll end up with something along the lines of the diagram above.
In an ideal world all of the features that were estimated as 1 complexity point will have taken you less time to develop than those you estimated as 2 complexity points and so on. Where the lines tangle [cross each other] suggest where your estimation could be improved. Your retrospective can be used to address such issues.