All programmers tend to agree that repeating yourself [in code] should be avoided. This is commonly known by the acronym DRY.
After looking at some demos of a search engine today a light bulb lit up in my head [not an energy effecient one I’m afraid]. The interface to the search allowed you to select from different categories and the result was dynamically updated. Nothing remarkable in that but wouldn’t it be great if we could search our code this way? And I’m not talking about the ability to find something in just your project. What if you could search the entire enterprise code based on filters for method names, return types etc right within the IDE. Every time you write a new method or class you could quickly find similar pieces of code within the enterprise. Now this may lead to some time pressured / lesser experienced developers to copy and paste more, but for me it would identify areas where it makes sense to provide libraries / services based on the actual coding needs of the developers. It would also help target problem areas across different projects. I may try and knock up a proof of concept in .NET using something like Lucene as the search engine. Not sure how this might work just yet, but it would either examine the source itself or use static analysis, something like cecil and the approach I’ve talked about before. Does anybody else think this is a good idea? What should the filters be?
Bob Martin has been thinking about adding a new element to the agile manifesto around producing quality rather than quantity. He’s described this as ‘Craftmanship over Execution’. To back this up you can follow the instructions here and measure the amount of WTF’s per minute. A great idea for a metric, but hard to automate. Maybe an idea for a new startup; provide metrics for code in the same way third party companies perform penatration testing.
As part of a continuous integration cycle most people consider running unit and integration tests. Some even consider running automated acceptance tests. Fewer still focus on code quality tests. To ensure code is maintainable requires a certain amount of effort as the code changes. I think this is what the refactor stage of the TDD red, green, refactor cycle alludes to. As well as refactoring code to remove duplication, there are other considerations to be made with regards to maintainability. We use six indicators to give a finger in the air estimate of the maintainability of a code base. The indicators we use are as follows:
Unit Test Coverage High test coverage is a good indicator of whether a TDD approach is being followed, and if not an optimistic percentage of the chance of a bug being caught. Said another way, If a bug is introduced into the code the chance of it being caught is at best the percentage of code covered by tests. This very much depends on the quality of the tests, but if you only have fifty percent coverage and introduce a bug, it’s a coin toss whether it’s detected. If the tests are poor the real figure is much lower than fifty percent.
Percentage of large methods Fairly obvious this one, but large methods are harder to maintain because they contain more code. There is more scope for error, less accuracy for identifying the cause of any error [any unit test covers more code] and a greater chance that the method is breaking the single responsibility principle giving it more than one reason to change. What you consider a large method is up to you, but we have been using ten lines of code as our measure.
Class Cohesion For a class to be cohesive all methods should use all fields. We use the lack of cohesion of methods henderson-sellers formula to measure this one. If a class isn’t cohesive it’s an indicator that unrelated functionality could be split into it’s own class. In other words it has more than one reason to change and is therefore breaking the single responsibility principle.
Package Cohesion For a package or assembly to be cohesive the classes inside the package should be strongly related. This is a measure of the average number of type relationships within a package. Low cohesion suggests that the types can be split into seperate packages.
Class Coupling This is a measure of the number of types that depend on a particular type a.k.a. afferent coupling. If a high number of types depend on the class in question, making changes to it will be hard without breaking lots of client code. There are a number of reasons why this might occur. Responsibility for one aspect may be split among multiple classes, but more likely you don’t have a losely coupled design.
Package Coupling This is a measure of the number of types outside this package or assembly that depend upon types within this package. One possible reason for high coupling is a packaging problem – things that change together should stay together. Another reason is that the packages in question have many responsibilities.
I’d love to hear feedback on the way you measure the maintability of code.
JetBrains TeamCity is a build management and continuous integration platform which supports .NET and Java. Having set up TeamCity and played with it for a couple of weeks, I’m very impressed by the slick UI and features provided out of the box. These include a set of code quality features. Even better is the fact that the professional version is free.
One of the things that really impressed me is the duplicates finder. As the name suggests it detects duplicate code and currently works with Java, C# [up to 2.0] and VB [up to 8.0]. This helps you target the areas that need refactoring.
Alongside the duplicates [Java example above] a ‘cost’ is calculated. I’m not sure of the algorithm used, but it seems fairly sensible and the cost has some relation to the amount of code that is repeated. You can use this to help prioritise your refactoring. To setup your build simply set the runner to be the duplicates finder. Continue reading
The main emphasis of this blog is to help developers improve the quality of code they create and maintain.