Leftshift’s Weblog

Techniques to improve your code

Cecil

Cecil is a library to inspect and generate programs in the CIL format. CIL is the intermediate language format generated when compiling .NET programs.

The interesting thing about Cecil is that it gives you a model of your code, in a similar way to the System.Reflection namespace in .NET. The really important bit though IMO is not documented very well. You can inspect the code and the debug symbols tradionally stored in a .pdb file. The .pdb file contains information that links the IL instruction being examined back to the source code file. It contains the filename, and start and end points within the file.

One way this information is useful is if you want to calculate the number of lines of code within a method, type or assembly. The debugger defines a set of points of interest in the code known as Sequence Points. Cecil allows you to access this information. To do so download the latest version binaries [0.6]. Three DLLs are included in the download; the main Mono.Cecil.dll plus two libraries for accessing the debug symbols. One library targets standard .NET pdbs and the other targets Mono mdbs. We’re going to use the Mono.Cecil.Pdb.dll in this example. You’ll need to reference both the main and pdb libraries.

To load an assembly, get its main module and load the associated debug symbols you’ll need to do something like this [obviously ensuring that you write the unit tests first]:

string unit = Assembly.GetExecutingAssembly().Location;
AssemblyDefinition assemblyDef = AssemblyFactory.GetAssembly(unit);
ModuleDefinition modDef = assemblyDef.MainModule;
PdbFactory pdbFactory =
new PdbFactory();
ISymbolReader reader = pdbFactory.CreateReader(modDef, unit);
modDef.LoadSymbols(reader);

The first line mearly stores the location of the current executing assembly. Change this to point to whichever assembly you may be interested in. The next two lines get the Cecil Assembly definition and the main module. The rest of the code does the interesting part. It creates a symbol reader using the PdbFactory, and the symbols are loaded by the main module using this reader. From this point on you can get at interesting stuff. An alternative to loading the symbols for a whole module is to use the ISymbolReader.Read method to load the symbols for a MethodBody.

So how can you use this? To navigate the code from the Assembly down to the IL instruction level, Cecil provides the following hierachy

AssemblyDefinition > ModuleDefinitionCollection > TypeDefinitionCollection > MethodDefinitionCollection

A MethodDefinition contains a Body property which contains a list of IL Instructions that make up the code. Each instruction has a SequencePoint property. Note that for a lot of IL Instructions this property is null. This basically means that the debugger is not interested in that instruction. This is to be expected as one line of code in C# can generate many IL instructions. Where it is not null, the sequence point contains the path to the source code file and the start / end column and line within the file. When you debug a project using visual studio it uses the same information to highlight the code in the IDE.

You can use this information to help write metrics tools [such as NDepend]

30 April 2008 Posted by | .NET, Metrics | , , | 1 Comment

Hello World

The main emphasis of this blog is to help developers improve the quality of code they create and maintain.

24 April 2008 Posted by | Code Quality | | Leave a comment