Metrics - LoC
This is going to be a small series of articles devoted to metrics. The first one is about LoC: Lines of Code.
I think the first reaction to that phrase is a smile. Many of us have heard the anecdote from Apple about an attempt to measure productivity by counting lines of code. There is even a classic saying attributed to Ken Thompson: "One of my most productive days was throwing away 1000 lines of code".
Actually, counting physical SLOC (source lines of code) is routine and boring: we simply count the lines one by one, including the code lines themselves, comments and blank lines. There is one exception: if blank lines make up more than 25% of the lines, the blank lines in excess of 25% are not counted.
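A minimal sketch of that counting rule in Python. The function name `physical_sloc` and the `blank_cap` parameter are my own for illustration, and I assume the 25% limit is taken relative to the total number of physical lines in the text being counted:

```python
def physical_sloc(source: str, blank_cap: float = 0.25) -> int:
    """Count physical lines; blank lines count only up to blank_cap of all lines."""
    lines = source.splitlines()
    total = len(lines)
    blank = sum(1 for line in lines if not line.strip())
    non_blank = total - blank
    # Assumed reading of the rule: blank lines beyond 25% of the total are ignored.
    allowed_blank = int(total * blank_cap)
    return non_blank + min(blank, allowed_blank)


if __name__ == "__main__":
    sample = "a\n\n\n\nb\n\n\nc"  # 8 physical lines, 5 of them blank
    print(physical_sloc(sample))  # 3 non-blank + 2 allowed blank lines = 5
```

Comments need no special handling here, since a comment line is counted like any other physical line.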
The obvious improvement is a method that counts not only physical lines but can also separate the lines, somehow, according to their meaning. For example:
// Instructions alloca and PHI will be skipped, define helper indicator
unsigned ind = 1;
Here we have two physical lines. However, if we distinguish these lines by their logical meaning, we have one line of code and one line of comment. The questions are: should we count everything as code? Should we throw the comment line out, and if “yes”, then why? The comment explains something that is not obvious from the code. How do we tell a useful comment from a useless one? Formal criteria give answers only in the most trivial cases. For example, comments generated by an IDE wizard are easy to filter out. For more complicated cases only AI could help. Then there is commented-out code, which turns up very often. Having fun already?
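The purely formal part of this separation can be sketched in a few lines of Python. This is a deliberately naive illustration: `classify_lines` is a hypothetical helper, and it recognises only whole-line `//` comments. It ignores `/* ... */` blocks and trailing comments, and it cannot tell a useful comment from a useless one or from commented-out code:

```python
def classify_lines(source: str) -> dict:
    """Split physical lines into code, comment and blank lines (naive rules)."""
    counts = {"code": 0, "comment": 0, "blank": 0}
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            counts["blank"] += 1
        elif stripped.startswith("//"):
            # Only whole-line '//' comments are recognised in this sketch.
            counts["comment"] += 1
        else:
            counts["code"] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "// Instructions alloca and PHI will be skipped, define helper indicator\n"
        "unsigned ind = 1;\n"
    )
    print(classify_lines(sample))  # {'code': 1, 'comment': 1, 'blank': 0}
```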
Let’s take another example:
if(ctx && (ctx->flags==CTX_TEST)) return AUTHORITY_KEYID_new();
Here we have one physical line and two logical ones. Obviously, to split the line according to its logical meaning we have to write a parser for the given programming language. It is worth mentioning that writing such a parser is not trivial, because you cannot count on any particular coding style.
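To show how far one can get without a real parser, here is a crude heuristic in Python for C-like code. It is a sketch under my own assumptions, not a proper logical SLOC counter: it counts the control keywords `if`, `for`, `while` and `switch`, plus every semicolon outside parentheses, after dropping `//` comments and string literals. The name `logical_loc` is hypothetical, and constructs such as `do ... while`, character literals or `/* ... */` comments will be miscounted:

```python
import re

# Matches a double-quoted string literal or a '//' comment, whichever starts first.
_STRING_OR_COMMENT = re.compile(r'"(?:\\.|[^"\\\n])*"|//[^\n]*')
_CONTROL_KEYWORDS = re.compile(r"\b(?:if|for|while|switch)\b")


def _blank_out(match: re.Match) -> str:
    # Keep an empty string literal as a placeholder, drop comments entirely.
    return '""' if match.group(0).startswith('"') else ""


def logical_loc(source: str) -> int:
    """Very rough logical line count for C-like code (heuristic, not a parser)."""
    code = _STRING_OR_COMMENT.sub(_blank_out, source)
    count = len(_CONTROL_KEYWORDS.findall(code))
    depth = 0
    for ch in code:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        elif ch == ";" and depth == 0:
            # Semicolons inside a for(...) header are skipped via the depth check.
            count += 1
    return count


if __name__ == "__main__":
    line = "if(ctx && (ctx->flags==CTX_TEST)) return AUTHORITY_KEYID_new();"
    print(logical_loc(line))  # 2: the 'if' plus the 'return' statement
```

Every weakness of this heuristic is an argument for the real parser mentioned above.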
So let us summarise now.
Advantages
- Physical SLOC counting is very easy to implement.
- Physical SLOC is practically independent of the language used.
- It is an intuitive metric. A line of code is something you can see and imagine. Do you need to imagine a program of 1000 physical SLOC? No problem!
- The use of logical LOC (LLOC) may have a very nice side effect as an extra QA control. A few limits can make files easier to read. For instance, the recommended LLOC for a function is <= 500 and for a class <= 1500. So if someone writes a function of 1000 LLOC, the automatic QA control will not let it into the repository.
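The QA control from the last point could look roughly like this in Python. The limits 500 and 1500 come from the text above; everything else is an assumption for illustration: `check_limits` is a hypothetical function, and the LLOC numbers are expected to come from some counter run before the commit:

```python
MAX_FUNCTION_LLOC = 500   # recommended limit for a function
MAX_CLASS_LLOC = 1500     # recommended limit for a class


def check_limits(units):
    """Return violation messages for (kind, name, lloc) tuples.

    kind is "function" or "class"; lloc is a count measured elsewhere.
    """
    limits = {"function": MAX_FUNCTION_LLOC, "class": MAX_CLASS_LLOC}
    violations = []
    for kind, name, lloc in units:
        limit = limits[kind]
        if lloc > limit:
            violations.append(f"{kind} {name}: {lloc} LLOC exceeds limit of {limit}")
    return violations


if __name__ == "__main__":
    measured = [("function", "parse", 1000), ("class", "Parser", 1200)]
    for message in check_limits(measured):
        print(message)  # function parse: 1000 LLOC exceeds limit of 500
```

A commit hook would reject the change whenever the returned list is not empty.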
Disadvantages
- SLOC does not reflect a program's functionality. The same functionality can be implemented in a different number of SLOC, which in turn depends on many factors: the programmer's qualification, the libraries used, etc.
- SLOC does not represent how complex a program is. Even a single line of code can be the result of many sleepless nights, of studying open source code, of searching the Internet, or of anything else; there are many possibilities.
- Logical SLOC depends heavily on the language used. A parser for one language can hardly be reused for another. Writing even a partial syntax analyzer for many languages is not a trivial task.
- Some logical entities cannot be classified as “useful” or “useless”.
- In modern programming environments, wizards that generate code are widespread. Syntactically the generated code is no different from ordinary code, but it carries less meaning and should not really be counted. To count logical SLOC, an extra layer of logic has to be added to distinguish auto-generated files from hand-written ones.
- The SLOC count varies significantly with the language you use. The same program will have a completely different number of SLOC depending on the language. The conclusion: analytical data derived from SLOC for one programming language cannot be extrapolated to any other.
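The extra layer for auto-generated files, mentioned in the list above, might start with something as simple as looking for a marker in the file header. This Python sketch is only an assumption about how such a filter could work: the marker phrases and the name `looks_generated` are mine, and a wizard that leaves no marker will not be detected:

```python
# Hypothetical marker phrases; real generators differ and many leave no marker at all.
GENERATED_MARKERS = ("do not edit", "auto-generated", "autogenerated", "generated by")


def looks_generated(source: str, header_lines: int = 10) -> bool:
    """Guess whether a file is auto-generated from markers in its first lines."""
    header = "\n".join(source.splitlines()[:header_lines]).lower()
    return any(marker in header for marker in GENERATED_MARKERS)


if __name__ == "__main__":
    print(looks_generated("// Code generated by tool. DO NOT EDIT.\nint x;"))  # True
    print(looks_generated("unsigned ind = 1;"))  # False
```

Files flagged this way would simply be excluded before any counting starts.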
My opinion: obviously, the advantages of LOC look pale against the bright background of its disadvantages. I do not understand why such an advanced organisation as NASA still uses this poor system. We do not use it.
Posted by: volodya 11.5.2009 at 05:43