STATS

From jmips


Project metrics summary

Project analysis by [cccc] Feb 29 2012

Metric                                 Tag    Overall   Per Class
Number of classes (modules)            NOM         58
Lines of Code                          LOC       3636      62.690
McCabe's Cyclomatic Number             MVG        728      12.552
Lines of Comment                       COM       2372      40.897
LOC/COM                                L_C      1.533
MVG/COM                                M_C      0.307
Information Flow measure (inclusive)   IF4       1909      32.914
Information Flow measure (visible)     IF4v      1909      32.914
Information Flow measure (concrete)    IF4c         1       0.017

The following explanatory rubric is largely reproduced as-is from the cccc output:

NOM = Number of classes
Number of non-trivial classes identified by the analysis.
LOC = Lines of Code
Number of non-blank, non-comment lines of source code counted by the analysis.
COM = Lines of Comments
Number of lines of comment identified by the analysis.
MVG = McCabe's Cyclomatic Complexity
A measure of the decision complexity of the functions which make up the program. The strict definition of this measure is that it is the number of linearly independent routes through a directed acyclic graph which maps the flow of control of a subprogram. The analysis counts this by recording the number of distinct decision outcomes contained within each function, which yields a good approximation to the formally defined version of the measure.
L_C = Lines of code per line of comment
Indicates density of comments with respect to textual size of program
M_C = Cyclomatic Complexity per line of comment
Indicates density of comments with respect to logical complexity of program
IF4 = Information Flow measure
Measure of information flow between classes as suggested by Henry and Kafura. The analysis makes an approximate count of this by counting inter-module couplings identified in the module interfaces.
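
As a quick cross-check of the derived figures in the summary table, the ratios can be recomputed from the raw totals. This is only a sketch of the arithmetic, not of how cccc computes them internally; the variable names are illustrative.

```python
# Totals taken from the project metrics summary table.
loc = 3636   # LOC: non-blank, non-comment lines of code
com = 2372   # COM: lines of comment
mvg = 728    # MVG: summed cyclomatic number
nom = 58     # NOM: number of classes (modules)

l_c = loc / com            # lines of code per line of comment
m_c = mvg / com            # cyclomatic complexity per line of comment
loc_per_class = loc / nom  # mean lines of code per class
mvg_per_class = mvg / nom  # mean cyclomatic number per class

print(f"L_C           = {l_c:.3f}")
print(f"M_C           = {m_c:.3f}")
print(f"LOC per class = {loc_per_class:.3f}")
print(f"MVG per class = {mvg_per_class:.3f}")
```

The results agree with the table entries to three decimal places.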

Commentary

The measures overall are very good, showing a high comment-to-code ratio (2 lines of comment for every 3 lines of code) and a small number of lines per class (about 63 on average). That is only a little longer than a standard page (53 lines), so nearly all of an average class fits on a page, and an average method certainly does. These averages are intrinsically misleading, however, because of the variance: there are certainly some very long classes and methods, so there must also be some very short ones!

The average displayed here is a simple mean (total lines divided by number of classes), so it is dominated by the longest classes: one class of 100 lines and four classes of 5 lines each work out to an average of 24 lines per class, which doesn't really capture the idea of "most classes are very short, and one is way too long".
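
The 100-line example can be checked directly. The sketch below also prints the median, which is one possible alternative summary for skewed sizes; it is not something the cccc report provides.

```python
from statistics import mean, median

# One class of 100 lines and four classes of 5 lines each.
class_sizes = [100, 5, 5, 5, 5]

print("mean  :", mean(class_sizes))    # total lines / number of classes
print("median:", median(class_sizes))  # size of the middle class when sorted
```

The mean comes out at 24 lines per class while the median is 5, which is closer to the "most classes are very short" picture.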

The complexity measures the number of alternate paths through a program. In general every different path deserves a comment, so the measure says that every path gets about three lines of comment. That is very reasonable, but it is not clear from that alone whether the comments are in the right place. The comments might be all block-header, for example, with nothing commenting sites of difficulty in situ.

The inclusive and visible information flow measures are unusually high, reflecting poor separation between classes. That is to be expected, since the model models a real physical situation in which everything is connected!

COCOMO estimate

The COCOMO estimate is based on 3.7 KLOC, with coefficients C=3.6 and S=1.2 for systems programming code (high complexity), and a factor M made up of three parts: a robustness factor of 1.5 (high reliability), a factor of 1.0 for an "adequate development environment", and an advantage factor of 0.33 resulting from a team rating of 4 (experienced engineer!) on a small project, giving

 M = 1.5 * 1.0 * 0.33 = 0.5

and producing an effort estimate

 SEM = C * KLOC^S * M = 3.6 * 3.7^1.2 * 0.5 = 8.65

Thus COCOMO estimates that the project required 8.65 months of software engineering effort. At, say, a seat cost of 50000 GBP/annum, that is worth about 36000 GBP.

However, the estimate is highly sensitive to factors such as the working tools. If one were to suppose that the development environment were in fact excellent, not adequate, then M would be halved (replace "1.0" by "0.5") and the cost estimate would be halved:

 SEM = 4.325

So the author should have asked for 18000 pounds to break even. Assuming a profit margin of 50% (or a deadline bonus), the asking price should have been 27000 pounds.
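
The arithmetic above can be reproduced with a few lines of code. This is a minimal sketch of the formula SEM = C * KLOC^S * M exactly as quoted in the text, not a full COCOMO implementation; the function and variable names are illustrative, and M is taken as the rounded values 0.5 and 0.25 used above.

```python
def cocomo_effort(kloc, c, s, m):
    """Effort in engineer-months: C * KLOC^S * M, as quoted in the text."""
    return c * kloc ** s * m

C, S = 3.6, 1.2              # systems programming coefficients from the text
KLOC = 3.7                   # project size used in the text
SEAT_COST_PER_YEAR = 50000   # GBP per annum, the figure assumed in the text

def cost_gbp(effort_months):
    """Convert engineer-months to cost at the assumed seat cost."""
    return effort_months / 12 * SEAT_COST_PER_YEAR

adequate = cocomo_effort(KLOC, C, S, m=0.5)    # M = 1.5 * 1.0 * 0.33, rounded to 0.5
excellent = cocomo_effort(KLOC, C, S, m=0.25)  # environment factor 0.5 instead of 1.0

break_even = cost_gbp(excellent)
asking_price = break_even * 1.5                # 50% margin on top of break-even

print(f"adequate environment : {adequate:.2f} months, {cost_gbp(adequate):.0f} GBP")
print(f"excellent environment: {excellent:.2f} months, {break_even:.0f} GBP")
print(f"asking price at 50% margin: {asking_price:.0f} GBP")
```

The results match the figures in the text to within rounding, and the sketch makes it easy to see how strongly the estimate depends on the choice of M.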

In fact, the first code entry (Debug.java!) for jMIPS 1.0 appears to have been February 24 2010, and there are entries going up to May 29 2010 (CPU5.java) for it. That is three months. At that time the project consisted of 5218 LOC and 3443 lines of comments. The code has shrunk since then, by dint of hard work: duplications have been removed.

Had the COCOMO estimate of about 4.3 months been used in the bidding, the project would have been completed 1.3 months ahead of time. The "seat costs" are the costs of keeping authors alive (paying salary) plus running overheads, which include things like office rent and power. 50000 pounds per annum is on the low side for that nowadays.

We may repeat the calculation with the older figures for lines of code, supposing that at that time the code was of lesser quality, with a robustness factor of "1.0" instead of "1.5", and again taking the development environment factor as "0.5". That gives for 5.2 KLOC:

 M = 1.0 * 0.5 * 0.33 = 0.166

and

 SEM = C * KLOC^S * M = 3.6 * 5.2^1.2 * 0.166 = 4.32

By May 16 2010, all the project documentation (simulator[1-5].html, README, etc) had also been written. It looks to have been as complete as it should have been.

So the estimate is the same whichever way one cuts it. It should have taken something more than four months to develop, and it was developed in three. If it had been costed according to COCOMO, a profit would have been made.


Measures per class

This is the analysis per class. The analysis software incorrectly included certain system classes (PrintStream, Exception, RandomAccessFile, etc) which have been left out here but which may have skewed the analysis figures overall.

It is also clear that there are other miscounts. "Read" is measured at zero lines! But then there are multiple interior classes named "Read", one in each "Pipe", of which there are also multiples, one for each CPU. The analysis software has become confused. I have removed the "zero lines" classes from the table, for clarity.

Certain classes which had no real existence (e.g.: "Shdr[]") were also included by the analysis, and I have removed those from the table.

You may wish to redo the analysis one model at a time.

Module Name           LOC  MVG  COM     L_C     M_C
ALU                   193   86   44   4.386   1.955
AluReturnT             14    0   23  ------  ------
CPU1                  275   65  225   1.222   0.289
CPU2                   26    0   25   1.040  ------
CPU3                   29    0   26   1.115  ------
CPU4                   29    0   28   1.036  ------
CPU5                   91   16   46   1.978   0.348
Cache                 113   22  106   1.066   0.208
Clock                  95   16   98   0.969   0.163
Cmdline               124   44   50   2.480   0.880
Console                41    6   21   1.952   0.286
Console5               45    6   20   2.250   0.300
Cpu1                   69    9   47   1.468   0.191
Cpu2                   69    9   45   1.533   0.200
Cpu3                   69    9   45   1.533   0.200
Cpu4                   72    9   47   1.532   0.191
Cpu5                   79    9   48   1.646   0.187
Debug                 528  228  230   2.296   0.991
DecodeReturnT          41    0   54   0.759  ------
Depend                 87   37   32   2.719   1.156
DependReturnT           3    0    8  ------  ------
Elf                   384   79  136   2.824   0.581
Globals               258    0  179   1.441  ------
IOBus                   1    0    3  ------  ------
IOUnit                  2    0   12  ------  ------
InstructionRegister   140    0   58   2.414  ------
Keyboard               63    7   49   1.286   0.143
Memory                  2    0   15  ------  ------
OptionT                13    0   13  ------  ------
Page                    5    0   17  ------  ------
Phdr                   23    1   22   1.045  ------
Pipe                   53    0   84   0.631  ------
Port                    8    0    5  ------  ------
RegReturnT             12    0   17  ------  ------
Region                 23    2   10   2.300  ------
Register               23    4   25   0.920  ------
RegisterUnit           72    9   77   0.935   0.117
Screen                 58    5   67   0.866   0.075
Shdr                   30    1   43   0.698  ------
SignExtend              6    0   23  ------  ------
StopClock               9    0   15  ------  ------
Utility               197   19  176   1.119   0.108


The analysis marks the Debug class as too long and too complex! That is not an issue - it does not affect the functioning of the software. The complexity is in its printout routines. On the other hand, one of the few bugs reported traces back to incorrect Debug class use, so perhaps the analysis is correct!
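
To see how much Debug dominates, a few rows of the per-class table can be ranked by cyclomatic number. The sketch below uses only a hand-picked subset of rows (name, LOC, MVG, COM) copied from the table, plus the project-wide MVG total of 728 from the summary; it does not reproduce whatever thresholds cccc itself applies.

```python
# A subset of rows from the per-class table: name, LOC, MVG, COM.
rows = """\
ALU 193 86 44
CPU1 275 65 225
Cmdline 124 44 50
Debug 528 228 230
Elf 384 79 136
"""

TOTAL_MVG = 728  # project-wide cyclomatic number from the summary table

classes = []
for line in rows.splitlines():
    name, loc, mvg, com = line.split()
    classes.append((name, int(loc), int(mvg), int(com)))

# Rank classes by cyclomatic number, most complex first.
by_mvg = sorted(classes, key=lambda row: row[2], reverse=True)

for name, loc, mvg, com in by_mvg:
    share = mvg / TOTAL_MVG
    print(f"{name:10s} LOC={loc:4d} MVG={mvg:4d} share of total MVG={share:.0%}")
```

On these figures Debug alone accounts for nearly a third of the project's total cyclomatic number.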

Delta source analysis

The following graph shows numbers of lines changed in the experimental (v1.8) branch over a one-month period.

Image:Changes.png

Contributor "ptb" has about 5000 lines of changes in 600 hours, or about 8.3 lines changed per hour, which works out as 200 lines of changes every day. These are mostly not new lines, however, but fixes for old lines.

Contributor "fse6969" makes about 1 line change per hour, or 24 lines changed per day. This is about the expected level for continuous (perfective) maintenance activity.
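
The per-hour and per-day rates quoted for the contributors follow from simple division. A small sketch, assuming rates are averaged over elapsed (24-hour) days as the text does; the helper name is illustrative.

```python
def change_rate(lines_changed, hours):
    """Return (lines per hour, lines per 24-hour day) for a span of elapsed hours."""
    per_hour = lines_changed / hours
    return per_hour, per_hour * 24

# "ptb": about 5000 lines of changes in 600 hours.
ptb_per_hour, ptb_per_day = change_rate(5000, 600)
print(f"ptb    : {ptb_per_hour:.1f} lines/hour, {ptb_per_day:.0f} lines/day")

# "fse6969": the text gives only the rate, about 1 line per hour.
fse_per_day = 1 * 24
print(f"fse6969: 1 line/hour, {fse_per_day} lines/day")
```

Since 600 hours is 25 elapsed days, 5000 lines of changes comes to 200 per day.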

Contributions from other contributors have been sporadic in the period. A new file contribution from "fse0724" is good development activity, but not followed up with maintenance engineering beyond an initial first adjustment.

A similar graph for contributions to the project wiki is shown below:

Image:Wikichanges.png