Sunday, February 11, 2007

Update: % of "GPL 2 only" in Linux

It has been brought to my attention that something similar to the test I reported on in my last post has already been carried out. Turns out that Linus himself did such a thing (scroll down to the bottom).

Comparing the two, we see that Linus estimates 33% of the kernel being "GPL 2 or above", while I estimated 40%. It is nice that the two estimates are fairly close; I think we can assume that they are not far off from the truth.

Linus' estimate is smaller, however, probably because of his methodology:
[torvalds@g5 linux]$ git-ls-files '*.c' | wc -l
7978
[torvalds@g5 linux]$ git grep -l "any later version" '*.c' | wc -l
2720
As you can see, Linus probably wasted around 5 seconds of his time doing this, while I, on the other hand, wasted around 2-3 hours. However, my estimate should be more accurate, since I take into account cases such as "any later version" being split across a line break (or at least some of the ways in which that can occur). This may explain why Linus' estimate is smaller: his search misses a few files. But not very many, it seems.
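To illustrate the line-break issue, here is a minimal sketch in Python. It is not the script I actually used; the function name and the set of comment characters it strips are assumptions chosen for the example. The idea is to flatten whitespace and comment decoration before searching, so that a phrase wrapped inside a C block comment still matches.

```python
import re

# Characters that can sit between the two halves of a wrapped phrase:
# whitespace plus common comment leaders such as the " * " that starts
# each line of a C block comment. (Assumed set, for illustration only.)
_DECORATION = re.compile(r"[\s*/#]+")


def mentions_any_later_version(text: str) -> bool:
    """Return True if 'any later version' appears, even across a line break."""
    # Collapse every run of whitespace/comment leaders into a single space,
    # so "any later\n * version" becomes "any later version".
    flattened = _DECORATION.sub(" ", text).lower()
    return "any later version" in flattened


wrapped = (
    " * either version 2 of the License, or (at your option) any later\n"
    " * version.\n"
)

# A line-by-line search, which is what grep does, misses the wrapped phrase.
line_based_hit = any("any later version" in line for line in wrapped.splitlines())

# The flattened search finds it.
flattened_hit = mentions_any_later_version(wrapped)
```

A plain line-by-line search reports no match for `wrapped`, while the flattened search does. That is the kind of file a per-line grep would undercount.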

Another difference is that Linus counts the number of files, while I sum the file sizes. You can argue either way which is more 'correct' (I should probably have calculated both, but having already wasted some 20 times more of my life than Linus, I think I've wasted enough already).
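The difference between the two measures can be sketched as follows. This is only an illustration under assumed inputs: the function name and the sample file sizes are made up, and the real numbers would come from scanning the kernel tree.

```python
from typing import Iterable, Tuple


def or_later_share(files: Iterable[Tuple[int, bool]]) -> Tuple[float, float]:
    """Return (percent by file count, percent by total bytes).

    Each item is (size_in_bytes, has_or_later_clause).
    """
    files = list(files)
    total_count = len(files)
    total_bytes = sum(size for size, _ in files)
    hit_count = sum(1 for _, hit in files if hit)
    hit_bytes = sum(size for size, hit in files if hit)
    # Guard against an empty file list to avoid dividing by zero.
    by_count = 100.0 * hit_count / total_count if total_count else 0.0
    by_bytes = 100.0 * hit_bytes / total_bytes if total_bytes else 0.0
    return by_count, by_bytes


# Hypothetical sample: two small "or later" files, two larger files without it.
sample = [(1000, True), (3000, False), (4000, False), (2000, True)]
by_count, by_bytes = or_later_share(sample)
```

With this made-up sample, half of the files carry the clause but they account for less than a third of the bytes, so the two methods can legitimately disagree on the same tree.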

More important is that I report on other categories, like the files that say only "GPL" but don't specify a version, which is interesting information, I think.

Anyhow, it was interesting to see this.

3 Comments:

At 9:22 AM , Blogger Jack said...

Dude, can you please release your code for this, for others who are interested in looking at your methodology?

 
At 12:06 PM , Blogger kripken said...

jack,

ok, I added a link in the previous post to the source code.

 
At 5:43 PM , Blogger butlimous said...

Thanks for the nice post!


 
