Thursday, December 15, 2011

How big is the core Sage library?

I just did the following with Sage-4.8.alpha5:
  1. "sudo apt-get install sloccount".
  2. "cp -rv SAGE_ROOT/devel/sage-main /tmp/x"
  3. Ran a script [1] to rename all .pyx and .pxi files to .py.
  4. Ran "sloccount *" in the /tmp/x directory; sloccount ignores the autogenerated .c/.cpp files coming from Cython.

Here's the result for the full Sage library. It does not distinguish between Python and Cython, since the .pyx and .pxi files were renamed to .py. Note that sloccount counts only lines of code -- comments and blank lines are ignored.

Totals grouped by language (dominant language first):
python:      530370 (96.41%)
ansic:        14538 (2.64%)
cpp:           5188 (0.94%)

This suggests that the core Sage library is just over "half a million lines of Python and Cython source code, not counting comments and whitespace".
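
As a quick sanity check on that figure, the percentages can be re-derived from the raw counts. This is only a sketch of the arithmetic: the three numbers are copied from the sloccount output above, and the variable names are my own.

```python
# SLOC totals by language, copied from the sloccount output above.
totals = {"python": 530370, "ansic": 14538, "cpp": 5188}

grand_total = sum(totals.values())

# Share of each language, as a percentage rounded to two decimals.
percentages = {lang: round(100.0 * n / grand_total, 2)
               for lang, n in totals.items()}

print(grand_total)
for lang, n in totals.items():
    print("%-8s %7d (%.2f%%)" % (lang, n, percentages[lang]))
```

The grand total comes to 550096 lines, of which the Python/Cython share is 530370, so "just over half a million" holds for either number.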

Here's the breakdown by module:
SLOC    Directory       SLOC-by-Language (Sorted)
88903   rings           python=87720,cpp=1183
72913   combinat        python=71629,cpp=1284
47747   schemes         python=46255,cpp=1492
39815   graphs          python=28377,ansic=11438
31540   matrix          python=31540
31019   modular         python=31012,ansic=7
24475   libs            python=21171,ansic=2845,cpp=459
20517   misc            python=20383,ansic=134
18006   interfaces      python=18006
17577   geometry        python=16936,cpp=641
12775   categories      python=12775
12093   server          python=12093
11971   groups          python=11971
11961   plot            python=11961
10686   crypto          python=10686
9920    modules         python=9920
8389    symbolic        python=8260,cpp=129
8150    algebras        python=8150
7260    ext             python=7198,ansic=62
7093    structure       python=7093
6364    coding          python=6364
5670    functions       python=5670
5249    homology        python=5249
4798    numerical       python=4798
4323    quadratic_forms python=4323
3919    gsl             python=3919
3911    calculus        python=3911
3879    sandpiles       python=3879
3003    sets            python=3003
2647    databases       python=2647
2074    logic           python=2074
1736    finance         python=1736
1608    games           python=1608
1465    monoids         python=1465
1435    tests           python=1383,ansic=52
1370    stats           python=1370
971     interacts       python=971
959     tensor          python=959
906     lfunctions      python=906
308     parallel        python=308
275     probability     python=275
219     media           python=219
197     top_dir         python=197
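
If you want to work with this table programmatically, each row is easy to take apart. The following is a minimal sketch that assumes the three-column layout shown above (total, directory, comma-separated language=count pairs); parse_sloccount_row is a name I made up, not part of sloccount.

```python
def parse_sloccount_row(line):
    """Split one table row into (total, directory, {language: sloc})."""
    fields = line.split()
    total, directory, by_language = int(fields[0]), fields[1], fields[2]
    counts = {}
    for item in by_language.split(","):
        lang, n = item.split("=")
        counts[lang] = int(n)
    return total, directory, counts


# A few rows copied verbatim from the table above.
rows = [
    "39815   graphs          python=28377,ansic=11438",
    "24475   libs            python=21171,ansic=2845,cpp=459",
    "31540   matrix          python=31540",
]

parsed = [parse_sloccount_row(row) for row in rows]
for total, directory, counts in parsed:
    # Each row's total should equal the sum of its per-language counts.
    assert total == sum(counts.values()), directory
```

Running the same check over all 43 rows is a cheap way to confirm that nothing was mangled when pasting the table; the first column should add up to the same total as the language summary.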

Here is the script [1]:
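#!/usr/bin/env python
# Rename every Cython source file (.pyx, .pxi) under the current directory
# to .py so that sloccount counts it as Python.

import os, shutil

for dirpath, dirnames, filenames in os.walk('.'):
    for f in filenames:
        if f.endswith(('.pyx', '.pxi')):
            print(f)  # show each file as it is renamed
            # Keep the base name and swap the extension for .py.
            shutil.move(os.path.join(dirpath, f),
                        os.path.join(dirpath, os.path.splitext(f)[0] + '.py'))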
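
One thing the script does not check is whether two files in the same directory end up with the same name after renaming, for example foo.pyx and foo.pxi both becoming foo.py. On a typical Linux system the later move would replace the earlier file, which would lower the count. I have not verified how often this happens in sage-main. The sketch below only lists such cases from a list of paths; find_collisions and walk_paths are hypothetical helper names, not part of the script above.

```python
import os
from collections import defaultdict


def rename_target(path):
    """Return the .py path that a .pyx/.pxi file would be renamed to."""
    return os.path.splitext(path)[0] + ".py"


def find_collisions(paths):
    """Map each target .py path to its sources when several files land on it."""
    sources = defaultdict(list)
    for path in paths:
        if path.endswith((".pyx", ".pxi")):
            sources[rename_target(path)].append(path)
        elif path.endswith(".py"):
            # An existing .py file also occupies its own name.
            sources[path].append(path)
    return {target: srcs for target, srcs in sources.items() if len(srcs) > 1}


def walk_paths(root):
    """Yield every file path under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        for f in filenames:
            yield os.path.join(dirpath, f)
```

To use it, call find_collisions(walk_paths('.')) in the copied tree before running the rename script; an empty result means no file would be overwritten.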