Tuesday, October 26, 2010

Sage as a Python Library?

When I started Sage I viewed it as a distribution of a bunch of math software, and Python as just the interpreter language I happened to be using at the time. I didn't even know if using Python as the language would last. However, it's also possible to think of Sage as a Python library.

Anyway, it has occurred to me (a few times, and again recently) that it would be possible to make much of the Sage distribution, without Python of course, into a Python library. What I mean is the following. You would have a big Python library called "sagemath", say, and inside of that would be a huge HG repository. In that repository, one would check in the source code for many of the standard Sage spkgs, e.g., GAP, PARI, etc. When you type

python setup.py install

then GAP, PARI, etc., would all get built, controlled by some Python scripts, and then installed as package_data in the sagemath directory of /site-packages/.
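
As a rough sketch of the "controlled by some Python scripts" part, the code below plans and runs a build for each bundled component. It is only an illustration: the COMPONENTS list, the directory layout, and the function names are hypothetical, and it assumes every component builds with a conventional configure/make/install sequence, which is not true of every package.

```python
import subprocess
from pathlib import Path

# Hypothetical component list: the names come from the post, the layout is assumed.
COMPONENTS = ["pari", "gap"]


def build_commands(source_root, prefix, components=COMPONENTS):
    """Return (working_dir, argv) pairs for a configure/make/install build."""
    steps = []
    for name in components:
        src = Path(source_root) / name
        # Assumption: each component accepts a --prefix style configure step.
        steps.append((src, ["./configure", f"--prefix={prefix}"]))
        steps.append((src, ["make"]))
        steps.append((src, ["make", "install"]))
    return steps


def build_all(source_root, prefix, run=subprocess.check_call):
    """Run every planned step in order.

    `run` is injectable so the plan can be inspected without building anything.
    """
    for cwd, argv in build_commands(source_root, prefix):
        run(argv, cwd=str(cwd))
```

A setup.py could call something like build_all() from a custom build step, pointing the prefix at a directory inside the package so that the results get picked up as package_data.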

From a technical perspective, I don't see any reason why this couldn't be made to work. HG can handle this much data, and "python setup.py install" can run arbitrary Python code, so it can drive whatever build steps are needed. It does lead to a very different way of looking at Sage though, and it could help untangle things in interesting ways:

(1) Have a Python library called "sagecore", which is just the most important standard spkg's (e.g., Singular, PARI, etc.), perhaps eventually built *only* as shared object libraries (no standalone interpreters).

(2) Have a Python library which is the current Sage library (we already have this), and which can be installed assuming sagecore is installed.

(3) Have other Python libraries (like psage: http://code.google.com/p/purplesage/source/browse/), which depend on (2). Maybe a lot of the "sage-combinat" code could also be moved to such a library, so they can escape the "combinat patch queue" madness. Maybe many other research groups in algebraic topology, differential geometry, special functions, etc., will start developing such libraries... on their own, and share them with the community (but without having to deal directly with the sage project until they want to).

To emphasize (3), when people want to write a lot of mathematics code in some area, e.g., differential geometry, they would just make a new library that depends on Sage (the library in (2)). We would do the work needed to make it easy for people to write code outside of the Sage library that depends on Sage. Writing Cython code this way is especially difficult and confusing, and we don't explain how to do it anywhere in the Sage documentation. It actually took me quite a while to figure out how to do it today (with psage).
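
For the pure Python case, such an outside library could declare its dependency roughly as follows. This is a sketch under assumptions: "sagemath" is the placeholder name used earlier in this post, "diffgeom_extras" is a made-up library name, and the helper assumes the distribution name matches the importable module name. It does not address the Cython case at all.

```python
import importlib.util

# Hypothetical metadata for a downstream research library.
# "sagemath" is the placeholder name from this post; "diffgeom_extras" is invented.
SETUP_KWARGS = {
    "name": "diffgeom_extras",
    "version": "0.1",
    "packages": ["diffgeom_extras"],
    "install_requires": ["sagemath"],
}


def missing_dependencies(requirements):
    """Return the requirements whose top-level module cannot be found.

    Assumes each requirement name is also its importable module name.
    """
    return [name for name in requirements
            if importlib.util.find_spec(name) is None]


# A real setup.py would finish with:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```

Calling missing_dependencies(SETUP_KWARGS["install_requires"]) before building would let the library fail early with a clear message when the Sage library is not installed.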

The core Sage library (2) above would continue to have a higher and higher level of code review, tough referee process etc. However, the development models for (3) would be up to the authors of those libraries.

The above is already how the ecosystems around Python (http://pypi.python.org/pypi), Perl (http://www.cpan.org/), R, etc., work. Fortunately, Python already has reasonably good support for this.

I think without a shift in this direction, Sage is going to be very frustrating for people writing research-oriented code.

Fortunately, it's possible to do everything I'm describing above without disturbing the mainline Sage project itself, at least for now.


  1. Especially given the popular and powerful programs you mentioned near the end of your post, I think this is a great step toward modularizing the Sage infrastructure and, thus, simplifying the installation, usage, and development processes.

    I admit that I was worried when you announced psage. Is this new package a reflection of how complicated Sage has become? Will it become a fork in the future and, when it gets too big itself, will psage spawn a fork of its own?

    With the structure you describe in your post it seems like a project like psage has a more "natural" setting in the Sage ecosystem and I agree that it would seem easier for one to develop a new Sage "package".

    I'm interested to see where your ideas for Sage go.

  2. This sounds like a great idea! I think that it takes sage in a really interesting direction - modularity is always good.

  3. Very exciting.

    I keep complaining about how hard it is to use state-of-the-art research that is outside one's field of interest. Having all the power of sage an import away would certainly make it much easier for many scientists bound to other environments to leverage state-of-the-art mathematical algorithms.

  4. Another advantage is that the librarized code starts being usable from "normal" python apps (at least those which run without very tight memory constraints).

    Still, not an easy project...

  5. Turning sage or parts of sage into a python module is an exciting idea. It would make it much easier to create sage-powered webapps using your web framework of choice, be it Django, web2py, Flask or something else.

  6. The very fact that it is not package-like has been a bit of a letdown for me.

    It would open up a lot of possibilities if one could use it like any other python package to supplement one's own programs.

    Also, would this kind of approach alleviate the pains of a Windows port? Then again, I suppose the "package sage" would have the same pre-compiling requirements that make the native Windows port a near impossibility.