Sage: Open Source Mathematics Software

Sunday, November 23, 2008

Magma and Sage

I spent the weekend working on making Sage and Magma talk to each other more robustly. Getting different math software systems to talk to each other is a problem that the OpenMath project has tried to tackle since the 1990s, without success. Sage has out of necessity made real (rather than theoretical) progress on this problem over the years, and what I did this weekend was a little step in the right direction for Sage.

First, I designed with Michael Abshoff a new feature for our testing framework, so we can test only optional doctests that depend on a certain component or program being present. Without a usable, efficient, and flexible testing system it is impossible to develop good code, so we had to do this. Next, I worked on fixing the numerous issues with the current Sage/Magma interface, as evidenced by many existing doctests failing. It was amusing because some of the doctests had clearly never ever succeeded, e.g., things like


 sage: magma.eval('2')       # optional
 sage: other stuff

was in the tree, where the output was simply missing.
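To make the idea of "test only the optional doctests that depend on a certain component" concrete, here is a minimal sketch in plain Python. It is not the actual Sage testing framework: the tag format `# optional - magma` and the helper name `select_optional` are assumptions made purely for illustration.

```python
import re

# Hypothetical tag format: "# optional" or "# optional - <component>".
OPTIONAL_TAG = re.compile(r"#\s*optional(?:\s*-\s*(?P<component>\w+))?")

def select_optional(lines, component):
    """Return the doctest lines tagged as optional for `component`."""
    selected = []
    for line in lines:
        match = OPTIONAL_TAG.search(line)
        # Untagged lines and lines tagged for other components are skipped.
        if match and match.group("component") == component:
            selected.append(line.strip())
    return selected

doctests = [
    "sage: magma.eval('2')       # optional - magma",
    "sage: maple('2+2')          # optional - maple",
    "sage: 2 + 2",
]
magma_only = select_optional(doctests, "magma")
```

With a filter like this, a machine that has Magma installed can run just the Magma-dependent tests, which is how tests with missing output would get noticed.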

Anyway, in fixing some of the much more interesting issues, for example, things like this that involve nested polynomial rings, I guess I came to understand better some of the subtleties of getting math software to talk with other math software.


sage: R.<x,y> = QQ[]; S.<z,w> = R[]; magma(x+z)
boom

The first important point is that one often thinks the problem of interfacing between systems is this: given an object X in one system (say Sage), find a string s that evaluates in another system (say Magma) to something that "means the same thing" as X. This is the problem that OpenMath attempts to solve (via XML and content dictionaries), but it is not quite the right problem. Instead, given a particular mathematical software system (e.g., Magma) in a particular state, and a view of that state by another system (e.g., Sage), the problem is to come up with a string that evaluates to the "twin image" of X in the running Magma system.

To do this right involves careful use of caching. Unless X is an atomic element (e.g., a simple thing like an integer) it's important to cache the version of X in Magma as an attribute of X itself. Let's take an example where this caching is very important and subtle. Consider our example above, which has the following Sage code as setup.


sage: R.<x,y> = QQ[]
sage: S.<z,w> = R[]

This creates the nested polynomial ring (QQ[x,y])[z,w]. The new code in sage-3.2.1 (see #4601) does the following to convert x + z to a particular Magma session. Note that the steps to convert x+z to Magma depend on the state of a particular Magma session! Anyway, Sage first gets the Magma version of S, then asks for the generator names of that object in the given Magma session. These are definitely not z,w:


sage: m = magma(S)
sage: m.gen_names()
('_sage_[4]', '_sage_[5]')

The key point is the strings returned by the gen_names command are strings that are valid in Magma and evaluate to each of the generators we're after. They depend on time -- if you did stuff in the interface earlier you would get back different numbers (not 4 and 5). Note that it's very important that the Python objects in Sage that _sage_[4] and _sage_[5] point to do not get garbage collected, since if they do then _sage_[4] and _sage_[5] also become invalid, which is not good. So it's important that the magma version (m above) of S is cached.
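Here is a toy model in plain Python of why the cached Magma version has to stay alive. It does not talk to Magma and is not how the Sage interface is implemented; `ToySession`, `Handle`, and `ToyRing` are hypothetical names, and the only thing borrowed from the real interface is the `_sage_[n]` naming pattern.

```python
import gc
import weakref

class ToySession:
    """Stand-in for one running Magma process (hypothetical, no real Magma)."""
    def __init__(self):
        self.slots = {}      # slot number -> expression stored in that slot
        self._next = 1

    def store(self, expr):
        n = self._next
        self._next += 1
        self.slots[n] = expr
        return n

class Handle:
    """Sage-side object; its slot stays valid only while the handle is alive."""
    def __init__(self, session, expr):
        self.slot = session.store(expr)
        # Free the slot in the session once this handle is garbage collected.
        weakref.finalize(self, session.slots.pop, self.slot, None)

    def name(self):
        return "_sage_[%d]" % self.slot

class ToyRing:
    """Caches one handle per session as an attribute of the ring itself."""
    def __init__(self, description):
        self.description = description
        self._handles = weakref.WeakKeyDictionary()  # session -> Handle

    def in_session(self, session):
        if session not in self._handles:
            self._handles[session] = Handle(session, self.description)
        return self._handles[session]

session = ToySession()
S = ToyRing("PolynomialRing(R, 2)")
first = S.in_session(session).name()
second = S.in_session(session).name()   # served from the cache, same slot

uncached = Handle(session, "temporary")  # nothing keeps this alive after del
slot = uncached.slot
del uncached
gc.collect()
```

In this sketch the cached handle keeps its slot for as long as `S` exists, while the slot of the uncached handle disappears from `session.slots` as soon as the handle is collected, so any string that still mentioned it would be invalid.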

Next Sage gets the Magma string version of each of the coefficients of the polynomial x+z (over the base ring R) using a similar process. It all works very well without memory leaks, but only because of careful tracking of state and caching. And the resulting string expression involves the _sage_[...]'s.


sage: (x+z)._magma_init_(magma)
'((1/1)*1)*_sage_[4]+((1/1)*_sage_[7])*1'
sage: magma(x+z)
z + x

Notice that _magma_init_ -- the function that produces the string that evaluates to something equal to x+z in Magma -- now takes as input a particular Magma session (there can be dozens of these in a given Sage session, with different Magmas running on different computers all over the world). This is a change to _magma_init_ that makes the implementation of what's described above easy. It's an API change that might also be carried over to many of the other interfaces (?).
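The following plain-Python sketch shows why passing the session into `_magma_init_` matters: the same object produces different strings in different sessions. Everything here except the method name `_magma_init_` and the `_sage_[n]` pattern is hypothetical (`ToySession`, `ToyRing`, `ToyLinearPoly`, and the `first_slot` parameter are invented for illustration), and the polynomial is restricted to linear forms with non-negative integer coefficients to keep it short.

```python
class ToySession:
    """Hypothetical stand-in for a Magma session; hands out fresh slot names."""
    def __init__(self, first_slot):
        self._next = first_slot

    def new_name(self):
        name = "_sage_[%d]" % self._next
        self._next += 1
        return name

class ToyRing:
    def __init__(self, ngens):
        self.ngens = ngens
        self._gen_names = {}          # session -> tuple of generator names

    def gen_names(self, session):
        # Cache per session: the names are only meaningful inside that session.
        if session not in self._gen_names:
            self._gen_names[session] = tuple(
                session.new_name() for _ in range(self.ngens))
        return self._gen_names[session]

class ToyLinearPoly:
    """A linear form sum(c_i * gen_i) with non-negative integer coefficients."""
    def __init__(self, ring, coeffs):
        self.ring = ring
        self.coeffs = coeffs

    def _magma_init_(self, session):
        names = self.ring.gen_names(session)
        terms = ["%d*%s" % (c, g) for c, g in zip(self.coeffs, names) if c != 0]
        return "+".join(terms) if terms else "0"

S = ToyRing(2)
f = ToyLinearPoly(S, [1, 3])
busy = ToySession(first_slot=4)     # a session that has already used slots 1-3
fresh = ToySession(first_slot=1)
in_busy = f._magma_init_(busy)
in_fresh = f._magma_init_(fresh)
```

Without the session argument the method would have no way to know which generator names are valid, which is the motivation for the API change.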

Thursday, November 20, 2008

Sage-3.2 and Mathematica 7.0

We just released sage-3.2! W00t! See for the tickets closed in this release.

There's been a lot of hyperbole due to Mathematica 7.0's recent release. A colleague of mine got a personal email from Stephen Wolfram himself, asking him to try out Mathematica 7.0, and instead my colleague forwarded the message to me and remarked that it was too late, since he had switched to Sage.

I looked over the Mathematica 7.0 release notes... and noticed that they added support for computing with Dirichlet characters. I implemented the code in Magma and Sage, and wrote a chapter in my modular forms book about computing with Dirichlet characters. So I followed the "what's new" to this Mathematica page about their new functionality for Dirichlet characters. It's sad. They give no way of specifying a character, except to give the "ith character", which is meaningless and random (and they say so) -- that's like giving a matrix over a finite field at random. All they give is a function to evaluate characters at numbers -- they don't give functions for arithmetic with them, or computing invariants such as the conductor, which is where all the real fun comes in. Boggle. Sage is light years ahead of Mathematica here.
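To show what "arithmetic with characters" and "computing the conductor" mean, here is a small plain-Python sketch. It is not Sage's or Magma's implementation: the helper names are made up, characters are stored as plain dictionaries from units to values, and only characters with values +1 and -1 are used to avoid complex arithmetic.

```python
from math import gcd

def units(n):
    """Residues in 1..n that are coprime to n (for n = 1 this is [1])."""
    return [a for a in range(1, n + 1) if gcd(a, n) == 1]

def multiply(chi, psi):
    """Pointwise product of two characters with the same modulus."""
    return {a: chi[a] * psi[a] for a in chi}

def conductor(chi, modulus):
    """Smallest d dividing the modulus with chi(a) = 1 for all units a = 1 mod d."""
    for d in range(1, modulus + 1):
        if modulus % d == 0 and all(
                value == 1 for a, value in chi.items() if a % d == 1 % d):
            return d

# The quadratic character mod 3, lifted to modulus 12.
chi = {a: (1 if a % 3 == 1 else -1) for a in units(12)}
# The nontrivial character mod 4, also lifted to modulus 12.
psi = {a: (1 if a % 4 == 1 else -1) for a in units(12)}
product = multiply(chi, psi)
```

Here `chi` and `psi` both have modulus 12 but conductors 3 and 4, and their product has conductor 12; none of that is visible if all you can do is evaluate the "ith character" at a number.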

The Mathematica release notes also brag about finally having something for finite groups, but again it is very minimal compared to what Sage provides (via GAP). Basically all they have is a bunch of tables of groups, but no real algorithms or functionality. The whole approach seems all backwards -- first one should implement algorithms for computing with groups, then use them to make tables in order to really robustify the algorithms, then compare those tables to existing tables, etc. I wonder whether the group theory data in Mathematica was computed using GAP or Magma?

Wednesday, November 12, 2008

I'm back from Sage Days 11 (UT Austin), which was very intense as are all Sage Days. It was a great workshop, and kickstarted many things, I hope. One very obvious big plus that came out of it was Craig Citro and Gonzalo Tornaria's work on massively optimizing the Cython-related dependency checking code in setup.py. The current implementation was some crappy thing I literally did in an hour, which has really bogged down as the Sage core library has gotten massive. Also, there's now a lot of momentum behind implementing Dokchitser's L-functions algorithm natively in Sage, which I'm greatly looking forward to -- Sourav San Gupta did much work on this at Sage Days 11, and has done a lot since too (in addition to his heavy load of grad courses). Mike Rubinstein and Rishi are wrapping Rubinstein's Lcalc using Cython as well, so Sage will soon go from the best system for computing with L-functions to even better!!

Yesterday I thought a bunch about the Sage/Magma interface and wrote several demos and test code. I'm still not 100% sure about how I want to do this -- there are numerous interesting subtleties with Magma. For example, if you create QQ[x,y] in Magma, then create QQ[x,y][z,q], the variables x,y from the first ring will *not* play nicely with the variables x,y in the second ring, which is surprising, since it is different than what happens with Sage. Anyway, this and many other problems are solvable and I'll be working on this again a lot tomorrow.

Thursday, November 6, 2008

Sage Days 11 in Austin Texas is tomorrow

I'm in Seattle, it's 6pm, and Sage Days 11 is in Austin, Texas tomorrow. I'm very excited, and I'm flying there overnight. On the one hand, a red eye might make me "totally exhausted" for the conference tomorrow. On the other hand, I slept an average of 2-3 hours per night for a week during Sage Days 8 last time I was in Austin, and I'm reasonably caught up on sleep.

My main goal for the workshop is to continue the work of Tim Dokchitser and Jennifer Balakrishnan to create a native implementation of Dokchitser's algorithm for computing L(f,s). Having this is suddenly incredibly important to my number theory research, so I'm finally motivated to want to get it into Sage.

I'll post about the workshop in my blog here.

Wednesday, October 1, 2008

Why I like Sage

Today's blog post is really from Jason Grout. It's why he likes Sage. This is from an email he sent out today, which I liked reading:

"It depends on the area, so you'll have to give me an area to get a more specific answer. In general, Sage has somewhat weaker general symbolic capabilities (i.e., integrals, etc.) than mathematica or maple (though usually this does not seem to be a problem in undergraduate-level problems). It has *much* stronger number theory functionality. Things are object-oriented in Sage and Sage understands mathematical structures and how they relate (using category theory). For example, Sage knows what a vector space is, what a finite field is, etc. You can actually create a finite field or an extension of the rationals and ask questions about it. You can create a polynomial ring over a field and then just work with it.

Sage is also generally faster than either Mathematica or Maple, in my experience.

The web interface to Sage is a huge plus to Sage over Mathematica and Maple. Of course, being free and open-source is something that is unmatched in either Mathematica or Maple; that is a very important point that is sometimes overlooked. You can literally see what is going on inside of Sage, where you have to guess what is happening in Mathematica or Maple.

One reason that Sage was chosen for an AIM workshop on helping undergraduate research was that the participants didn't have a common computational system (i.e., some had access to Mathematica, some had access to Maple, some had access to neither). They could use Sage because it was free, whereas it would have been problematic to insist that every person somehow acquire access to a specific piece of commercial software. Related to this, I had a student complain on my course evaluations about me using Mathematica in class because it is hard for our students here to have access to Mathematica, and they would have to pay in order to use it at home, etc.

If you are teaching future secondary ed teachers, then they most likely will not have access to Maple or Mathematica when they are teaching high school because of the cost. However, they *will* have access to Sage, so using Sage directly benefits their future students because whatever they learn can be used in their high school classes.

Another huge plus to Sage, in my eyes, is that it is based on one of the most prevalent and easiest-to-use computer languages around, Python. Students that learn to use Mathematica and Maple learn a language that they, in all likelihood, will never use once they graduate. However, Python is used in many, many industries, so their Python knowledge from using Sage is directly applicable later on.

Those are a few things that came to my mind right away. After some time thinking about it, I probably will have other things that make Sage more effective for me than other commercial software.

Thanks,

Jason"

Thursday, July 24, 2008

Austria, ISSAC, and Hidden Markov Models

Yesterday, I gave a controversial plenary lecture on Sage at the 2008 ISSAC symbolic computer algebra conference. It was well received by some proportion of the large audience of about 170 people, and will hopefully influence that research community to be more supportive of open source. In particular, I hope professors doing computer algebra research will allow their Ph.D. students to use open source software on research projects instead of forcing them to use Maple or Mathematica like most of them currently do at RISC.

Many people asked me what I thought of the ISSAC conference -- it was very similar to the yearly ANTS (Algorithmic Number Theory Symposium) meetings we number theorists have, but without number theorists. The meeting has a generally positive "vibe" and participants are enthusiastic about doing computation. My only criticism compared to ANTS is that the publication process for the proceedings isn't nearly as professional as what ANTS does -- the ISSAC publisher's website was in my opinion hell to use, working with the publisher to get my abstract in shape was no fun, and the final paper proceedings look like they were done at Kinko's, whereas ANTS proceedings are part of Springer-Verlag's lecture notes in computer science series, hence look very professional *and* are available online.

I also started looking at getting Hidden Markov Model functionality into Sage, since HMM's are very relevant to certain areas of machine learning, language processing, statistics, financial time series, etc., and Sage doesn't do much in that direction yet. I was prepared to have to write something from scratch myself in Cython, but quickly found GHMM.org, which is GPLv2+, actively used and developed, written in C with a Python interface, and with some work could possibly work very well for Sage. I would certainly rather spend a solid week writing high-quality documentation and tests (and reporting bugs) than months learning, implementing, and optimizing algorithms followed by a solid week writing high-quality documentation and tests, followed by months building a community of developers to maintain said code. The GHMM program linked to above only has an svn distribution and depends on xml, and it depends on swig. I've created an spkg that one can build into sage and which doesn't depend on libxml; it does assume you have swig installed, and takes about 30 seconds to install from source. It's installed into the system-wide sage on sage.math.washington.edu:


was@sage:~/patches$ sage
----------------------------------------------------------------------
| SAGE Version 3.0.5, Release Date: 2008-07-11                       |
| Type notebook() for the GUI, and license() for information.        |
----------------------------------------------------------------------

sage: import ghmm
sage: ghmm.[tab key]
ghmm.Alphabet                          ghmm.AminoAcids

In a few hours Michael Abshoff and I are heading to Vienna to meet with Harald Schilly (who I've never met), who is the new sagemath.org webmaster.

Sunday, July 20, 2008

ISSAC

I arrived in Austria last night totally exhausted after spending some time in Amsterdam with Michael Abshoff, Hendrik Lenstra (who was my Ph.D. thesis adviser), Waldek Hebisch (the FriCAS guy), and others.

All the students working on funded projects with me this summer have set up blogs, and I'm encouraging them to blog very frequently about their work on Sage. I will start doing the same. Here are links to their blogs:

The last thing I did on Sage was spend nearly a week working on making a release that has fixes for numerous subtle build (and other) bugs on a very wide range of Linux distributions and hardware, e.g., Itanium2's and 64-bit Pentium 4's.