Wednesday, September 30, 2015

What is SageMath's strategy?

Here is SageMath's strategy, or at least what my strategy toward SageMath has been for the last 5 years.

Diagnose the problem

Statement of problem: SageMath is not growing.

Justification

Facts: Growth in the number of active users [1] of SageMath has stalled since about 2011 (as defined by Google analytics on sagemath.org). From 2008 to 2011, year-on-year growth was about 50%, which isn't great. However, from 2011 to now, year-on-year growth is slightly less than 0%. It was maybe -10% from 2013 to 2014. Incidentally, number of monthly active users of sagemath.org is about 68,652 right now, but the raw number isn't as import as the year-to-year rate of change.

I set an overall mission statement for the Sage project at the outset, which was is to be a viable alternative to Magma, Maple, Mathematica and Matlab. Being a "viable alternative" is something that holds or doesn't for specific people. A useful measure of this mission then is whether or not people use Sage. This is a different metric than trying to argue from "first principles" by making a list of features of each system, comparing benchmarks, etc.















Guiding policies

Statement of policy: focus on undergraduate students in STEM courses (science, tech, engineering, math)

Justification

In order for Sage to start growing again, identify groups of people that are not using Sage. Then decide, for each of these groups, who might find value in using Sage, especially if we are able to put work into making it easier for them to benefit from Sage. This is something to re-evaluate periodically. In itself, this is very generic -- it's what any software project that wishes to grow should do. The interesting part is the details.
Some big groups of potential future users of Sage, who use Sage very little now, include
  • employees/engineers in various industries (from defense contractors, to finance, to health care to "data science").
  • researchers in area of mathematics where Sage is currently not popular
  • undergraduate students in STEM courses (science, tech, engineering, math)
I think by far the most promising group is "undergraduate students in STEM courses". In many cases they use no software at all or are unhappy with what they do use. They are extremely cost sensitive. Open source provides a unique advantage in education because it is less expensive than closed source software, and having access to source code is something that instructors consider valuable as part of the learning experience. Also, state of the art performance, which often requires enormous dedicated for-pay work, is frequently not a requirement.

Actions

  • (a) Make access to Sage as easy as possible.
  • (b) Encourage the creation of educational resources (books, tutorials, etc.) that make using Sage for particular courses as easy as possible.
  • (c) Implement missing functionality in Sage that is needed in support of undergraduate teaching.

Justification

Why don't more undergraduates use Sage? For the most part, students use what they are told to use by their instructors. So why don't instructors chose to use Sage? (a) Sage is not trivial to install (in fact it is incredibly hard to install), (b) There are limited resources (books, tutorials, course materials, etc.) for making using Sage really easy, (c) Sage is missing key functionality needed in support undergraduate teaching.

Regarding (c), in 2008 Sage was utterly useless for most STEM courses. However, over the years things changed for the better, due to the hard work of Rob Beezer, Karl Dieter, Burcin Erocal, and many others. Also, for quite a bit of STEM work, the numerical Python ecosystem (and/or R) provides much of what is needed, and both have evolved enormously in recent years. They are all usable from Sage, and making such use easier should be an extremely high priority. Related -- Bill Hart wrote "I recently sat down with some serious developers and we discussed symbolics in Sage (which I know nothing about). They argued that Sage is not a viable contender in that area, and we discussed some of the possible reasons for that. " The reason is that the symbolic functionality in Sage is motivated by making Sage useful for undergraduate teaching; it has nothing to do with what serious developers in symbolics would care about.

Regarding (b), an NSF (called "UTMOST") helped in this direction... Also, Gregory Bard wrote "Sage for undergraduates", which is exactly the sort of thing we should be very strongly encouraging. This is a book that is published by the AMS and is also freely available. And it squarely addresses exactly this audience. Similarly, the French book that Paul Zimmerman edited is fantastic for France. Let's make an order of magnitude similar resources along these lines! Let's make vastly more tutorials and reference manuals that are "for undergraduates".

Regarding (a), in my opinion the most viable option that fits with current trends in software is a full web application that provides access to Sage. SageMathCloud is what I've been doing in this direction, and it's been growing since 2013 at over 100% year on year, and much is in place so that it could scale up to more users. It still has a huge way to go regarding user friendliness, and it is still losing money every month. But it is a concrete action toward which nontrivial effort has been invested, and it has the potential to solve problem (a) for a large number of potential STEM users. College students very often have extremely good bandwidth coupled with cheap weak laptops, so a web application is the natural solution for them.

Though much has been done to make Sage easier to install on individual computers, it's exactly the sort of problem that money could help solve, but for which we have little money. I'm optimistic that OpenDreamKit will do something in this direction.

[I've made this post motivated by the discussion in this thread.  Also, I used the framework from this book.]