Sunday, October 11, 2009

The Sage Notebook

The Sage Notebook is the graphical user interface to the math software Sage. The notebook is somewhat unusual in that it is a web application that people also run locally. It's a large program that has been written and rewritten a few times since 2006 by me, Tom Boothby, Alex Clemesha, and many, many other people.

Supported by an internal grant from the University of Washington, I've been working on separating the notebook from Sage and rewriting many parts of it to improve robustness and fix subtle issues. My goal is for the notebook to be robust, fast, and scalable, and to work well outside Sage. It would also be very nice to port it to run natively on Microsoft Windows.

Nearly two weeks ago I had the Sage notebook stabilized and all known new bugs fixed (after splitting it off from Sage as a separate program and rewriting the interface code). But I realized that it would be a total nightmare to introduce yet another ("Sage object") storage format, which would make refactoring code extremely painful. So I created an "abstract storage layer" and implemented a storage system for everything in the Sage notebook, one that doesn't use any special Sage-related pickles. Some data is stored as pickled basic Python objects that can be read from any version of Python, with or without Sage installed, but that is it. Rewriting the notebook to use an abstract storage layer is the sort of thing that at first seems like it will take a day, but then takes more than a week. Anyway, I did it (with help from Tim Dumol and Mitesh Patel).
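To make the idea concrete, here is a minimal sketch of what an abstract storage layer of this kind could look like. It is not the notebook's actual code: the names `Datastore`, `FilesystemDatastore`, and `worksheet_to_basic`, the one-file-per-key layout, and the choice of pickle protocol 2 are all assumptions made for illustration. The point it shows is that the rest of the program talks only to the abstract interface, and only basic Python objects ever reach the pickler.

```python
import os
import pickle
import tempfile
from abc import ABC, abstractmethod


class Datastore(ABC):
    """Hypothetical abstract interface; callers depend only on this."""

    @abstractmethod
    def save(self, key, obj):
        """Persist a basic Python object under the given key."""

    @abstractmethod
    def load(self, key):
        """Return the object previously saved under the given key."""


class FilesystemDatastore(Datastore):
    """One possible backend: one pickle file per key in a directory."""

    def __init__(self, path):
        self._path = path
        os.makedirs(path, exist_ok=True)

    def _filename(self, key):
        # Keys are assumed to be simple names without path separators.
        return os.path.join(self._path, key + ".pickle")

    def save(self, key, obj):
        # Write to a temporary file and then rename it, so a crash
        # mid-write does not leave a truncated pickle behind.
        tmp = self._filename(key) + ".tmp"
        with open(tmp, "wb") as f:
            # Protocol 2 is an assumption here; it can be read by
            # both Python 2 and Python 3.
            pickle.dump(obj, f, protocol=2)
        os.replace(tmp, self._filename(key))

    def load(self, key):
        with open(self._filename(key), "rb") as f:
            return pickle.load(f)


def worksheet_to_basic(name, owner, cells):
    """Flatten worksheet state into plain dicts, lists, and strings."""
    return {"name": name, "owner": owner, "cells": list(cells)}


if __name__ == "__main__":
    store = FilesystemDatastore(tempfile.mkdtemp())
    store.save("worksheet_0", worksheet_to_basic("Demo", "admin", ["2 + 2"]))
    print(store.load("worksheet_0")["cells"])
```

Because only dicts, lists, and strings are pickled, loading the data never requires importing Sage classes, and a different backend (a database, say) could be swapped in by writing another subclass of the same interface.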

I hope people will test! Please try it.


Wednesday, October 7, 2009

VirtualBox versus VMware Server

VMware is a program that allows you to (often safely) run several virtual computers on a single physical computer. I have been using VMware since about 1998, have purchased valid licenses several times with my own money, and have long considered VMware the sort of enterprise-level, specialized software that was unlikely to be available as free and open source anytime in the near future.

All the web infrastructure for Sage and many related projects was knocked offline for a full day last week due to problems I had with my VMware Server installation. I thus decided to learn the free open source competitor VirtualBox, see how it compares to VMware, and share some of my experiences with you. I'm comparing VMware Server running on a very high end Linux host (Sun X4450) to VirtualBox running on the same Linux host. I vastly prefer VirtualBox now, even though I had been a diehard VMware user since 1998.

Ways VirtualBox is better than VMware Server:

  1. Both are free, but VirtualBox is free and open source (GPLv2!), whereas VMware Server is "free" and closed source. It's possible to buy VMware Server plus more features for thousands of dollars per year, and I even considered it, but I *couldn't* figure out what I would get for my money from the confusing VMware site, despite coming back to it several times.

  2. VirtualBox is massively better at supporting OS X than VMware server: it is *impossible* (without running yet another virtual machine on my laptop!) to use the VMware server console to connect to remote virtual machines on OS X, since VMware uses a proprietary browser plugin that is only available for Windows and Linux. In contrast, VirtualBox uses a standard remote desktop protocol, so it is easy to connect to running VirtualBox consoles from any operating system using a range of different remote desktop clients.

  3. VMware server limits me to 2 cores and 8GB RAM per machine. VirtualBox has no such limitations. This is a *huge* factor for me, since the host server has 128GB RAM and 24 cores!

  4. If a virtual machine runs under VMware Server and uses 4GB RAM (say), then VMware Server will silently allocate around 4GB of hidden disk space on the host filesystem. My host server has 128GB RAM, but only a 70GB hard disk. The net result is that I can't use the resources I have -- I have to run far fewer machines than I could, or give them less RAM. VirtualBox doesn't allocate any "secret" disk space at all.

  5. The web interface to VMware is extremely flaky and frustrating. It randomly fails to work with many of my web browsers, I often have to restart it, and it is clunky. There is no finished web interface to VirtualBox yet, but there is VBoxWeb, which looks like it will be good when it is finished. The command line interface to VirtualBox is amazing; I think it is vastly superior to VMware Server's. There is also a Python API for scripting VirtualBox machines. After spending a day learning VirtualBox, I wrote my own scripts to automatically start all of my virtual machines from the console in the background, and open remote desktop ports for each. I can easily see what is running using another little script, etc. With VMware Server, even starting 20 machines was a tedious and painful exercise (it would probably be easier if I paid).

  6. The VirtualBox documentation is better. It doesn't "talk down" to me like I feel VMware's documentation does. It's clear and useful.

  7. VirtualBox is a single program that runs on Solaris, OS X, Linux and Windows. VMware, in contrast, is several different programs -- VMware Player, VMware Workstation, VMware Server (and several versions of that). It can be really confusing and frustrating, with artificial limitations put on every program so as to extract your money.

  8. VMware is blatantly commercial (they went public a few years ago). Sun, which develops VirtualBox, is also commercial, but they have a strong commitment to open source. Of course, I don't know what will happen with the Oracle acquisition.

  9. VMware completely stopped working on my main virtualization server, and despite upgrades, clean installs, etc., I absolutely could not get it to work again. I had to switch to running it on another computer temporarily.
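
The kind of startup script mentioned in item 5 can be sketched roughly as follows. This is not my actual script, just an illustration of the approach: the function names and the base port 3390 are made up, and the `--vrde` / `--vrdeport` flag names are an assumption that depends on your VirtualBox version (older releases spell them `--vrdp` / `--vrdpport`), so check `VBoxManage modifyvm` on your installation before relying on them. The script lists the registered machines, gives each one its own remote desktop port, and starts it headless.

```python
import subprocess


def parse_vm_names(listing):
    """Extract VM names from `VBoxManage list vms` output.

    Each line is expected to look like: "name" {uuid}
    """
    names = []
    for line in listing.splitlines():
        line = line.strip()
        if line.startswith('"') and line.count('"') >= 2:
            # The name is everything between the first and last quote.
            names.append(line[1:line.rindex('"')])
    return names


def start_commands(names, base_port=3390):
    """Build two commands per VM: set its remote desktop port, then start it."""
    commands = []
    for offset, name in enumerate(names):
        port = str(base_port + offset)
        # Assumed flag names; older VirtualBox releases use
        # --vrdp and --vrdpport instead.
        commands.append(
            ["VBoxManage", "modifyvm", name, "--vrde", "on", "--vrdeport", port]
        )
        # Headless means no local GUI window; the console is reached
        # through the remote desktop port configured above.
        commands.append(["VBoxManage", "startvm", name, "--type", "headless"])
    return commands


def start_all(base_port=3390):
    """Start every registered VM in the background (requires VirtualBox)."""
    listing = subprocess.run(
        ["VBoxManage", "list", "vms"],
        capture_output=True, text=True, check=True,
    ).stdout
    for command in start_commands(parse_vm_names(listing), base_port):
        subprocess.run(command, check=True)
```

Calling `start_all()` on the host brings everything up; `VBoxManage list runningvms` then shows what is running, and its output has the same format, so `parse_vm_names` can be reused for a status script.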