Tuesday, May 6, 2014

Update to Differential Synchronization in SageMathCloud

I've just pushed out a major update to how synchronization works in https://cloud.sagemath.com.

This change is pretty significant behind the scenes, but the only difference you should notice is that everything should be better. In particular:

  • evaluation of code in Sage worksheets should feel a little snappier and more robust,
  • various random and hard-to-reproduce issues with synchronized editing should be fixed, e.g., chat messages appearing out of order,
  • everything should generally be a bit faster and more scalable overall.

Here's a short technical description of what changed. The basic architecture of SageMathCloud is that there are many web browsers connected to many hubs, which are in turn connected to your project (and to many other projects too):

  [web browser] <--websocket--\
  [web browser] <--websocket--> [hub] <---tcp---\
                                                 [project]
  [web browser] <--websocket--> [hub] <---------/

Until today, the differential synchronization implementation involved keeping a copy of the document you're editing in three places:

  1. each hub pictured above,
  2. each browser, and
  3. the project itself.

In particular, there were three slightly different implementations of differential synchronization running all over the place. The underlying core code is the same for all three, but the way it is used in each case is different, due to different constraints. The implementations:

  • browser: running in a web browser, which mainly has to worry about dealing with the CodeMirror editor and a flaky Internet connection.
  • hub: running in a node.js server that's also handling a lot of other stuff, including worrying about auth, permissions, proxying, logging, account creation, etc.
  • project: running in the project, which doesn't have to worry about auth or proxying or much else, but does have to worry about the filesystem.

Because we're using Node.js, all three implementations are written in the same language (CoffeeScript), and run the same underlying core code (which I BSD-licensed at https://github.com/sagemath/cloud/blob/master/diffsync.coffee). The project implementation was easiest to write, since it's simple and straightforward, with minimal constraints. The browser implementation is difficult mainly because the Internet comes and goes (as laptops suspend/resume), and because it involves patching and diff'ing a CodeMirror editor instance; CodeMirror is tricky, because it views the document as a sequence of lines rather than a single long string, and we want things to work even for documents with hundreds of thousands of lines, so converting back and forth to a string is not an option! Implementing the hub part of synchronization is the hardest, for various reasons -- and debugging it is particularly hard. Moreover, computing diffs can be computationally expensive if the document is large, so doing anything involving differential sync on the hub can result in nontrivial CPU usage, hence slower handling of other user messages (node.js is single threaded). The hub part of the above was so hard to get right that it had some nasty locking code, which shouldn't be needed, and just looked like a mess.
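
For readers unfamiliar with the algorithm, here is a minimal sketch of one round of differential synchronization in Python, using the diff-match-patch library that the algorithm is built around. The function and variable names are mine, purely for illustration; the real implementation is diffsync.coffee:

    # One round of differential synchronization (Neil Fraser's algorithm),
    # sketched with the Python port of google-diff-match-patch.
    from diff_match_patch import diff_match_patch

    dmp = diff_match_patch()

    def sync_once(local_live, local_shadow, remote_live):
        # 1. Diff the local live document against its shadow copy.
        patches = dmp.patch_make(local_shadow, local_live)
        # 2. The shadow becomes a copy of the live document.
        local_shadow = local_live
        # 3. Serialize the patches (this is what goes over the wire) and
        #    apply them *fuzzily* to the remote live document, which may
        #    have changed concurrently.
        wire = dmp.patch_toText(patches)
        remote_live, _applied = dmp.patch_apply(dmp.patch_fromText(wire), remote_live)
        return local_shadow, remote_live

    shadow = "Hello world"
    alice = "Hello brave world"   # Alice inserted "brave "
    bob = "Hello world!"          # Bob concurrently appended "!"
    shadow, bob = sync_once(alice, shadow, bob)
    print(bob)                    # -> "Hello brave world!"

Because patches apply fuzzily, both sides can edit concurrently and still converge after a round trip in each direction; the hard part in practice is keeping the shadows consistent across a flaky network, which is what most of the real code deals with.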

A lot of issues people ran into with sync involved two browsers connected to different hubs, which then connected to the same document in a project. The two hubs' crappy synchronization would appear to work right in this part of the picture "[web browser] <------------> [hub]", but have problems with this part "[hub] <------------> [project]", which would lead to pain later on. In many cases, the only fix was to restart the hub (to kill its sync state) or for the user to switch hubs (by clearing their browser cookies).

Change: I completely eliminated the hub from the synchronization picture. Now the only thing the hub does related to sync is forward messages back and forth between the web browser and the project. Implementing this was harder than one might think, because the project used to consider each client to be a single tcp connection, but now many clients can connect to a project via the same tcp connection, etc.
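
The multiplexing idea is roughly the following (a toy Python sketch with hypothetical names; the real hub is CoffeeScript running on node.js): tag each message with a channel identifying the browser client, so that one tcp connection to the project can carry traffic for many clients:

    # Toy sketch: a hub that only forwards sync messages, multiplexing many
    # browser clients over a single tcp connection to the project.
    import json

    class HubForwarder:
        def __init__(self, project_conn):
            self.project_conn = project_conn   # the one tcp connection
            self.browsers = {}                 # channel id -> browser websocket

        def register(self, channel, ws):
            self.browsers[channel] = ws

        def from_browser(self, channel, payload):
            # Tag and forward verbatim; the hub no longer keeps any copy
            # of the document or computes any diffs.
            self.project_conn.send(json.dumps({"channel": channel, "data": payload}))

        def from_project(self, raw):
            # Route each reply back to the browser that owns its channel.
            msg = json.loads(raw)
            self.browsers[msg["channel"]].send(msg["data"])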

With this fix, if there are any bugs left with synchronization, they should be much easier to debug. The backend scalability and robustness of sync have been my top priorities for quite a while now, so I'm happy to get this stuff cleaned up, and move onto the next part of the SMC project, which is better collaboration and course support.

Thursday, May 1, 2014

What can SageMathCloud (SMC) do?

The core functionality of SageMathCloud:

  • A color command-line terminal with many color schemes, which several people can interact with at once, and with state that survives a browser refresh.
  • Editing of documents: with syntax highlighting, auto-indent, etc., for files with the following extensions:
    c, c++, cql, cpp, cc, conf, csharp, c#, coffee, css, diff, dtd, e, ecl, f, f90, f95, h, hs, lhs, html, java, jl, js, lua, m, md, mysql, patch, gp, go, pari, php, py, pyx, pl, r, rst, rb, ru, sage, sagews, scala, sh, spyx, sql, txt, tex, toml, bib, bbl, xml, yaml.
    (It's easy for me to add more, as CodeMirror supports them.) There are many color schemes, plus Emacs and Vim bindings.
  • Sage Worksheets: a single-document, interactive way to evaluate Sage code. This is highly extensible: you can define % modes by simply writing a function that takes a string as input, and use %default_mode to make that mode the default (see the sketch after this list). Also, graphics work automatically in %r mode, exactly as in normal R (no mucking with devices or png's).
  • IPython notebooks: via an IPython session that is embedded in an iframe. This is synchronized, so that multiple users can interact with a notebook simultaneously, which was a nontrivial addition on top of IPython.
  • LaTeX documents: This fully supports sagetex, bibtex, etc., and the LaTeX compile command is customizable. There is also forward and inverse search: double-click on the preview to jump to the corresponding point in the tex file, and alt-enter in the tex file to jump to the corresponding point in the preview. In addition, this editor is designed to work fine with 150+ page documents. (Multi-file documents are not properly supported yet.)
  • Snapshots: the complete state of all files in your project is snapshotted (using bup, which is built on git) every 2 minutes when you're actively editing a file. All of these snapshots are also regularly backed up to encrypted disks offsite, just in case. I plan for these highly efficient deduplicated and compressed snapshots to be saved indefinitely. Browse the snapshots by clicking "Snapshots" to the right when viewing your files, or type cd ~/.snapshots/master in the terminal.
  • Replication: every project is currently stored in three physically separate data centers; if a machine or data center goes down, your project pops up on another machine within about one minute. A computer at every data center would have to fail for your project to be inaccessible. I've had zero reports of projects being unavailable since I rolled out this new system 3 weeks ago (note: there was a project that didn't work, but that was because I had set the quota to 0% cpu by accident).
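
As promised above, here is a sketch of a custom % mode. The mode name "shout" is made up for illustration, but the convention (any function that takes a string) is exactly as described:

    # In one Sage worksheet cell: a mode is just a function that receives
    # the contents of a cell as a string.  ("shout" is a hypothetical mode.)
    def shout(s):
        print(s.upper())

    # In a later cell, prefix the contents with the mode name:
    #
    #   %shout
    #   hello, worksheet
    #
    # which prints: HELLO, WORKSHEET
    #
    # Evaluating "%default_mode shout" would make all later cells use it.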

Sage

The Sage install contains the following extra packages (beyond what is standard in Sage itself). When you use Sage or IPython, these will all be available.
basemap, biopython, bitarray, brian, cbc, chomp, clawpack, cluster_seed, coxeter3, cryptominisat, cunningham_tables, database_cremona_ellcurve, database_gap, database_jones_numfield, database_kohel, database_odlyzko_zeta, database_pari, database_symbolic_data, dot2tex, fabric, gap_packages, gnuplotpy, greenlet, guppy, h5py, httplib2, kash3, lie, lrs, lxml, mahotas, mercurial, mpld3, munkres, mysql-python, nauty, netcdf4, neuron, normaliz, nose, numexpr, nzmath, oct2py, p_group_cohomology, pandas, paramiko, patsy, phc, plotly, psutil, psycopg2, pybtex, pycryptoplus, pyface, pymongo, pyproj, pyx, pyzmq, qhull, quantlib, redis, requests, rpy2, scikit_learn, scikits-image, scimath, shapely, simpy, snappy, statsmodels, stein-watkins-ecdb, tables, theano, topcom, tornado, traits, xlrd, xlwt, zeromq

R

Also, I install the following extra packages into the R that is in Sage:
KernSmooth, Matrix, Rcpp, cairo, car, circular, cluster, codetools, e1071, fields, ggplot2, glmnet, lattice, mgcv, mvtnorm, plyr, reshape2, rpart, stringr, survival, zoo

It's Linux

SMC can do pretty much anything that can be done with an Ubuntu 14.04 Linux machine and doesn't require X11. I've pre-installed the following packages, and if people want others, just let me know (and they will be available to all projects henceforth):
vim git wget iperf dpkg-dev make m4 g++ gfortran liblzo2-dev libssl-dev libreadline-dev libsqlite3-dev libncurses5-dev zlib1g-dev openjdk-7-jdk libbz2-dev libfuse-dev pkg-config libattr1-dev libacl1-dev par2 ntp pandoc ssh python-lxml calibre ipython python-pyxattr python-pylibacl software-properties-common libevent-dev xfsprogs lsof tk-dev dstat emacs texlive texlive-* gv imagemagick octave mercurial flex bison unzip libzmq-dev uuid-dev scilab axiom yacas octave-symbolic quota quotatool dot2tex python-numpy python-scipy python-pandas python-tables libglpk-dev python-h5py zsh python3 python3-zmq python3-setuptools cython htop ccache python-virtualenv clang libgeos-dev libgeos++-dev sloccount racket libxml2-dev libxslt-dev irssi tmux sysstat sbcl gawk noweb libgmp3-dev ghc ghc-doc ghc-haddock ghc-mod ghc-prof haskell-mode haskell-doc subversion cvs bzr rcs subversion-tools git-svn markdown lua5.2 lua5.2-* encfs auctex vim-latexsuite yatex spell cmake libpango1.0-dev xorg-dev gdb valgrind doxygen haskell-platform haskell-platform-doc haskell-platform-prof mono-devel mono-tools-devel ocaml ocaml-doc tuareg-mode ocaml-mode libgdbm-dev mlton sshfs sparkleshare fig2ps epstool libav-tools python-software-properties h5utils libnetcdf-dev netcdf-doc netcdf-bin tig libtool iotop asciidoc autoconf bsdtar attr libicu-dev iceweasel xvfb tree bindfs liblz4-tool tinc python-scikits-learn python-scikits.statsmodels python-skimage python-skimage-doc python-skimage-lib python-sklearn python-sklearn-doc python-sklearn-lib python-fuse cgroup-lite cgmanager-utils cgroup-bin libpam-cgroup cgmanager r-recommended libquantlib0 libquantlib0-dev quantlib-examples quantlib-python quantlib-refman-html quantlib-ruby r-cran-rquantlib libf2c2-dev libpng++-dev libcairomm-1.0-dev r-cran-cairodevice x11-apps mesa-utils libpangox-1.0-dev
I've also put extra effort (beyond just apt-get) to install the following:
polymake, dropbox, aldor/"AXIOM", Macaulay2, Julia, 4ti2

Functionality that is currently under development

We're working hard on improving SageMathCloud right now.

  • Streamlining document sync: will make code evaluation much faster, eliminate some serious bugs when the network is bad, etc.
  • Geographic load balancing and adding data centers, so that, e.g., if you're in Europe or Asia you can use SMC with everything happening there. This will involve DNS load balancing via Amazon Route 53, and additionally moving projects on startup to run in the data center nearest you, rather than a random one. Right now all computers are in North America.
  • Mounting a folder of one project in another project, in a way that automatically fixes itself in case a machine goes down, etc. Imagine mounting the projects of all 50 students in your class, so you can easily assign and collect homework, etc.
  • Homework assignment and grading functionality with crowdsourcing of problem creation, and support for peer and manual grading.
  • BSD-licensed open source single-project version of SMC.
  • Commercial software support, and instructions for installing your own copies into SMC (e.g., Mathematica, Matlab, Magma, etc.)
  • Easy ssh access into projects

Friday, April 25, 2014

The SageMathCloud Roadmap

Everything below is subject to change.

Implementation Goals

  • (by April 27) Major upgrades -- update everything to Ubuntu 14.04 and Sage 6.2. Also upgrade all packages in SMC, including HAProxy, Nginx, stunnel, etc.

  • (by May 4) Streamline doc sync: the top priority right now is to clean up sync and eliminate the bugs that show up when the network is bad, when there are many users, etc.

  • (by May 30) Snapshots:
    • more efficient way to browse snapshot history (timeline view)
    • browse snapshots of a single file (or directory) only
  • (by May 30) User-owned backups
    • way to download the complete history of a project, i.e., the underlying bup/git repository with snapshot history.
    • way to update offline backup, getting only the changes since last download.
    • easy way to download all current files as a zip or tarball (without the snapshots).
  • (by June 30) Public content
    • Ability to make a read-only view of the project visible publicly on the internet. This only works after the account is at least n days old.
    • By default, users will have a "report spammer" button on each page. Proven 'good users' will have the button removed. Any user validly reported will be permanently banned.
    • Initially, only users with validated .edu accounts will be allowed to publish. Maybe allow gmail.com at some point.
  • (by June 30) Fix all of the issues listed on the github page (none of them currently major): https://github.com/sagemath/cloud/issues

  • (by July 31) Group/social features:
    • Support for mounting directories between projects
    • Group management: combine a group of users and projects into a bigger entity:
      • a University course -- see feature list below
      • a research group: a collection of projects with a theme that connects them, where everybody has access to everybody else's projects
    • A feed that shows activity on all projects that users care about, with some ranking, plus better notifications about chat messages and activity.

Commercial Products

We plan four distinct products of the SMC project: increased quotas, enhanced university course support, a license and support to run a private SMC cloud, and a supported open source BSD-licensed single-user version of SMC (with an offline mode).

  • PRODUCT: Increase the quota for a specific project (launch by Aug 2014)
    • cpu cores
    • RAM
    • timeout
    • disk space
    • number of share mounts
  • Remarks:
    • There will be an option in the UI, visible to some project collabs (maybe only owners initially), to change each of the above parameters.
    • Within moments of making a change it goes live and billing starts.
    • When the change is turned off, billing stops. When a project is not running it is not billed. (Obviously, we need to add a stop button for projects.)
    • There is a maximum amount that the user can pre-spend (e.g., $500 initially).
    • At the end of the month, the user is given a link to a Univ of Washington website and asked to pay a certain amount, and register there under the same email as they use with SMC.
    • When they pay, SMC receives an email and credits their account for the amount they pay.
    • There will also be a limit soon on the number of projects that can be associated with an account (e.g., 10); pay a small monthly fee to raise this.
  • PRODUCT: University course support (launch by Aug 2014 in time for Fall 2014 semester)

    • Free for the instructor and TAs
    • Each student pays $20 in exchange for:
      • one standard project (they can upgrade quotas as above), which the TA and instructor are automatically collabs on
      • student is added as collaborator to a big shared project
      • in the student's private project, homework assignments are assigned and collected
    • Instructor's project has all student projects as mounted shares
    • Instructor has a student data spreadsheet with student grades, project ids (links), etc.
    • Powerful modern tool for designing homework problems that can be automatically graded, with problems shared in a common pool, with ratings, and data about their usage.
    • A peer grading system for more advanced courses.
    • Tools to make manual grading more fun.
  • PRODUCT: License to run a private SMC cloud in a research lab, company, supercomputer, etc. (launch a BETA version by July 2014, with caveats about bugs).

    • base fee (based on organization size)
    • technical support fee
    • site visit support: install, run workshop/class
  • PRODUCT: Free BSD-licensed single-user account version of SMC (launch by December 2014)

    • a different way to do LaTeX editing, manage a group of IPython notebooks, use Sage worksheets, etc.
    • be included with Sage.
    • be included in many Linux distros
    • the doc synchronization code, local_hub, CoffeeScript client, terminal.
    • mostly Node.js application (with a little Python for Sage/IPython).
    • ability to sync with a cloud-hosted SMC project.
    • sell pre-configured images (or just support) for installing standalone SMC on a cloud host such as EC2 or Digital Ocean

Tuesday, April 15, 2014

SageMathCloud's new storage architecture

Keywords: ZFS, bup, rsync, Sage

SageMathCloud (SMC) is a browser-based hosted cloud computing environment for easily collaborating on Python programs, IPython notebooks, Sage worksheets and LaTeX documents. I spent the last four months wishing very much that fewer people would use SMC. Today that has changed, and this post explains some of the reasons why.

Consistency Versus Availability

Consistency and availability are competing requirements. It is trivial to keep the files in a SageMathCloud project consistent if we store them in exactly one place; however, when the machine that project is on goes down for any reason, the project stops working, and the users of the project are very unhappy. By making many copies of the files in a project, it's fairly easy to ensure that the project is always available, even if network switches in multiple data centers completely fail, etc. Unfortunately, if there are too many users and the synchronization itself puts too heavy of a load on the overall system, then machines will fail more frequently, and though projects are available, files do not stay consistent and data is lost to the user (though still "out there" somewhere for me to find).

Horizontal scalability of file storage and availability of files are also competing requirements. If there are a few compute machines in one place, then they can all mount user files from one central file server. Unfortunately, this approach leads to horrible performance when the network is slow and has high latency; it also doesn't scale up to potentially millions of users. A benchmark I care about is downloading a Sage binary (630MB) and extracting it (creating over 70,000 files); I want this to take at most 3 minutes total, which is hard using a networked filesystem served over the general Internet between data centers. Instead, in SMC, we store the files for user projects on the compute machines themselves, which provides optimal speed. Moreover, we use a compressed filesystem, so in many cases read and write speeds are nearly twice as fast as they might be otherwise.

New Architecture of SageMathCloud

An SMC project with id project_id consists of two directories of files, replicated across several machines using rsync:
  1. The HOME directory: /projects/project_id
  2. A bup repository: /bup/bups/project_id
Users can also create files they don't care too much about in /scratch, which is a compressed and deduplicated ZFS filesystem. It is not backed up in any way, and is local to that compute machine.

The /projects directory is one single big ZFS filesystem, which is both lz4 compressed and deduplicated. ZFS compression is just plain awesome. ZFS deduplication is much more subtle, as deduplication is tricky to do right. Since data can be deleted at any time, one can't just use a bloom filter to very efficiently tell whether data is already known to the filesystem, and instead ZFS uses a much less memory-efficient data structure (a deduplication table). Nonetheless, deduplication works well in our situation, since the compute machines all have sufficient RAM (around 30-60GB), and the total data stored in /projects is well under 1TB. In fact, right now most compute machines have about 100GB stored in /projects.
The /bup/bups directory is also one single big ZFS filesystem; however, it is neither compressed nor deduplicated. It contains bup repositories, where bup is an awesome git-based backup tool written in Python that is designed for storing snapshots of potentially large collections of arbitrary files in a compressed and highly deduplicated way. Since the git pack format is already compressed and deduplicated, and bup itself is highly efficient at deduplication, we would gain almost nothing by using compression or deduplication directly on this ZFS filesystem. When bup deduplicates data, it does so using a sliding window through the file, unlike ZFS which simply breaks the file up into blocks, so bup does a much better job at deduplication. Right now, most compute machines have about 50GB stored in /bup/bups.
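
Here is a self-contained toy experiment that shows why the sliding window matters (it uses a simple rolling hash made up for this illustration, not bup's actual rollsum): insert a single byte at the front of a file, and count how many chunks still deduplicate against the original.

    # Fixed-size blocks vs. content-defined (sliding window) chunks, after
    # inserting one byte at the front of a file.
    import hashlib
    import random

    def fixed_chunks(data, size=64):
        return [data[i:i + size] for i in range(0, len(data), size)]

    def rolling_chunks(data, window=16, mask=0x3F):
        # Cut a chunk wherever a rolling hash of the last `window` bytes
        # ends in six zero bits, so boundaries depend only on nearby content.
        MOD, P = 1 << 32, 257
        PW = pow(P, window, MOD)
        chunks, start, h = [], 0, 0
        for i, b in enumerate(data):
            h = (h * P + b) % MOD
            if i >= window:
                h = (h - data[i - window] * PW) % MOD
            if i + 1 >= window and (h & mask) == 0:
                chunks.append(data[start:i + 1])
                start = i + 1
        chunks.append(data[start:])
        return chunks

    def deduped(original, modified):
        seen = {hashlib.sha1(c).digest() for c in original}
        return sum(hashlib.sha1(c).digest() in seen for c in modified)

    random.seed(0)
    data = bytes(random.randrange(256) for _ in range(8192))
    shifted = b"X" + data   # one byte inserted at the front

    f0, f1 = fixed_chunks(data), fixed_chunks(shifted)
    r0, r1 = rolling_chunks(data), rolling_chunks(shifted)
    print(f"fixed blocks:  {deduped(f0, f1)} of {len(f1)} chunks already stored")
    print(f"rolling hash:  {deduped(r0, r1)} of {len(r1)} chunks already stored")
    # Fixed blocks dedupe essentially nothing, since every block is now
    # shifted by one byte; nearly all rolling chunks are unchanged.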

When somebody actively uses a project, the "important" working files are snapshotted about once every two minutes. These snapshots are done using bup and stored in /bup/bups/project_id, as mentioned above. After a snapshot is successfully created, the files in the working directory and in the bup repository are copied via rsync to each replica node. The users of the project do not have direct access to /bup/bups/project_id, since it is of vital importance that these snapshots cannot be corrupted or deleted; e.g., if you are sharing a project with a fat-fingered colleague, you want peace of mind that even if they mess up all your files, you can easily get them back. However, all snapshots are mounted at /projects/project_id/.snapshots and browsable by the user; this uses bup's FUSE filesystem support, enhanced with some patches I wrote to support file permissions, sizes, change times, etc. Incidentally, the bup snapshots have no impact on the user's disk quota.
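
In pseudo-Python, the snapshot/replicate cycle looks roughly like this. The paths match the layout above, but the helper name and replica addresses are hypothetical, and the real code is CoffeeScript running inside the project:

    # Sketch of the snapshot-then-replicate cycle: bup snapshots the HOME
    # directory, then rsync pushes both directories to each replica node.
    import os
    import subprocess

    def snapshot_and_replicate(project_id, replicas):
        home = f"/projects/{project_id}"
        bup_dir = f"/bup/bups/{project_id}"
        env = dict(os.environ, BUP_DIR=bup_dir)

        # 1. Snapshot the working files into the project's bup repository.
        subprocess.check_call(["bup", "init"], env=env)
        subprocess.check_call(["bup", "index", home], env=env)
        subprocess.check_call(["bup", "save", "-n", "master", home], env=env)

        # 2. Push the working directory and the bup repository to each
        #    replica; rsync transfers only what changed since last time.
        for host in replicas:
            for src in (home, bup_dir):
                subprocess.check_call(
                    ["rsync", "-axH", "--delete", src + "/", f"{host}:{src}/"])

    # e.g., snapshot_and_replicate("example-project-id", ["10.1.1.4", "10.1.2.4"])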

We also back up all of the bup archives (and the database nodes) to a single large bup archive, which we regularly back up offsite on encrypted USB drives. Right now, with nearly 50,000 projects, the total size of this large bup archive is under 250GB (!), and we can use it to efficiently recover any particular version of any file in any project. The size is relatively small due to the excellent deduplication and compression that bup provides.

In addition to the bup snapshots, we also create periodic snapshots of the two ZFS filesystems mentioned above... just in case. Old snapshots are regularly deleted. These are accessible to users if they search around enough with the command line, but are not consistent between different hosts of the project, hence using them is not encouraged. This ensures that even if the whole replication/bup system were to somehow mess up a project, I can still recover everything exactly as it was before the problem happened; so far there haven't been any reports of problems.

Capacity

Right now there are about 6000 unique weekly users of SageMathCloud and often about 300-400 simultaneous users, and there are nearly 50,000 distinct projects. Our machines are at about 20% disk space capacity, and most of them can easily be expanded by a factor of 10 (from 1TB to 12TB). Similarly, disk space for our Google Compute Engine nodes is $0.04 per GB per month. So space-wise we could scale up by a factor of 100 without too much trouble. The CPU load is at about 10% as I write this, during a busy afternoon with 363 clients connected very actively modifying 89 projects. The architecture that we have built could scale up to a million users, if only they would come our way...

Wednesday, February 12, 2014

What is SageMathCloud?

The two main reasons for the existence of SageMathCloud (SMC) are...

Goal 1. Increase resources for Sage: Generate a different longterm revenue stream to support development of Sage, i.e., open source mathematical software. By "different", I mean different from government and foundation grants and donations, which are relatively limited for primarily pure mathematics software development, which is what Sage specializes in. Even in my wildest dreams, it is very unlikely Sage will get more than a million dollars a year in funding (and in practice it gets a lot less); however, a successful commercial product with wide adoption has the potential to generate significantly more than a million dollars a year in revenue -- of course most would go back into the product... but when the product is partly Sage, that's fine. The National Science Foundation (and other donors) have played a major part during the last 8 years in funding Sage, but I think everybody would benefit from another funding source.

Goal 2. Increase the usage of Sage: The number of unique visitors per month to http://sagemath.org grew nicely from 2005 (when I started Sage) until Summer 2011, after which point it has remained fairly constant at 70,000 unique visitors. There is no growth at all: it was 70,332 in Jan 2011, and it was 70,449 last month (Jan 2014), both with a bounce rate of about 50%. A significant obstruction to growth is accessibility, which SMC helps to address for certain users (last month the SMC website had 17,700 unique visitors with a bounce rate of about 30%).

Here's an actual email I received from somebody literally as I was writing this, which I think illustrates how SMC addresses the second goal:

    Hey William,

    Today I stopped by cloud.sagemath.com because 
    I wanted to do some computation with sage, and 
    cloud is announced in a big way on sagemath.org

    This is after a lengthy hiatus from computing
    with sage ( maybe a year ).

    Using cloud.sagemath.com completely blew my 
    mind.  At first I did not really understand 
    why sagenb was ditched after all the work that 
    went into it.  But man, cloud is really a 
    pleasure to use !

    I just wanted to share the joy :)

    Thanks for all that you do !

Licensing and Reuse of the SageMathCloud Codebase

The design and coding of SageMathCloud (SMC) has been mostly supported by the University of Washington (UW). Due to goal 1 above, I have been working from the start (before a line of code was written) with the commercialization/tech transfer office of UW, who (because of goal 1) are not enthusiastic about simply open sourcing the whole SMC codebase, as a condition for their help with commercialization. Some of SMC is open sourced, mainly the code that runs on the VMs and some of the HTML5 client that runs in the browser. We also plan to make the HTML5 client and a mini server BSD-licensed, and include them with Sage (say) as a new local graphical interface. Of course SMC builds on top of many standard open source libraries and tools (e.g., CodeMirror, Cassandra, ZFS, Node.js, etc.).

There is, however, a large amount of interesting backend code, which is really the "cloud" part of SMC, and which we do not intend to release as open source. We do intend to sell licenses (with support) for the complete package, when it is sufficiently polished, since many organizations want to run their own private SMC servers, mainly for confidentiality reasons.

Goal 2 above mainly impacts how we market SMC. However, it's easy to completely ignore Sage and still get a lot of value out of SMC. I just glanced at what people are doing as I write this, and the result seems pretty typical: latex'ing documents, some Sage worksheets, some IPython notebooks, editing a perl script.

It's important to understand how SMC is different from other approaches to cloud computing. It's designed to make certain things very easy, but they are quite different things from what "traditional" cloud stacks like OpenStack are designed to make easy. SMC is supposed to make the following easy:

  • using Sage and IPython, both command line and notebook interfaces.
  • writing a paper using LaTeX (possibly with a specific private list of collaborators),
  • editing source code, e.g., developing Python/C/etc. libraries, again possibly with realtime collaboration.
  • creating collaborative "projects", which are really a Linux account on a machine, and provide isolation from other projects.
  • backups: all data is automatically snapshotted frequently
  • high availability: failure of a machine (or even whole data center) results in at most a few minutes of lost time/work.
  • speed: files are stored on a compressed local filesystem, which is snapshotted and replicated out regularly; thus the filesystem feels fast and is scalable, as compared to a networked filesystem.

The above design goals are useful for certain target audiences, e.g., people doing Sage/Python/etc. development, teachers and students in courses that make use of Sage/Python/etc., and collaborative math research projects. SMC is designed so that a large number of people can make simultaneous small use of ever-expanding resources. SMC should also fully support the "social networks" that form in this context. At the same time, it's critical that SMC have excellent uptime and availability (and offsite backups, just in case), so that people can trust it. By trust, I don't mean so much in the sense of "trust it with proprietary info", but in the sense of "trust it to not just lose all my data and to be there when I'm giving a talk/teaching a class/need to do homework/etc.".

However, exactly the above design goals are at odds with some of goals of large-scale scientific/supercomputing. The following are not design goals of SMC:

  • supercomputing -- operating on large data with many distributed processes: exactly what people often do on supercomputers (or with Hadoop, etc.)
  • traditional "cloud computing" -- dynamically spin up many VM's, run computations on them; then destroy them. With SMC, things tend to get created but not destroyed (e.g., projects and files in them), and a full VM is much too heavy given the number of users and type of usage that we have already (and plan to have).

What happens in practice with SMC is that people run smaller-scale computations on SMC (say things that just take a few cores), and when they want to run something bigger, they ssh from SMC to other resources they have (e.g., a supercomputer account) and launch computations there. All project collaborators can see what anybody types in a terminal, which can be helpful when working with remote compute clusters.

Anyway, I hope this helps to clarify what exactly SMC actually is.

Monday, December 16, 2013

Holiday Coding the SageMath Cloud

I love the Holiday break.  I get to work on https://cloud.sagemath.com (SMC) all day again!  Right now I'm working on a multi-data center extension of http://www.gluster.org for storing a large pool of sparse, compressed, deduplicated ZFS image files that are efficiently replicated between data centers.  Soon all SMC projects will be hosted in this, which will mean that they can very quickly be moved between computers, are available even if all data centers but one go down, and will have ZFS snapshots instead of the current snapshot system.  ZFS snapshots are much better for this application, since you can force them to happen at a point in time, with tags, and also delete them if you want.  A little later I'll even make it so you can do a full download (to your computer) of an SMC project (and all snapshots!) by just downloading the ZFS image file and mounting it yourself.

I'm also continuing to work on adding a Google Compute Engine data center; the web server parts are hosted there right now (https://108.59.84.126/), but the really interesting part will be making compute nodes available, since the GCE compute nodes are very fast.  I'll be making 30GB RAM 8-core instances available, so one can start a project there and just get access to that -- free to SMC users, despite the official price being $0.829/hour.  I hope this happens soon.

Tuesday, December 10, 2013

The Sagemath Cloud: a one-minute "elevator description"

The Sagemath Cloud combines open source technology that has come out of cloud computing and mathematical software (e.g., web-based Sage and IPython worksheets) to make online mathematical computation easily accessible. People can collaboratively use mathematical software, author documents, use a full command line terminal, and edit complicated computer programs, all using a standard web browser with no special plugins. The core design goals of the site are collaboration and very high reliability, with data mirrored between multiple data centers. The current dedicated infrastructure should handle over a thousand simultaneous active users, and the plan is to scale up to tens of thousands of users as demand grows (about 100 users sign up each day right now). Most open source mathematical software is pre-installed, and users can also install their own copies of proprietary software, if necessary. There are currently around 1000 users on the site each day from all over the world.

The Sagemath Cloud is under very active development, and there is an ongoing commercialization effort through the University of Washington, motivated by many users who have requested more compute power, disk space, or the option to host their own install of the site. Also, though the main focus is on mathematics, the website has been useful to people in technical areas outside mathematics that involve computation.