Two days ago there was no fast freely available open source implementation of computation of the Hermite Normal Form of a (square nonsingular) matrix. In fact the fastest implementations I know of -- in Gap, Pari, and NTL, are all massively slower (millions of times) than Magma. See Allan Steel's page on this algorithm.
I just spent the last few days with help from Clement Pernet and Burcin Erocal implementing a fast modular/p-adic Hermite Normal Form algorithm. As soon as the coefficients of the input matrix get at all large our implementation is now the fastest in the world available anywhere (e.g., faster than Magma, etc.). This is incredibly important for my research work on modular forms.
The following plot compares Sage (blue) to Magma (red) for computing the Hermite Normal Form of an nxn matrix with coefficients that have b bits. The x-axis is n, the y-axis is b, and the blue dots show where Sage is much better, whereas the red dots are where Magma is much better. The timings are on a 2.6Ghz core 2 duo laptop running OS X, and both Sage and Magma were built as 32-bit native executables:
(Note: When doing timings under Linux, Magma doing slightly better than it does above, though Sage is still much better as soon as the coefficients are at all large.)
This plot shows the timings of Magma (red) versus Sage (blue) for computing the Hermite Forms of matrices with entries between -212 and 212. Sage is much faster than Magma, since the blue line is lower (=faster).
For even bigger coefficients Sage has a much bigger advantage. E.g., for a 200 x 200 256-bit matrix, Magma takes 232 seconds whereas Sage takes only 55 seconds.
In order to make this happen we read some old papers, read notes from a talk that Allan Steel (who implemented HNF in Magma) gave, and had to think quite hard to come up with three tricks (2 of which are probably similar to tricks alluded to in Steel's talk notes, but with no explanation about how they work), coded for 3 days at Sage Days 7, discovered that we weren't right about some of our tricks, doing lots of fine tuning, and finally fixed all the problems and got everything to work. We will be writing this up, and of course all the code is open source.
Acknowledgement: Many many thanks to Allan Steel for implementing an HNF in Magma that blows away everything else, which gave us a much better sense of what is possible in practice. Thanks also to the authors of IML whose beautiful GPL'd implementation of balanced p-adic is critical to our fast HNF implementation.