Latest news for the Stats compute cluster


Welcome to the new Stats HPC cluster

Launched in January 2022, this new cluster is a scaled-down version of the main Maths NextGen cluster and in fact uses some hardware taken from that cluster when it was upgraded in October 2021. With a combined submission, compute and cluster-management server (fallas) plus 10 additional compute nodes, it is specifically tailored to the Stats section's use.

fallas is the entry point to this cluster; assuming you already have access to this system, you can log in via ssh using your College computer account username and password as before. The former stand-alone compute servers festival and fiesta have now been integrated into the cluster as compute nodes, along with eight additional nodes, stats01 to stats08 inclusive. Full documentation on this new facility can be found here.
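
If your machine's resolver doesn't know the short name fallas, the fully-qualified form may work instead; the hostname below is inferred from the job IDs shown later on this page and is not confirmed here, so substitute your own username and check the documentation if it fails:

ssh username@fallas.statscluster.ma.ic.ac.uk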

Can I still access legacy data stored on fallas, festival & fiesta?

Yes - user data currently stored on local disks in fallas, festival and fiesta is still available on those disks. You can access fallas exactly as you did before it was converted to an HPC node and, at present, you can reach festival or fiesta by first logging into fallas and then, from there, into festival or fiesta as appropriate:

ssh fallas

and then:

ssh festival

or

ssh fiesta

where you will find your local festival and fiesta files as before.
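
As a convenience, if your own machine runs OpenSSH 7.3 or later, you can combine the two hops into a single command with the -J (ProxyJump) option; this sketch assumes your username is the same on both systems:

ssh -J username@fallas username@festival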

However, we are moving away from storing user data on local disks on the smaller compute systems in favour of making more use of networked storage, so the data currently on the legacy fallas, festival and fiesta servers will eventually move to a dedicated storage server that will be added to the Stats HPC cluster in due course. Once this has been done, no user data will be stored on these systems, although you will still be able to access fiesta and festival via fallas as described above.
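
In the meantime, if you would rather take your own copy of legacy data than wait for the migration, rsync over ssh works from fallas; the paths below are placeholders only, not real locations on these systems:

rsync -av festival:/path/to/your/data/ ~/data-copy/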

Instead of logging into fiesta or festival from fallas, you can also use the -I option to qsub to start an interactive Torque job, which after a few seconds' delay presents you with a shell on the next available compute node, as shown in this example:

andy@fallas:~$ qsub -I
qsub: waiting for job 16.fallas.statscluster.ma.ic.ac.uk to start
qsub: job 16.fallas.statscluster.ma.ic.ac.uk ready

andy@fiesta:~$

and typing 'exit' will end this session:

qsub: job 16.fallas.statscluster.ma.ic.ac.uk completed
andy@fallas:~$

However, you can't choose which node you connect to in this way; Torque will connect you to the least heavily loaded available node, which may not be much use if you want to access files stored on a particular node.
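
If the default interactive session is too modest for your needs, qsub's usual -l resource requests also apply to interactive jobs. The figures below are purely illustrative and subject to this cluster's Torque configuration:

qsub -I -l nodes=1:ppn=4 -l walltime=02:00:00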

Cluster availability

Since usage demand for this cluster is likely to be very low initially, only four of its nodes - fallas, festival, fiesta & stats01 - are currently powered on. This is mainly an energy-saving measure: there is no point in having a group of servers drawing power and generating heat (which in turn means even more power being used to remove that heat) if they are just sitting doing nothing. If demand increases, for example during the summer MSc project period, more nodes will be powered up to meet it.
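
To see which nodes are currently up before submitting work, Torque's pbsnodes command, run on fallas, gives a quick overview; pbsnodes -a lists every node and its state, while pbsnodes -l lists only those that are down or offline:

pbsnodes -a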

Any questions?

As always, I'll be happy to answer any questions you may have.



Andy Thomas

Research Computing Manager,
Department of Mathematics

last updated: 7.1.2022