July 14th: nvidia5 & boden GPU servers: nvidia5 remains powered off for the time being while a failure of the web interface to the remote management system is being resolved; boden has a failing hard disk for which a replacement has been ordered and is expected to arrive some time this week.
June 19th: forrest GPU server: back in service after a planned upgrade to the latest stable Linux release with the latest available CUDA, cuDNN and Python 3 software packages. However, since it was not being used, it was powered off on June 30th to reduce the heat load in the server room.
June 14th: we are now monitoring temperature trends in the Maths server room and manually adjusting both the number of compute/GPU servers available at any one time and their internal power controls. The aim is to offer as many services as possible while keeping temperatures within an overall 30 °C envelope. Currently (9:45am) the nvidia4, nvidia6 and boden GPU servers are available, but with the power consumption of each GPU card capped at 100 watts; forrest and nvidia5 are powered off, and other servers containing GPU cards are likewise operating with limited power consumption.
Yesterday (Friday 13th) was exceptional: a combination of hot weather and lots of users starting compute jobs in the afternoon meant we had to contend with temperatures in the low 30s °C right up until midnight. Please consider running jobs overnight or in the mornings, and let the server room have a break during the afternoon and early evening.
Maths server room cooling: one of the failed cooling systems in the Maths server room has now been repaired and is back in service, so we were able to restart some services in May. These included the nvidia4, nvidia5 and nvidia6 GPU servers, coates, the Fanosearch cluster, the Mathphys compute servers and the athena node in the Stats' Hadoop cluster. Unfortunately, these had to be switched off again at the end of May.
The other unit will have to be replaced because some spare parts for the 12-year-old air conditioning system are no longer available; an order for its replacement has yet to be placed and we are still awaiting a date for its installation.
Research Computing Manager, Department of Mathematics
last updated: 15.7.2025