2017: Midsummer's Day update
Hardware upgrades for the GPU clusters, calculus replaced
- nvidia3 upgraded
- nvidia3 was upgraded last week with an additional 2 TB of local disk storage, an operating system update to Ubuntu 16.04 LTS, the latest NVIDIA CUDA 8.0 development toolkit, cuDNN 5.1 and the TensorFlow libraries. It is available for use now.
- completion of memory upgrades for nvidia1 and nvidia2 postponed
- The memory available on the GPU cluster host servers nvidia1 and nvidia2 is no longer sufficient for the larger jobs currently being run on them, so both servers fall back on disk swap space, which is much slower.
- A rolling upgrade of both servers has started, with nvidia1 being upgraded first from 24 GB to 48 GB of memory. The two original 470 watt power supplies in the blade enclosure have been replaced with 1100 watt units to meet the increased power demand from the memory upgrades and from the additional storage disks that were added recently.
- Unfortunately, owing to high demand for GPU compute facilities, further hardware upgrades have now been postponed until July.
- calculus replaced
- The calculus "instant extra filespace" server, which was suffering intermittent disk controller problems, has been replaced by a new HP MicroServer Gen8 machine with twice as much installed memory. The new machine uses the disk set from the old one.
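If you want to check whether a Linux host such as nvidia1 or nvidia2 is dipping into swap while your job runs, something along the following lines may help. This is only a minimal sketch: it assumes the standard Linux /proc/meminfo file with its usual SwapTotal and SwapFree fields, and the function names are made up for illustration rather than being part of any tool installed on the cluster.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_used_kb(info):
    """Swap in use, in kB: total swap minus free swap."""
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


def read_swap_used_kb(path="/proc/meminfo"):
    """Read the given meminfo file (Linux only) and return swap used in kB."""
    with open(path) as f:
        return swap_used_kb(parse_meminfo(f.read()))
```

Calling read_swap_used_kb() on the host itself returns the amount of swap in use in kB. A figure that keeps climbing while a job runs suggests the job no longer fits in physical memory.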
- Updated: June 21st
We can't really have the good old-fashioned annual Factory Fortnight that used to be the norm in 20th century industrial Britain, when all the workers and management took their holidays at the same time in August, leaving factories clear for maintenance staff, fitters, machine tool engineers, painters and others to move in and carry out an annual re-fit. Even so, we are now seriously looking at August as the best time to enjoy(!) a partial shutdown. This is necessary so that we can close down the bulk of the compute cluster currently accommodated in the ICT data centre and move it to the Maths server room, along with the few remaining private servers still in ICT, ahead of the data centre's eventual closure next year.
- You'll find more information on August's partial shutdown here.
- Original posting: 23rd March
- Finally, we have started planning ahead for the eventual shutdown of the ICT data centre in the City & Guilds building over the next 18-24 months and its migration to a new facility in Slough. We have over 30 servers hosted in ICT. Of these, 24 are Maths compute cluster nodes, which will probably move to Huxley 616 later this year along with a non-rackmount storage server. The rest will move to Slough, with hardware upgrades where necessary to support full remote management including "bare metal" installs (for example, the ability to install an operating system remotely from media in South Kensington without going anywhere near Slough).
- Some private rackmount systems currently in Huxley 616 may have to move to Slough, although this is not certain. If it's not possible to install an operating system remotely on a server, and the owner is unwilling either to pay for upgrades to allow this or to replace the server with one that has full remote media support, then that server will have to stay in the Maths server room. The main issue with accommodating more systems in the Maths server room is that no more college network connections can be installed on Huxley level 6, since the racks in the network wiring cupboard opposite the south lifts are now full. So an influx of another 25 systems into Huxley 616 will potentially be a problem.
- For this reason, we will have to move nearly all Maths compute cluster nodes onto a private dedicated network. This means that direct access to a particular node from the college network will no longer be possible, although the node will still be accessible from a designated head node. This is the norm on HPC clusters and should not inconvenience users much. It also has the real advantage of faster cluster performance, since cluster network traffic will no longer be mixed with non-compute traffic on the general college network.
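To give an idea of what reaching a node through a head node might look like in practice, here is a minimal sketch that builds an OpenSSH command line which hops through the head node. It assumes that access will be by SSH and uses OpenSSH's ProxyCommand option with "ssh -W". The host names in the usage note are placeholders, since the head node has not yet been designated, and the helper function is purely illustrative.

```python
def ssh_via_head_node(head_node, compute_node, user=None):
    """Build an ssh argument list that reaches compute_node through head_node.

    Uses OpenSSH's ProxyCommand option with "ssh -W %h:%p", which asks the
    head node to forward the connection on to the target host and port.
    """
    prefix = user + "@" if user else ""
    proxy = "ssh -W %h:%p " + prefix + head_node
    return ["ssh", "-o", "ProxyCommand=" + proxy, prefix + compute_node]
```

For example, ssh_via_head_node("headnode.example", "node01") returns an argument list that could be passed to subprocess.run() to open a session on the hypothetical node01 by way of the equally hypothetical headnode.example.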
Older news items:
- June 16th, 2017: mid-June update
- June 2nd, 2017: Early summer update
- April 20th, 2017: Spring update 2
- March 22nd, 2017: Early spring update
- March 10th, 2017: Winter update 2
- February 22nd, 2017: Winter update
- November 2nd, 2016: Autumn update
- October 21st, 2016: Late summer update 2
- October 14th, 2016: Late summer update
- February 19th, 2016: Winter update
- December 11th, 2015: Autumn update
- September 14th, 2015: Late summer update 2
- May 2nd, 2015: Spring update 2
- April 26th, 2015: Spring update
- November 11th, 2014: Autumn update
- September 17th, 2014: Summer update 2
- July 17th, 2014: Summer update
- March 15th, 2014: Spring update
- November 2nd, 2013: Summer update
- May 24th, 2013: Spring update
- January 23rd, 2013: Happy New Year!
- November 22nd, 2012: No news is good news...
- November 17th, 2011: A revamp for the Maths SSH gateways
- September 7th, 2011: Failed systems under repair
- August 14th, 2011: Introducing calculus, a new NFS home directory server for research users
- July 19th, 2011: a new staging server for the compute cluster
- July 19th, 2011: A new Matlab queue and improved queue documentation
- June 30th, 2011: Updated laptop backup scripts
- June 18th, 2011: More storage for the silo...
- June 16th, 2011: Yet more storage for the SCAN...
- June 10th, 2011: 3 new nodes added to the Maths compute cluster
- May 21st, 2011: Announcing SCAN large storage and subversion (SVN) servers
- May 26th, 2011: Reporting missing scratch disk on macomp01
- May 21st, 2011: Announcing completion of silo upgrades
- May 16th, 2011: Announcing upgrades for silo
- April 14th, 2011: Goodbye SCAN 3, hello SCAN 4
- March 26th, 2011: quickstart guide to using the Torque/Maui cluster job queueing system
- March 9th, 2011: automatic laptop backup/sync service, new collaboration systems launched
- May 20th, 2010: Scratch disks are now available on all macomp and mablad compute cluster systems
- March 11th, 2010: Introducing job queueing on the Fünf Gruppe compute cluster
- October 16th, 2008: Introducing the Fünf Gruppe compute cluster
- June 18th, 2008: German City compute farm now expanded to 22 machines
- February 7th, 2008: new applications on the Linux apps server, unclutter your desktop
- November 13th, 2007: aragon and cathedral now general access computers, networked Linux Matlab installation upgraded to R2007a
- September 14th, 2007: Problems with sending outgoing mail for UNIX & Linux users
- July 23rd, 2007: SCAN available full-time over the summer vacation, closure of Imperial's Usenet news server
- May 15th, 2007: Temporary SCAN suspension, closure of the Maths Physics computer room, new research computing facilities
- January 14th, 2005: Exchange mail server upgrade, spam filtering with pine and various other enhancements
Research Computing Manager,
Department of Mathematics
last updated: 21.6.2017