Current news...

2016: Late summer update

Quick summary: more power supplies and much better cooling in the server room, resumption of full services

Unfortunately there's nothing very exciting to report this time - summer has flown by, leaving behind a feeling that nothing substantial has been accomplished in advancing the research IT facilities. Cooling problems in the Maths server room dominated much of spring and summer this year, with a lot of time spent continually juggling servers, services and power consumption to keep equipment temperatures at a safe level (we gave up trying to keep the room itself at sensible temperatures for mere humans). At times, this meant curtailing - or even completely halting - availability of some systems and services. However, with half of the compute cluster accommodated in the ICT data centre, we were able to minimise the impact of these shutdowns on the cluster, which has remained at 50% utilisation or better throughout.

In September, at long last, more power supplies were installed for future use, along with two additional cooling systems. The latter are highly effective, with ducting designed to suck in hot air directly from behind the racks, chill it and dump a large mass of cold air into the centre of the room, from where the servers draw it in. We can now devote a lot more time to IT instead of watching thermometers and responding to overheat alerts!

Currently in October, work is under way on the following initiatives:

  • adding 2 TB of fast local storage to each GPU cluster

  • moving the job accounting/tracking databases used by the compute cluster website to a separate and much faster database server, which will speed up the website and possibly allow us to reintroduce some features we once had back in the early days, when there were only a few thousand jobs in the system

  • building more resilience into the server room infrastructure, by upgrading and adding mirrored/failover systems and better facilities for remote management

  • migrating the Stats Hadoop cluster internal network from 1 gigabit to 10 gigabit optical fibre

  • upgrading the cluster to cluster-backup link to 10 gigabit to speed up server mirroring

Older news items:

February 19th, 2016: Winter update
December 11th, 2015: Autumn update
September 14th, 2015: Late summer update 2
May 2nd, 2015: Spring update 2
April 26th, 2015: Spring update
November 11th, 2014: Autumn update
September 17th, 2014: Summer update 2
July 17th, 2014: Summer update
March 15th, 2014: Spring update
November 2nd, 2013: Summer update
May 24th, 2013: Spring update
January 23rd, 2013: Happy New Year!
November 22nd, 2012: No news is good news...
November 17th, 2011: A revamp for the Maths SSH gateways
September 7th, 2011: Failed systems under repair
August 14th, 2011: Introducing calculus, a new NFS home directory server for research users
July 19th, 2011: a new staging server for the compute cluster
July 19th, 2011: A new Matlab queue and improved queue documentation
June 30th, 2011: Updated laptop backup scripts
June 18th, 2011: More storage for the silo...
June 16th, 2011: Yet more storage for the SCAN...
June 10th, 2011: 3 new nodes added to the Maths compute cluster
May 21st, 2011: Announcing SCAN large storage and subversion (SVN) servers
May 26th, 2011: Reporting missing scratch disk on macomp01
May 21st, 2011: Announcing completion of silo upgrades
May 16th, 2011: Announcing upgrades for silo
April 14th, 2011: Goodbye SCAN 3, hello SCAN 4
March 26th, 2011: quickstart guide to using the Torque/Maui cluster job queueing system
March 9th, 2011: automatic laptop backup/sync service, new collaboration systems launched
May 20th, 2010: Scratch disks are now available on all macomp and mablad compute cluster systems
March 11th, 2010: Introducing job queueing on the Fünf Gruppe compute cluster
October 16th, 2008: Introducing the Fünf Gruppe compute cluster
June 18th, 2008: German City compute farm now expanded to 22 machines
February 7th, 2008: new applications on the Linux apps server, unclutter your desktop
November 13th, 2007: aragon and cathedral now general access computers, networked Linux Matlab installation upgraded to R2007a
September 14th, 2007: Problems with sending outgoing mail for UNIX & Linux users
July 23rd, 2007: SCAN available full-time over the summer vacation, closure of Imperial's Usenet news server
May 15th, 2007: Temporary SCAN suspension, closure of the Maths Physics computer room, new research computing facilities
January 14th, 2005: Exchange mail server upgrade, spam filtering with pine and various other enhancements

Andy Thomas

Research Computing Manager,
Department of Mathematics

last updated: 20.10.2016