August 2017 – Scientific Compute and Infrastructure @ Sanger

15th August 2017

Additional 3.5PB S3/Ceph storage available for use

We are happy to announce that the final installment of the requested additional 3.5PB of usable Ceph / S3 storage is fully installed and now available for use by our Human genetics groups.

The turnaround time from delivery to handover has been less than one month for all storage sets and, despite some last-mile hardware issues, the service is running well.

15th August 2017

More Ceph capacity for the Flexible Compute Environment

Following the success stories from our @scale customers, we have been asked to provide an additional 1PB of usable capacity for our internal flexible compute environment.

The order has been placed and we expect BIOS-IT to have the hardware on site and acceptance tested by the end of September 2017.

We continue to be impressed by the resilience of both the Ceph and S3 services that the current platform has provided since January 2017, and we look forward to seeing the performance of the infrastructure continue to scale as additional units are added.

15th August 2017

New hardware incoming: Lustre 112/113 upgrade time

After many years of robust service, our Lustre scratch112 and 113 systems are approaching end of life. We are delighted to say that following significant testing and a procurement bakeoff, we have selected Seagate-Cray as the vendor of choice for our replacement system.

The system will be based around Seagate's Nitro SSD cache configuration to get the best fit for our mixed workloads. We look forward to receiving the new hardware by the end of September 2017.

15th August 2017

New RedHat test hosts available

Two new RedHat hosts have been installed in farm3 and are available for testing through the retest queue.

We are looking for feedback on this updated operating system, as we are proposing to move all of our clusters to RedHat later this financial year.

Please check with your dev teams now to ensure that your software is ready to go and that your software stacks continue to run as intended.

15th August 2017

New Teramem systems now live!

Two new teramem systems are now available through the teramem queue on farm3. These new hosts are quad-socket, 20-core units (80 cores in all) and provide 3TB of memory on each host. This is a significant boost to our existing hugemem environment, where we continue to provide 256GB, 512GB and 1.5TB systems.

In addition to the high core count and memory that these new hosts provide, they also have approximately 2TB of NVMe storage mounted under /local/scratch01. This is a very fast, high-IOPS local storage area that is ideal for creating graph indexes or for general small-file transactions.
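
As a rough illustration of how a job might make use of the local NVMe area, the Python sketch below creates a private working directory under /local/scratch01 and removes it when the work is done. The helper name scratch_workdir and the fallback to the system temporary directory are assumptions made for this example and are not part of the service; check local policy on scratch clean-up before relying on it.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager

# Mount point taken from the announcement above.
LOCAL_SCRATCH = "/local/scratch01"


@contextmanager
def scratch_workdir(base=LOCAL_SCRATCH, prefix="job-"):
    """Yield a private working directory under `base` and remove it on exit.

    Assumption for this sketch: if `base` is missing or not writable
    (for example when running on a host without the NVMe area), fall
    back to the system temporary directory instead of failing.
    """
    if not (os.path.isdir(base) and os.access(base, os.W_OK)):
        base = tempfile.gettempdir()
    path = tempfile.mkdtemp(prefix=prefix, dir=base)
    try:
        yield path
    finally:
        # Clean up so the shared scratch area does not fill with stale files.
        shutil.rmtree(path, ignore_errors=True)


if __name__ == "__main__":
    with scratch_workdir() as workdir:
        # Write many small files here, e.g. intermediate index shards.
        with open(os.path.join(workdir, "shard-0000.idx"), "w") as handle:
            handle.write("example\n")
```

Using a per-job directory keeps concurrent jobs on the same host from overwriting each other's intermediate files.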

To access these hosts, jobs need to be submitted to the teramem queue (-q teramem), and only jobs requesting more than 750GB of memory will currently be accepted into this queue. The maximum job length is currently set to 15 days, and the new kernel required to support the systems does not support blcr checkpointing at this time, so please be aware that this restriction exists.
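
The queue limits above can be checked before submission. The Python sketch below only builds a command line and does not submit anything. It assumes the scheduler is LSF and that jobs are submitted with bsub, using -M and -R for the memory request and -W for the run limit; it also assumes memory limits are expressed in MB, which depends on site configuration. The function name teramem_bsub_command and the example script name are hypothetical.

```python
import shlex

# Values taken from the announcement above.
TERAMEM_QUEUE = "teramem"
MIN_MEM_GB = 750        # only jobs larger than this are accepted
MAX_RUNTIME_DAYS = 15   # current maximum job length


def teramem_bsub_command(command, mem_gb, runtime_hours):
    """Return a bsub argument list for the teramem queue.

    Raises ValueError if the request falls outside the announced limits.
    Assumptions: LSF bsub is the submission tool, and memory values are
    given in MB (this is site-configurable in LSF).
    """
    if mem_gb <= MIN_MEM_GB:
        raise ValueError(f"teramem jobs must request more than {MIN_MEM_GB}GB")
    if runtime_hours > MAX_RUNTIME_DAYS * 24:
        raise ValueError(f"maximum job length is {MAX_RUNTIME_DAYS} days")

    mem_mb = int(mem_gb * 1000)  # assumed unit conversion for this sketch
    return [
        "bsub",
        "-q", TERAMEM_QUEUE,
        "-M", str(mem_mb),
        "-R", f"select[mem>{mem_mb}] rusage[mem={mem_mb}]",
        "-W", f"{int(runtime_hours)}:00",  # run limit as hours:minutes
        command,
    ]


if __name__ == "__main__":
    # Hypothetical job: 1TB of memory for two days.
    args = teramem_bsub_command("./build_graph_index.sh", mem_gb=1000, runtime_hours=48)
    print(shlex.join(args))
```

Because checkpointing is not available on these hosts, it may be worth structuring long jobs so that they can resume from their own intermediate output.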

As always, if you have any questions or comments, please let us know in the usual fashion.