Day to day activities


Over the first two days of November 2018, a substantial number of staff at the University of Reading became locked out of their user accounts, meaning that they had no access to centrally provided University resources such as their desktop, Eduroam (Wi-Fi) and email. Over the two days, IT saw approximately 500 individual incidents of this.


The cause of this was an attack on University accounts through a legacy service providing email access to a small number of accounts. This service, known as IMAP (Internet Message Access Protocol), is used by some older email clients to gain access to email stored on central email servers. Most clients at the University do not use this method but it was once very common, and some systems still use it to get access. 

A botnet is a number of Internet-connected devices, each of which is running one or more bots. Botnets can be used to perform distributed denial-of-service (DDoS) attacks, steal data and send spam, and they give the attacker access to each device and its connection.

A botnet was attempting to connect to this system using a real username (e.g. ab123456) and then randomly guessing a password. As a defence against this, central authentication services lock the account to slow down the attack. This is a standard approach to reducing these attacks and is considered best practice. The account is locked for a period of time and then unlocks to allow the end-user to regain access. While the account is locked, the user cannot access services.
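As a rough illustration of the lockout behaviour described above, the following Python sketch locks an account after a run of failed logins and releases it after a fixed period. The threshold, the lock duration and the `LockoutPolicy` name are assumptions made for illustration; the University's actual authentication settings are not stated here.

```python
from dataclasses import dataclass, field


@dataclass
class LockoutPolicy:
    """Hypothetical temporary account lockout, not the University's real policy."""

    max_failures: int = 5            # assumed threshold
    lockout_seconds: float = 900.0   # assumed lock duration
    _failures: dict = field(default_factory=dict)
    _locked_until: dict = field(default_factory=dict)

    def is_locked(self, user: str, now: float) -> bool:
        # An account is locked until its recorded release time has passed.
        return now < self._locked_until.get(user, 0.0)

    def record_attempt(self, user: str, success: bool, now: float) -> bool:
        """Record a login attempt; return True only if the login is accepted."""
        if self.is_locked(user, now):
            return False  # refused while locked, even with the right password
        if success:
            self._failures.pop(user, None)  # a good login resets the counter
            return True
        count = self._failures.get(user, 0) + 1
        if count >= self.max_failures:
            # Too many failures: lock the account for a fixed period.
            self._locked_until[user] = now + self.lockout_seconds
            self._failures.pop(user, None)
        else:
            self._failures[user] = count
        return False
```

The sketch also shows why the attack affected legitimate users: once the guessed passwords trip the threshold, the account owner is refused too, even with the correct password, until the lock expires.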

Once identified, IT blocked access from the computer being used to launch the attack at the University network perimeter (the firewall). Unfortunately, because a botnet can be made up of a large number of computers, the attack switched to a different source machine and started up again. Infrastructure Services were effectively playing whack-a-mole to stop the problem. In total we blocked 25,599 different addresses during the course of the attack.


Due to the small number of end-users using the legacy IMAP service, and the larger number of users affected by the lockout problem, IT has removed external access to the IMAP service. This removes the ability of the attackers to reach the service and lock the accounts. University users of the external legacy IMAP service should use the email web portal to access their emails, calendar, etc. and contact the IT Service Desk for further advice.

We will continue to monitor the situation as always for additional problems. 

This morning we identified an issue with the Card Finance system that is preventing users from logging on and is also affecting users trying to print.

This is being investigated by the supplier and we shall update with any further information as soon as possible.

Microsoft have reported an issue with Microsoft Teams that is preventing some users from accessing the service.  Please see the following service status notice from Microsoft.


Service degradation

User impact:

Users may be unable to access Microsoft Teams.

Latest message:

Title: Microsoft Teams access issue

User Impact: Users may be unable to access Microsoft Teams.

More info: Users may see the error message ‘D’oh! Something went wrong… Try again’. This is affecting both the Microsoft Teams web and desktop clients.

Current status: We’re reviewing service logs to isolate the source of the issue and determine the next troubleshooting steps.

Scope of impact: Impact is specific to a subset of users who are served through the affected infrastructure.

Next update by: Monday, October 29, 2018, at 11:00 AM UTC



The research data storage service is now accessible. We are now running on one node and the service is considered to be at risk while we and the suppliers work to determine the root cause.

We will be getting back in touch with people who have already raised tickets. Please contact us if you are still experiencing problems with the research storage service.

Once again I would like to apologise for the inconvenience this service interruption has caused.

Please note that following on from the issues we experienced yesterday afternoon, the NFS and SMB shares on stor-nex-pool1 are not working. This is affecting the Gold and Basic Research storage.

Our Academic Computing team are currently investigating this as a priority and we will update you with any information as this becomes available.

Last night we upgraded our storage to fix a known bug with moving NFS services between storage nodes.

The NFS service was not running on the storage which manages the exports for mail and web server configuration.

We recovered service by restarting the NFS service and are currently working with vendors to identify the root cause.

On 5th October 2018 we experienced a loss of network connectivity to the RUSU building, which also affected the connection providing Wi-Fi connectivity to the Open Day dome. Working on the Friday night and on Saturday morning, IT and the Estates team were able to restore connections and a full service.

At approximately 6:30pm on 5th October, there was an issue that caused an interruption to service between network cabinets in Black Horse House – we know this is a fault with the cabling within the building (“structured cabling”), but the root cause is still being identified.  

This caused an outage at RUSU that affected all network services, including Wi-Fi and the tills. Taking place on the evening of the Fresher’s Ball, this caused a substantial impact, and RUSU was unable to accept card payments. The IT networks team responded but were unable to implement a fix. They left at 11:30pm.

On Saturday (University Open Day) it was identified that this would also have an impact on the Open Day Dome. Some connectivity via leakage from buildings was possible and IT were able to improve this. Working closely with Estates, a new connection direct from the IT datacentre to RUSU was brought into service. This new connection had only been completed on the Friday and was due to be made live shortly. This restored service to RUSU at about 1.30 pm and boosted the signal to the dome. 

Thanks to members of IT and Estates for giving up their Friday nights and Saturdays to fix this.



On Thursday 4th October at 19.00, a fix was applied to the ACT system to resolve the underlying problem with the insights database.

Certain commands run against the storage were causing high load to the insights database and consuming memory.

Prior to the fix we were consuming on average 9-10 GB of memory. This was hitting the memory limit for the service.

Post fix, we are now consuming 40 MB of the 10 GB memory limit.

This has been achieved by creating a cache of the database with static data, rather than accessing the dynamically changing database.

We receive around 50 requests a second and these were each taking 1.5 seconds to receive a response. These requests are now being completed in milliseconds, as expected.
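To put these figures in context, the number of requests in flight at any moment is roughly the arrival rate multiplied by the response time (Little's law). The short Python sketch below applies that to the figures above. The 10 ms post-fix response time is an assumed example value, since the text only says "milliseconds", and the function name is hypothetical.

```python
def requests_in_flight(arrival_rate_per_s: float, response_time_s: float) -> float:
    """Little's law: average concurrency = arrival rate x time in system."""
    return arrival_rate_per_s * response_time_s


# Before the fix: 50 requests/s, each taking 1.5 s.
before = requests_in_flight(50, 1.5)
print(before)  # 75.0

# After the fix: same arrival rate, assumed 10 ms response time.
after = requests_in_flight(50, 0.010)
print(after)
```

On these numbers, roughly 75 requests were outstanding against the database at any time before the fix, compared with fewer than one afterwards under the assumed response time.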

We will continue to monitor with engineers from our supplier.

The service used for Research Data Storage consists of two key elements: the underlying storage system itself, ‘ADFS’ (i.e. your data), and an insights database which contains the metadata associated with this data.

The insights database polls the underlying storage system at regular intervals to identify changes to data since the last poll. This includes new, updated and removed files. The database then records where on the underlying storage system the data is held and the usage against any quotas that are in place.
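The polling step described above can be pictured as comparing two snapshots of the file system. The Python sketch below is only an illustration of that idea: the supplier's actual implementation is not described here, and the snapshot format (`{path: (size_bytes, mtime)}`) and the function names are hypothetical.

```python
def diff_snapshots(previous: dict, current: dict):
    """Compare two {path: (size_bytes, mtime)} snapshots taken at successive polls.

    Returns three sorted lists: new, updated and removed paths.
    """
    new = sorted(p for p in current if p not in previous)
    removed = sorted(p for p in previous if p not in current)
    updated = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    return new, updated, removed


def total_usage(snapshot: dict) -> int:
    """Sum of file sizes in bytes, i.e. the figure a quota would be checked against."""
    return sum(size for size, _mtime in snapshot.values())


# Example: one file changed, one deleted and one created between polls.
previous = {"/share/a.dat": (100, 1.0), "/share/b.dat": (200, 1.0)}
current = {"/share/a.dat": (150, 2.0), "/share/c.dat": (50, 2.0)}
new, updated, removed = diff_snapshots(previous, current)
```

The amount of work per poll grows with the number of changes, which is consistent with a very large deletion producing an unusually heavy load on the database.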

Prior to the issue yesterday we saw a large number of files (around 4TB) deleted from the underlying file system. This caused high load on the insights database, which then ran out of memory and crashed. The database recovered immediately but continued to underperform because of the volume of changes it had to process.

We are working with the supplier of the storage system to identify why this change caused the database to run out of memory. We have increased the amount of memory the insights database can consume to mitigate this issue until a permanent fix is in place. Our supplier is working on this issue as a matter of priority.

We are actively monitoring the system along with our supplier who are monitoring remotely.



We have now resolved the issue affecting some users being unable to access their N drive or web pages.

A fix was implemented on the file storage system that was causing this problem, and we will continue to monitor these services.

We have reports of some users being unable to access the N:/ Drive and we are working on implementing a fix.

This is due to an issue with the file storage system which is also causing some issues with web pages being unavailable.

We are working on this as a priority and will update as soon as we have any further information.



The following services have now been restored and should be accessible from 2pm this afternoon. Please consider the services to be at risk as we continue to monitor them.

  • NX Linux Desktops
  • Met-Cluster
  • Free Cluster
  • Select VMs on the Research Cloud
  • Met webserver
  • Select Research Data Storage Silver Shares
  • Computer Science Linux Desktops
  • SMPCS X-Drive
  • Unix home directories

We will send a further, more detailed update tomorrow morning at 10am.

Once again we would like to apologise for the inconvenience this has caused.


Engineers in America are continuing to work on the issue and have diagnosed a likely cause, which they are working to resolve. We expect this remedial work to be complete by 2:30pm, but we will monitor the situation and restore the service sooner if we can. Once the service has been restored it should be considered to be at risk until we have a full diagnosis of the cause.

We will provide a further update by 2:30pm or sooner.

There is a system issue affecting the storage systems. Supplier support is currently diagnosing the cause and as a precautionary measure we are preventing further access to these systems.

Affected Services include:

  • NX Linux Desktops
  • Met-Cluster
  • Free Cluster
  • Select VMs on the Research Cloud
  • Met webserver
  • Select Research Data Storage Silver Shares
  • Computer Science Linux Desktops
  • SMPCS X-Drive
  • Unix home directories

We are continuing to work with the supplier and will provide a further update at 12:30.

We would like to apologise for the inconvenience this incident has caused. We are working to resolve this as quickly as possible.


N Drives were inaccessible this morning due to an IT system issue that has now been resolved.
N Drives and all other services are now back up and working fully. The issue occurred over the weekend and was resolved by 9:30 this morning.
IT is working with the supplier of the affected system to make sure services remain fully operational.
If you need further help or assistance please contact IT.
