Emergency maintenance to fix security vulnerability (CVE-2016-5195)

From ScientificComputing
Revision as of 11:42, 25 October 2016 by Byrdeo (talk | contribs)

A recently published vulnerability in the Linux kernel (CVE-2016-5195) allows any user to get full control of the operating system. This is a critical security issue, which leaves us with no choice but to take BOTH Brutus and Euler OFF-LINE until the issue has been fixed.

Since we cannot exclude the possibility that someone has already exploited this vulnerability, all login nodes and compute nodes will need to be wiped clean and their OS reinstalled from scratch before they can be put back into production.

The reinstallation of the login and compute nodes will affect only system files stored in these nodes' local file system (/bin, /etc, /sbin, /scratch, /tmp, /usr, etc.). User data (/cluster/home, /cluster/scratch, /cluster/work, /cluster/project) do not pose any security risk and will therefore not be touched in any way.
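For users who want to check where their files stand, the distinction above can be sketched in a few lines of Python. This is only an illustration: the function name survives_reinstall and the constant PRESERVED_ROOTS are hypothetical, the check is purely textual (it does not follow symbolic links), and it assumes absolute paths. Anything outside the four shared locations listed above is treated as local to a node and therefore lost on reinstallation.

```python
import posixpath
from pathlib import PurePosixPath

# Shared storage locations named in the announcement as untouched.
# (Constant name is an assumption made for this sketch.)
PRESERVED_ROOTS = tuple(
    PurePosixPath(p)
    for p in (
        "/cluster/home",
        "/cluster/scratch",
        "/cluster/work",
        "/cluster/project",
    )
)


def survives_reinstall(path: str) -> bool:
    """Return True if an absolute path lies under a shared user-data root.

    The path is normalised textually first, so "a/../b" is collapsed,
    but symbolic links are NOT resolved.
    """
    p = PurePosixPath(posixpath.normpath(path))
    # A path is preserved if it is one of the roots or sits below one.
    return any(p == root or root in p.parents for root in PRESERVED_ROOTS)


if __name__ == "__main__":
    for candidate in ("/cluster/home/jdoe/results.dat", "/scratch/job42/tmp.out"):
        status = "kept" if survives_reinstall(candidate) else "wiped"
        print(f"{candidate}: {status}")
```

In this sketch a file under the node-local /scratch or /tmp is reported as wiped, while a file under /cluster/scratch is reported as kept.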

At the time of writing, neither Red Hat nor CentOS has released a patch for the operating system that we are using on Brutus and Euler. No one knows how long this will take. Please refrain from submitting tickets or sending emails asking when Brutus and Euler will be back on-line. We will publish regular status updates on this page and notify all cluster users by email when Brutus and Euler are on-line again.

Thank you for your understanding.

Updates

2016-10-25 13:30

Red Hat released a patch for RHEL 7 yesterday evening. It may take some time before they release one for RHEL 6, and longer still before CentOS ports it to the version we are using on our clusters (CentOS 6.8).

Our local kernel expert has therefore decided to write her own patch for CentOS 6.8, based on the information publicly available about the kernel's vulnerability. The cluster support team is testing it right now. As far as we can tell, it fixes the vulnerability, but we still have to make sure that the new kernel does not have any undesirable side effects. If these tests are successful, we will deploy it to the login nodes of Euler, and then progressively reinstall all compute nodes. That should allow us to (partly) reopen Euler while we wait for the official patch for CentOS 6.8.