Review of today's server and service downtime

At around 04:34 GMT+2, most of our services went down. The outage lasted until 08:30 GMT+2, when we initiated a hard reboot of the server because it was no longer responding to any action. The downtime went unnoticed for a while because the people responsible were asleep.

The same thing happened again at around 16:34 GMT+2. This time it was noticed much more quickly and fully remedied at 16:50 GMT+2.

The downtime was ultimately caused by a kernel bug that froze the whole system. As an immediate measure, the kernel in use has been reverted to the most current LTS version.


Hah! And there I was, just complaining about occasional freezes. It might be a kernel issue after all, if even the servers freeze like that!
