Services outage

Services were offline for almost eight hours this morning, from 2:02 am to 9:46 am EDT. Please accept my personal apologies for the length of the outage.

What happened?
The dedicated server hosting network services experienced a kernel panic.

Why was the outage so long?
Three things:

  1. The dedicated server's provider does not automatically restart servers when they go offline.
  2. The server monitor worked correctly, but I did not receive the alert in a timely fashion.
  3. The server froze when it was rebooted.

How are we going to fix it?
I am working with the dedicated server provider to see if they can automatically restart the server if it goes offline again.

I am also working to refine the notification process to ensure that I receive and acknowledge any alerts in a timely fashion.
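
One way to structure such a process is an escalation chain: if an alert is not acknowledged within a set time, it is repeated on a more intrusive channel. The Python sketch below is illustrative only. The channel names, wait times, and the `channels_to_notify` helper are assumptions made for the example, not a description of the monitoring setup actually in use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Channel:
    name: str          # hypothetical channel label, e.g. "email"
    wait_minutes: int  # time to wait for an acknowledgement before escalating


# Hypothetical policy: start with email, escalate to SMS, then a phone call.
POLICY = [
    Channel("email", 5),
    Channel("sms", 10),
    Channel("phone", 0),
]


def channels_to_notify(policy: List[Channel],
                       minutes_since_alert: int,
                       acknowledged: bool) -> List[str]:
    """Return the channels that should have been used by now."""
    if acknowledged:
        # Once someone has acknowledged the alert, stop escalating.
        return []
    due = []
    start = 0  # minute at which the current channel becomes due
    for channel in policy:
        if minutes_since_alert >= start:
            due.append(channel.name)
        start += channel.wait_minutes
    return due


if __name__ == "__main__":
    for minute in (0, 5, 15):
        print(minute, channels_to_notify(POLICY, minute, acknowledged=False))
```

With a chain like this, an alert that goes unnoticed on the first channel is repeated on the next one after a few minutes rather than waiting until someone happens to check.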

Finally, the dedicated server provider will run hardware diagnostics on the server starting at 11 am EDT on Thursday, 08 July. During this time, network services will be moved to another server, so the outage will be kept to a minimum. There will be brief outages while services are moved to the other server and again when they are moved back after the diagnostics are complete.