5 tips to cope with web server hardware replacements

Martin Brinkmann
Mar 6, 2013
Updated • Mar 6, 2013

If you tried to access the Ghacks website this morning, you may have noticed that the site was not accessible at all. Depending on when you tried to connect, you should have received either an Apache page or a "server not found" page. The reason was a failing hard drive that we had to replace on the server. The problem was that it was the root drive, on which all the sites are stored, which meant a lot of migrating and, unfortunately, downtime.

It took more than five hours to exchange the hard drive and import the contents of the old drive to the new one. The majority of things went well during that time, but some could have gone better. I'd like to share five tips that help you cope with such a situation.

1. Make your own backups

Ghacks is hosted on a dedicated server that I rent from Wiredtree. Remote backups are created once per day, and the server itself creates regular backups as well, so there should be no reason to create your own backups, right?

I personally prefer to be safe in this regard and regularly download backups to my own computer, just to have a recent version of the site available locally. That's not only great for development purposes, but also reassures me that I can restore the site even if things with Wiredtree go south. I do not really expect this to happen, but I have been burned by hosting companies in the past.

The restoration today worked well for the most part, but one smaller site that is hosted on the same server would not load properly. It displayed a WordPress installation screen instead, and a quick check revealed that the migration had failed to import the site's MySQL database. I quickly imported the most recent SQL file I had for the site and, lo and behold, the site started to work afterwards.
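A local backup only helps if it is recent and not empty. As a rough sketch of how one could check that, the Python snippet below looks for the newest SQL dump in a local folder and flags it when it is too old. The folder layout, the `*.sql` pattern, and the 24 hour threshold are assumptions made for illustration, not a description of how backups are handled for this site.

```python
from __future__ import annotations

import time
from pathlib import Path


def newest_backup(backup_dir: Path, pattern: str = "*.sql") -> Path | None:
    """Return the most recently modified file matching pattern, or None."""
    candidates = [p for p in backup_dir.glob(pattern) if p.is_file()]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


def is_stale(backup: Path, max_age_hours: float = 24.0,
             now: float | None = None) -> bool:
    """Return True if the backup is empty or older than max_age_hours."""
    now = time.time() if now is None else now
    info = backup.stat()
    age_hours = (now - info.st_mtime) / 3600
    return info.st_size == 0 or age_hours > max_age_hours


if __name__ == "__main__":
    # Hypothetical location of the downloaded dumps.
    latest = newest_backup(Path.home() / "site-backups")
    if latest is None:
        print("No local SQL backup found.")
    elif is_stale(latest):
        print(f"Newest backup {latest.name} is stale or empty.")
    else:
        print(f"Newest backup {latest.name} looks recent.")
```

A check like this says nothing about whether the dump can actually be imported; restoring it into a test database from time to time is the only real proof.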

2. Announce the downtime

I only announced the replacement on Google Plus, which was an oversight on my part, as I received several emails and notifications about the site's status during the downtime. While I appreciate every message, since it can very well happen that I'm unaware of something going on, these could have been avoided this time by better communication.

I first thought about writing about the hardware replacement on the blog, but decided against it as no one would be able to read it during the downtime anyway. It is unlikely that notifications will reach all readers of a site or blog, but part of your task during a downtime is to communicate with your readers and keep them informed about what is happening.

3. Do not become impatient

I have a tendency toward impatience when my sites go down. I contact support and can't wait to get a response, and if it takes too long, I sometimes write another email. Not knowing what is going on is a problem for me, especially since my livelihood is directly linked to the availability of the server and the sites that are hosted on it.

Impatience, on the other hand, can become a problem, as it may keep support staff from doing their job, which is fixing the server, because they now also need to reply to your emails.

I also received word today that a second email that you write resets the queue position of your support request so that you may in fact wait longer for a reply than necessary.

4. Do not work on the server until everything is done

You should not modify anything on the server until support has completed the task at hand. This can be an issue if you are as impatient as I am. One of the things that I noticed right away during the import was that the IP addresses for each account were not set correctly. This led to sites not loading properly even though they had already been imported correctly and should have worked.

While it would have been easy to assign the correct IP addresses to the accounts to get them to work properly, I decided not to do so, as support had that on their to-do list as well.

Many things can go terribly wrong if you tamper with data while someone else is working on the server. It is always a good idea to wait until everything is done before you make modifications to the server yourself.

5. Test everything thoroughly

Even if things seem to work after a hardware replacement, you may want to make sure that they really do. It is important to test various features of a site, e.g. search, the opening of pages, error pages, or the contact form, to make sure that everything is working properly.

You can also ask your site visitors to report any misconfigurations or issues they may experience while using the new site.
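Part of that testing can be scripted. The following Python sketch requests a handful of pages and compares the HTTP status codes with what is expected. The paths in `CHECKS`, including the `?s=` search URL, and the base URL are placeholders chosen for illustration; adjust them to the site you want to test.

```python
from __future__ import annotations

import urllib.error
import urllib.request
from typing import Callable

# Hypothetical checks: path -> expected HTTP status code.
CHECKS = {
    "/": 200,                            # front page
    "/?s=test": 200,                     # search results
    "/this-page-should-not-exist": 404,  # error page
}


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url, including error codes."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as exceptions by urllib.
        return err.code


def smoke_test(base_url: str, checks: dict[str, int],
               fetch: Callable[[str], int] = fetch_status) -> list[str]:
    """Return a list of human-readable failures; empty means all passed."""
    failures = []
    for path, expected in checks.items():
        url = base_url.rstrip("/") + path
        try:
            actual = fetch(url)
        except OSError as err:  # DNS errors, refused connections, timeouts
            failures.append(f"{url}: request failed ({err})")
            continue
        if actual != expected:
            failures.append(f"{url}: expected {expected}, got {actual}")
    return failures


if __name__ == "__main__":
    # Placeholder address; replace it with the site to test.
    for problem in smoke_test("https://www.example.com", CHECKS):
        print(problem)
```

Status codes alone will not catch a page that loads but shows the wrong content, such as the WordPress installation screen mentioned earlier, so a check for a known piece of text on each page is a sensible addition. Forms, such as the contact form, still need to be tested by hand.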


Comments

  1. Craig said on March 7, 2013 at 11:37 am

    Virtualization. Where I work we run all of our websites (and other tier 1 applications) on a virtualized infrastructure. Should we experience a hardware failure on the server running the virtualized web server, the web server can be easily spun back up on another server.

  2. Transcontinental said on March 6, 2013 at 4:20 pm

    Nice to read you back ! I guessed it was only — from the user’s eye — a matter of time. I think whatever be the domain we always learn from was was not scheduled.
    Reading gHacks 5/5 loud ‘n’ clear :)

  3. berrtie said on March 6, 2013 at 3:22 pm

    Shouldn’t two hot swappable HDDs mirrored in raid 1 be the minimum option for a dedicated server given that drives are the most likely component to fail?
