Photographing London
Photoblogging our life from London!
+44 (0789) 499 2509
carlo@fchouse.com

When you manage a server: DOs and DON’Ts

Yesterday (Sunday) someone in my office decided that the production server needed a refresh. Just a “few” things, including Apache, MySQL and PHP. Not everything went as smoothly as planned (?).

Let me write down some DOs and DON’Ts…

 

  • DO prepare yourself. Before starting, check that you have everything you need.
  • DON’T start if you do not have EVERY password you need.
  • DO test your upgrade on a staging server.
  • DON’T install anything on the production server without testing it in staging first.
  • DO warn everyone involved: they may be using the server.
  • DON’T hit and run, then call everyone because the server is not working.
  • DO plan the upgrade in advance, making sure that if you need help someone can be there for you.
  • DON’T upgrade on Fridays, Saturdays or Sundays, unless everyone is working on that upgrade.
  • DO back up first.
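A list like this is easy to turn into a small pre-flight check that you run before touching production. What follows is only a minimal sketch in Python, not something we actually use: the `UpgradePlan` fields, the `preflight_problems` function and the 24-hour limit on backup age are all my own assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta
from typing import List, Optional


@dataclass
class UpgradePlan:
    """Hypothetical description of a planned upgrade (all fields are assumptions)."""
    planned_day: date
    have_all_passwords: bool
    tested_on_staging: bool
    users_warned: bool
    helper_on_call: bool
    last_backup: Optional[datetime]   # None means no backup was taken
    whole_team_working: bool = False  # the only excuse for a weekend upgrade


def preflight_problems(plan: UpgradePlan, now: datetime,
                       max_backup_age: timedelta = timedelta(hours=24)) -> List[str]:
    """Return the list of DON'Ts the plan would violate (empty means go ahead)."""
    problems = []
    if not plan.have_all_passwords:
        problems.append("missing passwords")
    if not plan.tested_on_staging:
        problems.append("not tested on staging")
    if not plan.users_warned:
        problems.append("users not warned")
    if not plan.helper_on_call:
        problems.append("nobody available to help")
    # date.weekday(): Monday is 0, so Friday, Saturday, Sunday are 4, 5, 6.
    if plan.planned_day.weekday() >= 4 and not plan.whole_team_working:
        problems.append("upgrade planned for Friday, Saturday or Sunday")
    # The 24-hour default is an arbitrary assumption, pick what suits your data.
    if plan.last_backup is None or now - plan.last_backup > max_backup_age:
        problems.append("no recent backup")
    return problems


if __name__ == "__main__":
    plan = UpgradePlan(
        planned_day=date(2008, 3, 9),  # a Sunday
        have_all_passwords=True,
        tested_on_staging=False,
        users_warned=False,
        helper_on_call=False,
        last_backup=None,
    )
    for problem in preflight_problems(plan, now=datetime(2008, 3, 9, 10, 0)):
        print("DON'T start:", problem)
```

The function returns every problem instead of stopping at the first one, so you see the whole list of things to fix before you start.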

 

In the end, yesterday’s downtime was limited, and everything is working perfectly again… but from this point of view Fabiana is right when she says I am a bit of a “Control Freak”:

If you want to do something, do it well!

When you have years of experience, when you have made many, many mistakes, when you have erased server hard drives without backing up the data… you will avoid the DON’Ts without much hassle.

But sometimes it will still happen. Sometimes you will forget, and someone else will hate you a lot for at least five minutes! :D

3 Responses


  1. IT’s About Uptime - The StackSafe Blog » Links List 3.14.08

    [...] gives a few tips on managing a server, including testing upgrades on a staging server. The tips listed were followed, resulting in [...]

    Mar 14, 2008 @ 19:23


  2. Jonah Paransky

    Carlo,
    great points. The one thing I would add is the importance of finding a way to test the change not only on a staging server, but also against the end-to-end IT service that the server supports. This can be accomplished by building a comprehensive staging environment of the entire end-to-end IT service or by investing in one of the emerging tools that can help provide equivalent functionality. Often the downtime caused by changes is not due to the impact on the individual server, but to the impact on components that other pieces of a complex, multi-tier application rely upon in the infrastructure stack.
    -Jonah Paransky
    (www.stacksafe.com/blog)

    Apr 03, 2008 @ 16:03


  3. Carlo

    Hi Jonah,
    I do not think I follow you completely (or at all)…
    What do you mean when you speak of an “end-to-end IT service”?

    If I understand correctly, the reliability of a multi-tier application can play a DON’T role if the design of the system is faulty… so “DON’T use a faulty design”? Is this what you meant?

    Apr 03, 2008 @ 16:08



About
Free thoughts of Fabiana and Carlo, two stubborn idealists moved from Italy to London