Site was down today, support and web hosting.

MySQL Performance Blog - Sat, 01/09/2007 - 12:36am

During the last year and a half we had a pretty good track record with MySQL Performance Blog - there were times when the site was slow (especially when the backup was running) but I do not remember any significant downtime until today, when we went down for a few hours.

All this time the site was running on a dedicated server which I rented from APLUS about 3 years ago. It is a rather slow Celeron box with a single disk and 512MB of RAM running Fedora Core 2. Despite its age (it was a used “Value” server even when I got it) the server had a very good track record with basically zero failures during this time - there were some network disruptions at Aplus but that is about all the problems we had.

As the OS on the box had become rather outdated, the server was old and had no RAID, and I did not want to rely just on daily backups, we got a new server from 1and1 a couple of weeks ago. Some people recommended them to me, plus they had a good price for hardware with decent base specs - 64bit CPUs, RAID, remote reboot and serial console, backup etc. Plus we had one server hosted with them in Europe for a couple of months for tracking European traffic in ClickAider and it worked reasonably well.

This time I was less lucky: about a week after we moved MySQL Performance Blog to the new server, it stopped responding. I went to the control panel and rebooted the system - it did not come back, and there were no messages on the serial console at all. Worst of all, it got into a state where it thinks a reboot is in progress forever, so I can’t even boot it into rescue mode.

I call 1and1 and explain the problem to them. The guy checks the system by rebooting it into rescue mode and back to normal, and as it does not boot in normal mode he tells me his dedicated server team has to take a look at it and I should expect an answer from them in 3-4 hours. Come on! 3-4 hours before anyone even starts looking at your problem is as good as never for any passionate online business. The guy also tells me I can write to the Server Support team by email and they should get back to me quickly - as you may guess, I have yet to get a reply from them.

I call again in a few hours and get another guy. This one tells me he can’t reboot the server and I should try to boot it in “Last known good configuration”, to which I point out that it is not Windows. He tells me “OK, you’ve got to run hardware diagnostics while in rescue mode”. OK, I ask him what command I should run - he tells me to run “fdisk -l” (which lists partitions). I ask him to spell that out for me carefully and then politely ask him to pass me to someone less clueless than he is. He refuses to do that (silly, as I can just call again). I ask to talk to his manager and he refuses that as well, sending me to a nameless complaint service (which I would imagine goes directly to the trash).

A side note: This actually may be the worst part. In any organization your staff members can be wrong, or there may be a misunderstanding with the customer, so passing the customer to a different guy or to the manager is a must for any reasonable customer service. This was the case at MySQL, and Tom Basil had a magic touch for calming down most of the rare offended customers. In our Consulting Work we also follow the same principle - if a customer requests a second opinion he always gets it.

So what is the best way to deal with clueless support staff working for big companies/big call centers? Of course, call again. I call again and get a different guy. This person is not so clueless, which is good. Though now they can’t find any record of the call I made 3 hours ago. Anyway, he goes ahead and comes back with the same result - the Server Team has to take a look at the server, and this time I get an even better time estimate - tomorrow, and there is nothing he can do other than escalate the case in the system.
As you may guess my blood is boiling at this point.

The next joke came just 10 minutes ago - an email from 1and1 about the closed case, asking me to tell them how happy I was with the service provided. How is one supposed to feel when his case is closed without the problem being resolved, and what feedback would one provide? Of course there may be an internal ticket created for the mysterious “Server Team”.

At this point I did not expect any help within a time frame that works for me, so I went ahead and moved the web site back to the old server, which happily was still available. Now, just out of interest, I will wait to see how long it takes 1and1 to finally solve my problem before canceling their service.

Request for Advice: As this move did not work, I’m looking for another hosting location in the US. I would either rent 2-3 servers or, better, get some rack space and buy my own servers instead.

P.S. Some may tell me I’m just paying for being cheap. Well, it is true I’m trying to get good value at a good price, and it really works in most cases. I’m very happy with the Aplus Value Server which we’ve been using, as well as with our racks with Hurricane Electric and Black Lotus; both are not perfect but pretty good value for the price. Hostik on the other hand was absolutely horrible.