Yesterday’s Downtime

I apologize to everyone who tried to visit my site over the last 12-14 hours while it was down. I received an email this afternoon, and I quote:

Hello,

I am writing you to let you know that i have had to disable your sites on the (krunk4ever) FTP account because you have gone over your cpu usage limit of 60cp… you are currently at

105.76cp

I have turned your sites off using a htaccess file, so you can go to your account and look over your sites to see what plug-ins or code may be causing this load. Here is some instructions that will help you find out what is causing this load and how to stop it, let us know when you are ready to take care of this and we will remove the .htaccess file

Thanks!
______

So, having no clue what the cause was, I went to check my logs. Notice that they said they’d provide instructions to help me find what was causing the high CPU usage, but those instructions were nowhere in that email. Other information that would’ve been helpful: the time of the high CPU usage, which IP(s), which process, and which page(s). I was pretty much left in the dark at this point. I could still SSH into my account, and I found the .htaccess file they mentioned, but disabling it didn’t get my site back up.

I sent a total of 5 support requests, the last 2 somewhat angry and annoyed at their support. As my last email states:

You’d think that when you take someone’s site down, it’s pretty urgent to help them resolve their issues. This being my 5th support request to you and almost 12 hours since my first support request.

I get an email this afternoon detailing that my site has been shut down because of some high cpu usage (some #s were thrown at me). I was told to contact you to help resolve the issue immediately. Well, it would’ve been nice if you could’ve given me some more details, i.e.

which process?
which page?
ip of user?
time of incident?

The email mentioned that instructions on how to find the issue would be included, but they were nowhere to be found in the email.

I mean, I’ve been a customer for almost 1.5 years and the rest of this year’s service has already been paid for. I’ve recommended people on forums such as Anandtech, SlickDeals, and FatWallet to try Dreamhost, despite people saying how you’re overselling bandwidth and space because to be honest, up until today I was an extremely satisfied customer.

I’ve sent 3 high priority emails today. Submitted an OMG! EXTREME CRITICAL EMERGENCY! request, but still no response 12 hours later. Sigh…

I mean I just can’t believe how long this is taking. I truly hope the issues you took care of before mine were just as or more urgent than mine.

Sincerely,
A pretty dissatisfied customer,
______

So I finally got an email back 13 hours later stating:

First off, I have removed the .htaccess file that ______ created, and your sites are back online. The file was /home/krunk4ever/.htaccess

Now, the initial reason for disablement was not because of a single incident, but rather an accumulation of cpu usage over the last couple weeks. We have monitoring that records how much cpu time was used by all processes and keeps a running total for each user, measured in cpu minutes.

The number that ______ quoted was the total number of cpu minutes used by your user over the last day, which was just over 100. We generally like to see shared customers staying at or below 60 per day.

As for ways to lower your cpu usage.. The best way to lower it by many times is to not dynamically generate every page, as wordpress certainly, and gallery likely, do by default. WordPress has a cache plugin which will generate static pages that are served most of the time, and those pages are invalidated and regenerated when they need to be. You should look to install this plugin. I am uncertain if Gallery has a similar plugin, but it might, and that would also be worth looking into.

Let me know if you have further questions.

Thanks!
______
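To make the caching idea in that email concrete, here’s a minimal sketch of a static-page cache: render a page once, serve the saved copy until it expires, and throw it away when something changes. This is not WP-Cache’s code. The names `PageCache`, `render_page`, and `ttl_seconds` are hypothetical, and the one-day lifetime is only an example value.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class PageCache:
    """Toy static-page cache: render once, serve the saved copy until it expires."""

    def __init__(self, render: Callable[[str], str], ttl_seconds: float) -> None:
        self._render = render        # the expensive dynamic page generator
        self._ttl = ttl_seconds      # how long a saved page stays valid
        self._pages: Dict[str, Tuple[float, str]] = {}
        self.render_count = 0        # how many times the real work was done

    def get(self, url: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        entry = self._pages.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]          # cache hit: no rendering, almost no cpu
        # Cache miss or expired page: regenerate and save it.
        self.render_count += 1
        html = self._render(url)
        self._pages[url] = (now, html)
        return html

    def invalidate(self, url: str) -> None:
        """Drop a saved page, e.g. after a post is edited or a comment is added."""
        self._pages.pop(url, None)


def render_page(url: str) -> str:
    # Hypothetical stand-in for the PHP and database work behind a dynamic page.
    return f"<html><body>Page for {url}</body></html>"


cache = PageCache(render_page, ttl_seconds=24 * 60 * 60)
for _ in range(1000):
    cache.get("/some-post", now=0.0)
print(cache.render_count)  # 1
```

A thousand requests cost one render instead of a thousand, which is where the cpu savings come from. The trade-off is that anything dynamic on the page stays frozen until the saved copy expires or is invalidated.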

That was a much more informative email, and if I had received it 11 hours earlier I might’ve been happy, but their response time was just too long for a matter like this. I sent an email back asking why I wasn’t notified earlier, because you’d think shutting down a customer’s website would be the last thing you’d do. They had known about this for weeks, and my site got shut down without a word of warning. I sent this as my final follow-up email:

Thanks ______. If I knew it was something I could’ve done, I wouldn’t have been so annoyed. Turns out they also chmodded my directory to prevent read from all. I had to chmod it back to 755 from 751 before my site stopped throwing the 403 Access Denied error.

I’ve downloaded and enabled wordpress caching as you suggested. I also turned on full acceleration on Gallery which is defined as:
Full acceleration gives roughly a 90% performance increase, but no dynamic data (random image block, other sidebar blocks, number of items in your shopping cart, view counts, etc) will get updated until the saved page expires.

and the time to expire is currently set to 1 day as recommended by that page.

Is there any way I can see how much cpu I’m using over the next few days? Is there any way I can sign up to receive alerts for this type of situation, which I’d like to avoid in the future?

If there’s no automatic way, I’d appreciate it if you could keep an eye on my account over the next week and let me know what’s up and if I need to increase my cache time or something. I would want to avoid the mess that we got into today at all costs.

Thank you.

Sincerely,
______
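For anyone wondering why 751 versus 755 matters, here’s a small sketch that decodes the two modes using Python’s standard `stat` module. It assumes the web server reaches the directory as “other” (neither the owner nor the group), which is a guess about the host’s setup and not something the support emails confirmed.

```python
import stat


def describe(mode: int) -> str:
    """Render a directory's permission bits the way `ls -l` shows them."""
    return stat.filemode(stat.S_IFDIR | mode)


def others_can_read(mode: int) -> bool:
    """True if users who are neither owner nor group may list the directory."""
    return bool(mode & stat.S_IROTH)


for mode in (0o751, 0o755):
    print(oct(mode), describe(mode), "others can read:", others_can_read(mode))

# 0o751 drwxr-x--x others can read: False
# 0o755 drwxr-xr-x others can read: True
```

The only difference between the two modes is the read bit for “other”, which is the bit the 403 Access Denied error hinged on under that assumption.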

I went ahead and installed WP-Cache 2.0 and enabled it (remember to disable gzip under Options -> Reading). However, I hit a weird problem: every uncached page came up blank, while the second load onwards worked fine. The blank pages annoyed me enough that I disabled the plugin almost immediately and said screw it, if DH cuts me off, I’ll go find another host. Derek’s been trying to get me to switch over to his hosting anyway. Still, I decided to search around for a fix. In the comment section of the plugin’s page, a bunch of people reported similar problems. It turns out the issue only appears under PHP5; WP-Cache 2.0 works fine under PHP4. Reading further, I saw this comment:

To fix the blank page issue modify wp_cache_ob_end() in wp-cache-phase2.php to use ob_end_flush() instead of ob_end_clean().
Works for a setup using PHP 5.1.2, MySQL 5.0.18, WP 2.01 and WP-Cache 2.0.17.

– Jussi Vaihia

That fixed it. In wp-cache-phase2.php (under wp-content/plugins/wp-cache), on line 219, comment out ob_end_clean() and replace it with ob_end_flush().
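Why would that one-word change matter? In PHP, ob_end_clean() throws away whatever is in the output buffer, while ob_end_flush() sends it on to the browser. Here’s a toy model in Python of that difference. It is not WP-Cache’s actual code: `OutputBuffer`, `serve_uncached`, and the `/about` URL are made up for illustration, and it doesn’t try to explain why PHP4 behaved differently.

```python
from typing import Dict, List


class OutputBuffer:
    """Toy stand-in for PHP's output buffer."""

    def __init__(self) -> None:
        self._chunks: List[str] = []
        self.sent_to_browser = ""

    def write(self, text: str) -> None:
        self._chunks.append(text)

    def contents(self) -> str:
        return "".join(self._chunks)

    def end_clean(self) -> None:
        # Like ob_end_clean(): discard the buffered output.
        self._chunks.clear()

    def end_flush(self) -> None:
        # Like ob_end_flush(): send the buffered output, then empty the buffer.
        self.sent_to_browser += self.contents()
        self._chunks.clear()


def serve_uncached(page_cache: Dict[str, str], url: str, use_flush: bool) -> str:
    """Render a page into the buffer, save a copy, then end buffering."""
    buf = OutputBuffer()
    buf.write(f"<html>{url}</html>")
    page_cache[url] = buf.contents()  # the saved copy is written either way
    if use_flush:
        buf.end_flush()
    else:
        buf.end_clean()
    return buf.sent_to_browser


saved: Dict[str, str] = {}
print(repr(serve_uncached(saved, "/about", use_flush=False)))  # ''
print(repr(saved["/about"]))                                   # '<html>/about</html>'
print(repr(serve_uncached(saved, "/about", use_flush=True)))   # '<html>/about</html>'
```

In this model, the “clean” path returns a blank first response even though the saved copy is fine, which matches the symptom: blank on the first load, normal from the second load onwards.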

In Gallery, I also enabled Full/High Acceleration: Full acceleration gives roughly a 90% performance increase, but no dynamic data (random image block, other sidebar blocks, number of items in your shopping cart, view counts, etc) will get updated until the saved page expires.

Hopefully this will fix the CPU overuse problem.

2 Replies to “Yesterday’s Downtime”

  1. Yikes! I’m on dreamhost, too, and I’d been using about 200+ minutes for a while (you can check how many you’re using in the logs/resources/ dir). It turns out that using gallery to display thumbnails for 70+ images a page ate resources like crazy. I switched to all static links, and that’s brought my usage down to about ~2000 seconds a day.

    I’d been getting that blank page thing too. Hopefully I’ll be able to use that plugin now :).

    Sucks that they just shut you down like that. I thought they sent warning emails first.

  2. OMG, thanks for finding that WP-Cache plugin! I’m also on DH (and a WordPress user) and I’ve been going over on my CPU time as well. This should help a lot.
