The new year got off to a not-so-pleasant start: I woke up to multiple emails from Jetpack’s “Monitor” feature telling me that all of the sites in my hosting account were down (followed by “still down” messages one hour later), plus two emails from Siteground (my web host). The first warned me that I was nearing the CPU limit for the hosting account; the second, not much later, told me that my account had been limited because of resource overage. Not funny.
So, instead of priming the fence outside our house, I went on to investigate, trying to get the sites back online. Oh, the joys of New Year’s Day. 🙂
What’s going on?
Now, when you run four WordPress installations (all completely up to date, including the plugins) and your hosting account gets limited, you won’t get very far, because you can’t log in to WP admin, of course. I logged in via FTP and all the data seemed to be there and genuine (no alterations to index.php etc.), so apparently no one got in. Good. Had some plugin or script gone haywire on December 31st with the date change?
I contacted Siteground’s support and they reinstated full functionality of my account so that I could look into things. They provided me with an extensive list of things to do and try in order to reduce CPU load, but since the site had been running fine just 24 hours earlier, I dismissed that for the moment.
On my site, I couldn’t find anything that would cause any trouble, so I went over to Cloudflare to see what came up – and the statistics revealed what had caused the resource overage: a hefty increase of requests on the evening of December 31st, going on all the way through the night until Siteground limited my account:
At this point, one has to wonder of course: how can it be that no alarm or whatever goes off at Siteground when an account’s average activity suddenly spikes like that? And why did Cloudflare let all these requests pass through? I would have expected that some kind of behavioral analysis smart-something would have kicked in, especially since the majority of the requests (~26000) came from South Africa, and another large amount from South Korea (~8000) – not exactly two countries that score very high on my visitor list usually.
I went on to work on Siteground’s list of things to do in order to ease the load on the server, since everything still felt quite slow and sluggish after they reinstated my account. Among them was a screenshot of the most executed scripts in my account:
Script #1 (wp-login.php) is obviously what the blokes from South Africa and South Korea were targeting, trying to get in. It’s simply a brute-force attack on WordPress’s admin login page.
Script #2 (index.php) is essentially just the homepage (index.php in the root directory loads WordPress). Again, 2571 hits for the homepage in the wee hours of the new year is a bit much, and I really have no clue why the homepage saw this increase in requests as well (but well, what do I know about WordPress…).
Script #3 (wp-cron.php) is an interesting one, and Siteground recommended an important change to my setup, namely to replace WP-Cron with a “real” Cron job.
Simply put, WP-Cron is started every time a user visits the site, to check whether any tasks (like scheduled posts) are due for execution. It’s a crutch meant for hosting environments that do not have a real Cron (scheduler). That’s fine for small sites of course, but when you’re getting that many hits, it’s an absolutely useless waste of resources (are we there yet? are we there yet? are we there yet?). I did not know that, and I changed this per Siteground’s recommendation.
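To put a rough number on the “are we there yet?” problem, here is a toy model in Python. It is not WordPress’s actual code, and the traffic figure, time span and interval are made-up assumptions; it only compares how often the schedule gets checked when checks are tied to page loads versus a fixed timer. (In WordPress itself, the switch is usually made by setting `DISABLE_WP_CRON` in wp-config.php and letting the host’s scheduler call wp-cron.php.)

```python
import math


def checks_triggered_by_visits(page_loads: int) -> int:
    """Toy model of WP-Cron: every page load checks the task schedule once."""
    return page_loads


def checks_triggered_by_timer(hours: float, interval_minutes: int) -> int:
    """A real cron job runs on a fixed interval, no matter how much traffic there is."""
    return math.floor(hours * 60 / interval_minutes)


# Hypothetical numbers, for illustration only.
page_loads_overnight = 10_000   # assumed hits during the attack
attack_duration_hours = 12      # assumed length of the night
cron_interval_minutes = 15      # assumed cron schedule

by_visits = checks_triggered_by_visits(page_loads_overnight)
by_timer = checks_triggered_by_timer(attack_duration_hours, cron_interval_minutes)

print(f"visit-triggered checks: {by_visits}")
print(f"timer-triggered checks: {by_timer}")
```

With a real cron job, the number of checks stays the same whether the site gets ten visitors or ten thousand.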
Also per Siteground’s recommendation, I changed WordPress’s “Heartbeat” configuration using the Heartbeat Control plugin. If you leave your WP dashboard open all the time in a browser tab, chances are that admin-ajax.php is among your most executed scripts, and you should probably have a look at this ;-).
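If your host doesn’t hand you a “most executed scripts” list, something similar can be approximated from a raw access log. The following is a minimal sketch, assuming the common Apache/Nginx “combined” log format; the sample lines and IP addresses are invented for illustration and are not taken from my logs.

```python
from collections import Counter


def most_requested_scripts(log_lines, top=3):
    """Count requests per path (query string stripped) and return the busiest ones."""
    counts = Counter()
    for line in log_lines:
        try:
            # The request is the first quoted field, e.g. 'POST /wp-login.php HTTP/1.1'
            request = line.split('"')[1]
            path = request.split()[1]
        except IndexError:
            continue  # skip lines that don't look like access-log entries
        counts[path.split("?")[0]] += 1
    return counts.most_common(top)


# Invented sample data (documentation-only IP ranges).
sample_log = [
    '203.0.113.5 - - [01/Jan/2016:00:01:02 +0000] "POST /wp-login.php HTTP/1.1" 200 1234 "-" "-"',
    '203.0.113.5 - - [01/Jan/2016:00:01:03 +0000] "POST /wp-login.php HTTP/1.1" 200 1234 "-" "-"',
    '198.51.100.7 - - [01/Jan/2016:00:01:04 +0000] "POST /wp-login.php HTTP/1.1" 200 1234 "-" "-"',
    '198.51.100.9 - - [01/Jan/2016:00:01:05 +0000] "GET / HTTP/1.1" 200 5678 "-" "-"',
    '198.51.100.9 - - [01/Jan/2016:00:01:06 +0000] "GET / HTTP/1.1" 200 5678 "-" "-"',
    '198.51.100.9 - - [01/Jan/2016:00:01:07 +0000] "GET /wp-cron.php?doing_wp_cron=1 HTTP/1.1" 200 0 "-" "-"',
]

for path, hits in most_requested_scripts(sample_log):
    print(f"{hits:>5}  {path}")
```

A login page sitting at the top of such a list is a pretty good hint that someone is hammering on the door.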
Yet, even after doing all this, my site remained slow to load and sluggish in the back end. I had blocked all of South Africa from accessing my site at this point (temporarily, via .htaccess), and Cloudflare didn’t show any unusual traffic anymore – so what the hell was going on?
Turns out that somehow, the attack had triggered a race condition that involved my WordPress theme (not sure about that one, it’s what Siteground said), their WordPress caching plugin, and Cloudflare’s caching. It sounded weird, but they temporarily switched my WordPress theme to “Twentyfifteen” and the site became responsive again immediately (simply because the CPU load on the server dropped by, like, a ton). They then flushed all caches, switched the theme back to “Portfolio+”, and everything was fine.
In the end, I’m a bit disillusioned about all the preventive measures that I took, how they failed, and how the whole environment and setup is rather, erm, “unaware?”
- Jetpack’s “Protect” did not flag as malicious any of the ~17000 login attempts that sent my whole hosting account into overage. Say what? Isn’t that the purpose of the whole thing? It should have absolutely caught that brute-force attempt at getting into my site!
- The real problem (a brute-force attack on wp-login.php) was not identified by Cloudflare or Siteground. Sadly enough, brute-force attacks on wp-login.php are quite common, and a premier WordPress host like Siteground (and its support staff) should by all means be familiar with them, and perhaps implement their own preventive measures to limit them.
- Last but not least, some kind of smarter “sync” between Siteground and Cloudflare would be desirable. Something like Siteground telling Cloudflare “hey pal, this is a shared hosting account running WordPress, and <n> requests per hour will most likely push it to the allowed CPU limit.” – you get the idea.
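For what it’s worth, the kind of threshold I have in mind is not rocket science. Below is a minimal sketch of a per-IP sliding-window limiter for login attempts. The class name, the limits and the IP addresses are all hypothetical, and it says nothing about how Jetpack, Cloudflare or Siteground actually work internally.

```python
from collections import defaultdict, deque


class LoginRateLimiter:
    """Allow at most `max_attempts` login attempts per IP within `window_seconds`."""

    def __init__(self, max_attempts=5, window_seconds=300):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts = defaultdict(deque)  # ip -> timestamps of allowed attempts

    def allow(self, ip, now):
        """Return True if this attempt may proceed, False if the IP is over the limit."""
        attempts = self._attempts[ip]
        # Forget attempts that have fallen out of the time window.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False  # blocked attempts are not recorded
        attempts.append(now)
        return True


# Hypothetical usage: 3 attempts per minute, timestamps in seconds.
limiter = LoginRateLimiter(max_attempts=3, window_seconds=60)
for second in range(5):
    verdict = "ok" if limiter.allow("203.0.113.5", now=second) else "blocked"
    print(second, verdict)
```

Anything along these lines, applied to wp-login.php before the requests ever reach PHP, would have turned thousands of expensive page executions into cheap rejections.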
Well, there’s always room for improvement! 🙂