
One blog getting hit by spider, 200k today (10 posts)

  1. Konstan
    Member
    Posted 11 years ago #

    As of a few hours ago, one of my blogs is getting hit with something. The request being targeted is GET /theblog/feed/ HTTP/1.0

    I have StatPress running on that blog and it's telling me that it's a spider called WordPress (what the hell). The blog has been hit 200k times in the last 2 hours :S

    I've tried searching for that bot but it seems like it doesn't exist, so I don't really know what the hell is going on.

  2. Konstan
    Member
    Posted 11 years ago #

    Ok, seems like I found the problem:

    The blog owner is using the default WordPress permalinks (not the pretty ones), so the feed URL should be something like /?feed=rss2. Instead he added an RSS widget pointing at /blog/feed, and WPMU went crazy; it seems to have started an endless loop.

    I removed the widget, the load stabilized, and everything is back to normal. What could this mean? How can I prevent it in the future?

    Edit: OK, it seems that when the user added the feed to his sidebar, the widget was calling the main blog page, since there was no feed at /feed, and that caused a loop of hell. I changed his permalinks to the pretty permalinks and it's fixed. BUT when you click on the feed icon it breaks the permalinks and goes back to the default /?p=xxx, even though the option is still checked in the admin panel.
    I am using WPMU 1.5.1

    Edit 2: /feed works, but /feed/ just reloads the page, breaking the pretty permalinks. It's only this blog that does this.

  3. cafespain
    Member
    Posted 11 years ago #

    Have you got wp-cache installed?
    Clear the cache and see if that makes a difference.

  4. Konstan
    Member
    Posted 11 years ago #

    Nope, no cache.

  5. Konstan
    Member
    Posted 11 years ago #

    Could this be a widget bug?

  6. andrea_r
    Moderator
    Posted 11 years ago #

    "Could this be a widget bug? "
    and
    "he added a widget for the RSS as /blog/feed"

    Yep. The included RSS widget isn't meant for putting your *own* feed address in at all. It's for outside ones. Or at least plunk in the full URL.

  7. Konstan
    Member
    Posted 11 years ago #

    I removed the option to change permalinks to prevent the auto loop in the future. That blog ran about 15-20 million queries in the few hours the loop was there...

    Maybe the RSS widget should have a check for this? I mean, I know it's pretty much useless to put a feed of your own blog in your blog, but a common user might not :P

  8. absolutemg
    Member
    Posted 10 years ago #

    I just had this problem as well - we couldn't figure out why the server was getting slammed with GET / HTTP/1.0 requests - one after another after another in an infinite loop -

    I have pretty permalinks set up on all the blogs in the WordPress MU install.

    We finally tracked the issue down by determining the IP address all the activity was coming from (mine), and then I paid attention to which sites in the WordPress MU install I was visiting in my browser.
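
For anyone repeating that diagnosis, here is a minimal Python sketch of tallying requests per client IP from an access log. It assumes the server writes the common/combined log format; the function name `top_clients` and the pattern `LINE_RE` are made up for this illustration and are not part of WordPress MU or any plugin.

```python
import re
from collections import Counter

# Assumed log layout (common/combined format), for example:
# 203.0.113.7 - - [10/Oct/2008:13:55:36 -0700] "GET / HTTP/1.0" 200 2326
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)[^"]*"')


def top_clients(lines, limit=5):
    """Count hits per (client IP, request) pair and return the busiest ones."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # lines that do not fit the assumed format are skipped
            ip, method, path = match.groups()
            hits[(ip, method + " " + path)] += 1
    return hits.most_common(limit)
```

Passing the open log file to `top_clients` should put a runaway loop like this one at the top of the list as a single IP repeating the same request.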

    After we found this post, I found that 2 blogs I was visiting had the built-in RSS widget set up with "incorrect" feed addresses as such - (using the same subdomain and domain as the blog it was installed on)
    a. http://subdomain.domain.org/feed/ (with the trailing slash)
    b. http://subdomain.domain.org (without the '/feed')

    Although I believe the issue is fixed by adding the feed address into the RSS widget as such -
    http://subdomain.domain.org/feed (no trailing slash) - I instead opted to delete the RSS widget and replace it with the Archives widget (I was over-thinking it) in order to display some of the most recent blog posts.

    I agree with Konstan - the RSS widget should either have a check built in, or this issue should be resolved by some other means, because this literally brought the server to its knees and ended up corrupting some tables in the database (probably caused by the server getting slammed while it was doing an update/restart). I unfortunately don't have the technical expertise to craft such a solution - Anyone?

  9. tdjcbe
    Member
    Posted 10 years ago #

    May want to bring that up on the regular wp trac. We discussed this previously and were told to do so:

    http://trac.mu.wordpress.org/ticket/852

  10. tdjcbe
    Member
    Posted 10 years ago #

    Oops, it was on the regular wp trac and punted back to wpmu:

    http://core.trac.wordpress.org/ticket/8910

    I do note though that the rss widget does detect if it can't find a feed or can't use it.

    edit: This is the function that does the checking:

    http://trac.mu.wordpress.org/browser/branches/2.7/wp-includes/widgets.php#L1803

    I would think that taking the current blog's URL and checking whether it is contained in the feed URL that was entered would be the way to go.
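
A rough sketch of that idea, written in Python rather than the widget's PHP and not taken from the WPMU source: compare the entered feed address against the blog's own address and flag it when it lands on the same host at or below the blog's path. The name `points_at_own_blog` and the helper `_host_and_path` are hypothetical.

```python
from urllib.parse import urljoin, urlsplit


def _host_and_path(url):
    """Return (lowercased host, path without trailing slashes)."""
    parts = urlsplit(url)
    return (parts.hostname or "").lower(), parts.path.rstrip("/")


def points_at_own_blog(blog_url, feed_url):
    """True when feed_url resolves to the blog itself or to a path under it."""
    # Resolve relative entries such as "/blog/feed" against the blog address.
    absolute_feed = urljoin(blog_url, feed_url)
    blog_host, blog_path = _host_and_path(blog_url)
    feed_host, feed_path = _host_and_path(absolute_feed)
    if blog_host != feed_host:
        return False
    # Same host: match the blog path exactly, or anything beneath it.
    return feed_path == blog_path or feed_path.startswith(blog_path + "/")
```

This is deliberately conservative: it flags every address on the blog itself, with or without the trailing slash, so a widget using it would refuse all of the variants reported in this thread instead of trying to guess which ones are safe.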

    I seem to remember that edublogs had an issue with this a few months back...
