performance problems when users create lots of pages (7 posts)

  1. quenting
    Member
    Posted 16 years ago #

    Some of my users are using "pages" a lot, rather than articles. They create something like 200 or 300 pages.
    I've found that requests serving those blogs are *much* more CPU-intensive than requests for blogs that use mostly articles. Some of these blogs are getting very popular on my platform, and at peak these blogs alone are taking servers down (I even tried putting just one blog on a brand new machine, and it fell over in minutes).
    Has anyone else seen this issue?
    Any hints on a potential solution?
    I suspect this could be due to WP's "rewrite rules" system. It seems like WP needs only one set of rules for all articles, but one set of rules per page.
    This is all stored in a serialized object (baaad), and I feel it could be the problem.

  2. quenting
    Member
    Posted 16 years ago #

    So I've investigated a bit more.
    For the users that cause the problems, the "rewrite_rules" option is a 1.5MB string (yes, you read that right). Serializing/unserializing something that size on every request is probably what makes things slow.
    Any idea how to solve this?
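
    To get a feel for how the serialized rules grow with the number of pages, here is a rough, self-contained sketch. It does not touch WordPress at all: build_page_rules() is a hypothetical helper, and the slugs, patterns and query strings are only approximations of what per-page rules look like, so the absolute numbers will not match a real "rewrite_rules" option.

```php
<?php
// Hypothetical helper: builds an array shaped roughly like per-page rewrite
// rules (regex pattern => query string). Patterns are illustrative only.
function build_page_rules(int $pageCount, array $suffixes): array
{
    $rules = [];
    for ($i = 1; $i <= $pageCount; $i++) {
        $slug = "section/page-$i"; // made-up page URI
        foreach ($suffixes as $suffix => $query) {
            $rules["($slug)$suffix"] = 'index.php?pagename=$matches[1]' . $query;
        }
    }
    return $rules;
}

// Assumed rule suffixes; a real install may generate more or different ones.
$suffixes = [
    '/trackback/?$'                     => '&tb=1',
    '/feed/(feed|rdf|rss|rss2|atom)/?$' => '&feed=$matches[2]',
    '/(feed|rdf|rss|rss2|atom)/?$'      => '&feed=$matches[2]',
    '/page/?([0-9]{1,})/?$'             => '&paged=$matches[2]',
    '(/[0-9]+)?/?$'                     => '&page=$matches[2]',
];

foreach ([10, 300] as $pages) {
    $rules = build_page_rules($pages, $suffixes);
    $blob  = serialize($rules);

    // Time one unserialize() call, which is the work repeated on each request.
    $start = microtime(true);
    $copy  = unserialize($blob);
    $ms    = (microtime(true) - $start) * 1000;

    printf(
        "%d pages: %d rules, %d bytes serialized, unserialize %.2f ms\n",
        $pages,
        count($rules),
        strlen($blob),
        $ms
    );
}
```

    The rule count (and so the serialized size) grows linearly with the number of pages, multiplied by however many rules each page gets.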

  3. drmike
    Member
    Posted 16 years ago #

    Whack your users and teach them that a blog means Posts and not Pages? :)

    The only thing that comes to mind right off is to make sure the object cache is turned on and working.

  4. lunabyte
    Member
    Posted 16 years ago #

    Q, not a thought. Granted, I just read your post.

    Without stripping out the entire code for it, and really munging things up, I have no idea at the moment.

    Mod_rewrite, in its purest form, should be able to take /this/that/ and turn it into whatever from the database. It can, but then you get into things like permalink style and mixed variables. Someone wants /%category%/%postname%/, someone else wants /%postname%/, some other Joe wants /%category%/%postname%.html and so on.

    I'm working on a side project and happen to be in the area of permalinks at the moment. The solution we came up with for our needs was this:
    pages: /pagename/
    articles: /articles/category/title/
    news: /news/category/title/

    etc.

    That works for our uses, but obviously not for WP, or MU. Our whole structure essentially relies on the first /value/. We're keeping an array of values to check against (articles,news,downloads,etc), and then if nothing matches we look for a page with /value/ as its slug (filename, or whatever).
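
    As a minimal sketch of that first-segment lookup (not the side project's actual code): the section list, the page slugs and the route() function below are all made up for illustration, and a real site would load the slugs from its database.

```php
<?php
// Hypothetical list of section prefixes checked before falling back to pages.
const SECTIONS = ['articles', 'news', 'downloads'];

// Hypothetical router: decides what a path refers to from its first segment.
function route(string $path, array $pageSlugs): array
{
    // Split "/news/servers/upgrade/" into ['news', 'servers', 'upgrade'].
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));
    if ($segments === []) {
        return ['type' => 'home'];
    }

    $first = $segments[0];

    // 1. Known section: /section/category/title/
    if (in_array($first, SECTIONS, true)) {
        return [
            'type'     => $first,
            'category' => $segments[1] ?? null,
            'title'    => $segments[2] ?? null,
        ];
    }

    // 2. Otherwise look for a page whose slug matches the first segment.
    if (in_array($first, $pageSlugs, true)) {
        return ['type' => 'page', 'slug' => $first];
    }

    return ['type' => 'not_found'];
}

var_dump(route('/news/servers/upgrade/', ['about', 'contact']));
var_dump(route('/about/', ['about', 'contact']));
```

    The point of this layout is that nothing has to be generated or stored per page: the only per-request work is a couple of lookups.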

    But, since most WP users out there blog with pages as a second thought, that approach won't work.

    Looking at my own rewrite rules for my blog, I see a lot of extra stuff, like attachment rules for every page, etc.

    IMHO, they look mostly unnecessary, unless there would be an attachment associated with that page, but I'm not going to say they are. I just haven't dug up much on it.

    However, I will confirm and agree that blogs with pages process slower. Visually they don't appear to, although I don't have a blog with more pages than my own at the moment (about 20-30), so it's tough to say for me.

    What's the real difference for pages anyway? At a very fundamental level, nothing other than the default page template and the fact that by default the links to them generated by WordPress don't have /category/ (or whatever) added to the front.

    To me, completely rethinking pages, to make them posts but simply with a different template, would probably be something to consider. The only real thing to look out for would be sub-pages, but it would still seem that first looking up category names and comparing them to /value/, and then looking at page slugs if nothing is found, would be an option.

    As long as a page and a post can't share the same slug, there shouldn't be a problem. IMHO, I'd rather do 4 small, additional reads from the db (OK, actually it would be more like 2, as the results would possibly be stored, or whatever) than have to process that large a rewrite chunk.

    I wonder if WordPress.com has that issue, and what has been considered. With all the users they have, I can't believe that not one of them has a ton of pages.

  5. quenting
    Member
    Posted 16 years ago #

    Well, I agree pages should be like posts, and I don't see why we have an extra set of rules for each of them. Probably the fact that there can be a hierarchy comes into consideration, with the /parentpage/subpage/stuff URL layout. Personally I couldn't care less if subpages didn't have the parent page in the URL, but well.

    Anyway, I think I solved this issue (at least temporarily).
    Basically, looking at the rewrite_rules option, you can see that for each page there are something like 10 rules stored: trackbacks, feeds (all sorts), attachments, attachment trackbacks, attachment feeds, attachment trackback feeds and whatever.

    I figured that attachment URLs have to be known to be accessed (by default the attachment itself just opens when you click an image/file), that most of my users don't know what a trackback is, and that 99.9% of the feeds in use are the blog's main feed or the comment feeds. So I basically hacked rewrite.php to generate only 1 rule per page. The 1.5MB became 45k, and my servers can breathe again. God I hate serialization.
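
    The effect of that kind of pruning can be sketched without editing rewrite.php. This is not the change described above, just a self-contained illustration: prune_page_rules() is a hypothetical function and the sample rules are approximations. On a real install a function like this could presumably be attached to WordPress's rewrite_rules_array filter instead of patching core, but that is an assumption and untested against this WPMU version.

```php
<?php
// Hypothetical pruning function: drops rules whose pattern mentions
// trackbacks, feeds or attachments and keeps everything else.
// Caveat: a page whose slug contains one of these words would be dropped
// too; a real implementation would need to match the rule suffix only.
function prune_page_rules(array $rules): array
{
    $unwanted = ['trackback', 'feed', 'attachment'];

    return array_filter(
        $rules,
        function ($pattern) use ($unwanted) {
            foreach ($unwanted as $word) {
                if (strpos((string) $pattern, $word) !== false) {
                    return false;
                }
            }
            return true;
        },
        ARRAY_FILTER_USE_KEY
    );
}

// Illustrative rules for a single page called "about".
$rules = [
    '(about)/trackback/?$'                     => 'index.php?pagename=$matches[1]&tb=1',
    '(about)/feed/(feed|rdf|rss|rss2|atom)/?$' => 'index.php?pagename=$matches[1]&feed=$matches[2]',
    '(about)/attachment/([^/]+)/?$'            => 'index.php?attachment=$matches[2]',
    '(about)(/[0-9]+)?/?$'                     => 'index.php?pagename=$matches[1]&page=$matches[2]',
];

$pruned = prune_page_rules($rules);

printf(
    "%d rules (%d bytes) -> %d rules (%d bytes)\n",
    count($rules),
    strlen(serialize($rules)),
    count($pruned),
    strlen(serialize($pruned))
);
```

    The trade-off is the one described above: per-page trackback, feed and attachment URLs stop resolving, which only matters if someone actually uses them.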

  6. tmuka
    Member
    Posted 16 years ago #

    Hi quenting,

    I'd be interested in your hacked rewrite.php, if you don't mind sharing. It sounds like I share the symptoms of your issue.

    Thanks!
    -tony

  7. quenting
    Member
    Posted 16 years ago #

    Sorry, I won't post my rewrite file: I branched WPMU so long ago that you'd risk problems using it as-is.
    But the main change I made, if I remember correctly, was adding ifs around line 672:

    if ($post) {
        // add trackback
        // quentin: keep it simpler for pages
        if ( ! $page ) {
            $rewrite = array_merge(array($trackbackmatch => $trackbackquery), $rewrite);
        }

        // add regexes/queries for attachments, attachment trackbacks and so on
        // require <permalink>/attachment/stuff form for pages because of confusion with subpages
        if ( ! $page ) {
            $rewrite = array_merge($rewrite, array($sub1 => $subquery, $sub1tb => $subtbquery, $sub1feed => $subfeedquery, $sub1feed2 => $subfeedquery));
        }
        // quentin: simplify this mess
        //$rewrite = array_merge($rewrite, array($sub2 => $subquery, $sub2tb => $subtbquery, $sub2feed => $subfeedquery, $sub2feed2 => $subfeedquery));
    }

    I think there was something else along the lines of passing an extra parameter set to false to generate_rewrite_rules somewhere, but honestly that's too long ago for me to be sure. Try adapting the above first.

About this Topic

  • Started 16 years ago by quenting
  • Latest reply from quenting