The MU forums have moved to WordPress.org

Reviving the performance thread (28 posts)

  1. dsilverman
    Member
    Posted 16 years ago #

    There don't seem to be any recent (< 1 year) threads about performance and tweaking, so I'm wondering if anyone has thoughts or experiences on improving performance for high-traffic sites. Note that my site is not large on blogs (we have around 500, maybe half of which are active), it's just bad on performance.

    The machine I'm using is a dual processor 2ish GHz with 4GB RAM and fairly speedy disks. Database is on another server with similar specs; they're talking to each other over GigE. I'm finding that using ApacheBench I can get somewhere in the range of 20 req/s loading the home page, which is fairly static. On a popular blog I get closer to 3 requests a second. Using eAccelerator today I was able to double those numbers most of the time, but at the cost of fairly predictable segmentation faults.

    Unfortunately I can't compare these numbers against our current live site, as it generally has a processor load somewhere between 20 and 100. I'm hoping that I'm missing something obvious that will make things work substantially better, since I can't imagine large sites like wordpress.com are getting numbers this bad.

    Just off the top of my head, I have a MySQL that has fairly aggressive query caching, I've tried with and without the built-in WordPress object caching mechanism with not much noticeable change, and I've tweaked Apache settings like KeepAlives (on and off), number of spare servers, etc.

    Are others of you getting comparable performance numbers? One thing I've noticed is that I can take the concurrency up pretty high and still get the same results, I'm not sure what that suggests about where the bottleneck is. ApacheBench output below.

    [zeno@faith ~]$ ab -c 10 -n 200 http://blogs.law.harvard.edu/mediaberkman
    <snip>
    Document Path: /mediaberkman
    Document Length: 80735 bytes

    Concurrency Level: 10
    Time taken for tests: 55.708683 seconds
    Complete requests: 200
    Failed requests: 0
    Write errors: 0
    Total transferred: 16246000 bytes
    HTML transferred: 16147000 bytes
    Requests per second: 3.59 [#/sec] (mean)
    Time per request: 2785.434 [ms] (mean)
    Time per request: 278.543 [ms] (mean, across all concurrent requests)
    Transfer rate: 284.79 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        0     0    0.3      0     1
    Processing:  1676  2722  337.8   2709  3948
    Waiting:     1652  2685  335.9   2683  3922
    Total:       1677  2722  337.8   2709  3948

    Percentage of the requests served within a certain time (ms)
    50% 2709
    66% 2861
    75% 2948
    80% 3017
    90% 3131
    95% 3254
    98% 3545
    99% 3646
    100% 3948 (longest request)
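
    On the concurrency question above: the summary lines in this run all follow from three raw figures, and the relationship between them gives a hint about the bottleneck. A minimal sketch in Python (used here purely for illustration; the helper name is made up), fed only with values printed in the output above:

```python
# Recompute ApacheBench's summary lines from the raw figures in the run above.
# Inputs are copied from the ab output; ab_summary is an illustrative helper,
# not part of ApacheBench.

def ab_summary(requests: int, concurrency: int, seconds: float, total_bytes: int) -> dict:
    return {
        # "Requests per second" (mean)
        "requests_per_second": requests / seconds,
        # "Time per request" (mean): how long one client waited per request
        "ms_per_request": concurrency * seconds * 1000 / requests,
        # "Time per request" (mean, across all concurrent requests)
        "ms_per_request_all": seconds * 1000 / requests,
        # "Transfer rate" in Kbytes/sec
        "kbytes_per_second": total_bytes / 1024 / seconds,
    }


summary = ab_summary(requests=200, concurrency=10,
                     seconds=55.708683, total_bytes=16246000)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

    Mean latency is simply concurrency divided by throughput (10 / 3.59 is roughly 2.79 s). If raising the concurrency leaves requests per second flat, each request just waits longer in line, which would be consistent with the web server's CPU being saturated generating pages, though it does not prove it.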

  2. dsilverman
    Member
    Posted 16 years ago #

    Oh, sorry, I wanted to end with an "any help appreciated," because it is. ;)

  3. lunabyte
    Member
    Posted 16 years ago #

    This is a tough one, as there is so much that can take a toll on performance.

    Things like:
    - too much "crap" in mu-plugins that doesn't really need to be there
    - too little resources allocated in mysql.conf
    - too much additional code that's unnecessary
    - too little memory allocated in php.ini
    - too many open ports / running a lot of email on the same box
    - large page sizes (too many images, videos, JS, whatever)
    - and a ton more.

    Mu-plugins is a big one that is frequently overlooked. Anything dropped in there will be loaded and parsed (not necessarily executed) on every single page load.

    This includes plugins which are admin only as well. One "fix" I do on my own site(s) for this is to add in similar code that calls mu-plugins but add it into the admin area instead (before it starts parsing pages). Then, just drop "admin area only" plugins that would have gone in mu-plugins in a special plugins dir I create inside wp-admin.

    For other things, I also "fix" up a special plugins dir that loads like mu-plugins, but only if the blog_id equals 1 (main site). Things like the sitewide feed, and a few other plugins I only want available to the main blog are stored here (hence, blog_id equal to 1).
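
    The directory split described above boils down to choosing which plugin directories to scan per request. A language-neutral sketch, written in Python rather than the PHP that MU itself uses; the two extra directory names are hypothetical, and this is not the actual hack referred to in this thread:

```python
from pathlib import Path
from typing import List


def plugin_dirs_for_request(is_admin: bool, blog_id: int, root: Path) -> List[Path]:
    """Return the plugin directories that should be scanned for this request."""
    # mu-plugins is loaded on every page view, so keep it as small as possible.
    dirs = [root / "wp-content" / "mu-plugins"]
    if blog_id == 1:
        # Hypothetical directory for main-site-only plugins (e.g. a sitewide feed).
        dirs.append(root / "wp-content" / "main-blog-plugins")
    if is_admin:
        # Hypothetical directory for plugins that only matter inside wp-admin.
        dirs.append(root / "wp-admin" / "admin-plugins")
    return dirs


def plugin_files(dirs: List[Path]) -> List[Path]:
    """List the .php files that would be loaded, skipping missing directories."""
    files: List[Path] = []
    for directory in dirs:
        if directory.is_dir():
            files.extend(sorted(directory.glob("*.php")))
    return files
```

    A front-end page view on an ordinary blog then only pays for what sits in mu-plugins; the other two directories are never touched.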

    Then there is mysql.conf. That "calling all large hosts" thread has some good info on configurations for it, and if I recall correctly off the top of my head it was based on a 4G RAM availability.

    Then there is overall page size too, which may not show in a network test like that, but it will to an end user.

    Oh, and I recall quenting mentioning something about pages and rewrite rules being atrociously large in size, but he didn't mention what he fixed on it, and I haven't dug into it myself yet. However, it was something along the lines of rewrite rules being about 45M in the db, and most of it unneeded. He said that after cleaning it up, it sped up his site quite a bit.

    I'd say WP-Cache, but I've had mixed results with getting it to work with MU.
    It can be done (cough, Donncha, /cough), but I didn't have the time to spend on it to get it working.

    Also, optimizing the DB on occasion is a good idea too.

  4. lunabyte
    Member
    Posted 16 years ago #

    Oh, and site wide plugins that query across all blog tables with queries within queries will eat up plenty too. Especially if it isn't cached right.

    The site wide feed uses cache, so while it's a large query, it's performed no more than 4 times an hour.

    There are "plugins" out there on wpmudev that definitely aren't large traffic friendly, so be careful on that end as well.

  5. Farms
    Member
    Posted 16 years ago #

    Split the database! Makes an *enormous* difference.

  6. SteveAtty
    Member
    Posted 16 years ago #

    I would have thought the obvious areas for performance gain are:

    1) SQL Queries - Ensure you only bring back the columns and the number of rows you need. Don't do a

    Select *
    with a loose where statement and then only process 2 columns and the first row. It may be that adding some indexes and reducing full table scans will really help performance on big systems

    2) Tighten up the PHP code if possible. The "admin" plug-ins directory sounds a great idea and anything that reduces parsing can only be good.

    3) Trim out the bloat. I know that bandwidth and things like compression make this less problematical than it used to be but compression takes up server resources (and client resources to decompress). Has anyone looked at the size of some of the JavaScript files? Prototype.js is 94KB.
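
    Point 1 can be illustrated with a small runnable sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL, and the table, columns, and index are invented for the example rather than taken from the WP schema:

```python
import sqlite3

# In-memory stand-in database; schema and data are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, author INTEGER, "
    "status TEXT, title TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO posts (author, status, title, body) VALUES (?, ?, ?, ?)",
    [(i % 5, "publish" if i % 2 else "draft", f"Post {i}", "x" * 1000)
     for i in range(1, 101)],
)
# An index matching the WHERE clause lets the database avoid a full table scan.
conn.execute("CREATE INDEX idx_posts_author_status ON posts (author, status)")


def latest_title_loose(conn: sqlite3.Connection, author: int) -> str:
    """Anti-pattern: fetch every column of every row, then filter in code."""
    rows = conn.execute("SELECT * FROM posts WHERE author = ?", (author,)).fetchall()
    published = [row for row in rows if row[2] == "publish"]
    return max(published, key=lambda row: row[0])[3]


def latest_title_tight(conn: sqlite3.Connection, author: int) -> str:
    """Ask only for the one column and the one row that is actually used."""
    row = conn.execute(
        "SELECT title FROM posts WHERE author = ? AND status = 'publish' "
        "ORDER BY id DESC LIMIT 1",
        (author,),
    ).fetchone()
    return row[0]


print(latest_title_loose(conn, 3), "|", latest_title_tight(conn, 3))
```

    Both functions return the same title, but the loose version drags twenty full rows (including the large body column) across the connection to get it.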

  7. quenting
    Member
    Posted 16 years ago #

    some of the most effective changes:

    - Caches. Most importantly, mysql query cache and php cache. Installing APC in itself provided a 40% instant boost across my servers. Best 2 hours ever invested in solving performance problems. Just be careful about kses.php, which you need to disable to avoid segfaults, and give APC enough ram to cache all files (about 50 megs per MU install for me). I'm having mixed results with object cache (ENABLE_CACHE). Fewer context switches and sql queries, but not much improvement in terms of overall cpu usage. Haven't tried the WP_CACHE thing yet.
    - Mysql Config. download this http://forums.theplanet.com/index.php?showtopic=64695&pid=573659&st=0&#entry573659 and run it after mysql's been up for a couple of days. It'll tell you if your query cache, key cache etc. are set up correctly. You'll need to tweak it a bit if you have more than a couple thousand blogs though, or your box might choke on some steps (it does a "find" on the mysql directory to calculate total index size for instance).
    - apache config. Monitor your open slots, and adjust your settings (most importantly keepalivetimeout) to keep the number of total open / idle to acceptable numbers.
    - image cache. Uploaded images are a pain in MU because every time they're accessed it means php+sql queries. If some people start hotlinking your images on popular sites, that can really kill you. If you have a multi-box setup, think about using a reverse proxy and cache the contents of /files on the front machine. They can be cached "forever" because even if the image is re-uploaded, it will have a different name/path. With a single box setup, you might want to try just running mod_cache without mod_proxy and see if it helps somehow (probably will be less effective though).
    - same comments as above regarding plugins.
    - if you have a stats plugin and/or a who's online plugin, tweak them to death. Those features are by far the most consuming ones in my setup.
    - if you use SK2 (who's not ?), tweak the order of plugins so that the number of checks is optimal (put the most effective ones first).
    - be careful with sitewide features. if you develop some, use global tables. Looping through blogs is not a good idea, even as a cron task.
    - as mentioned above, the rewrite_rules used by wordpress are not very efficient if some of your popular bloggers use lots of pages. I've seen the size of the rewrite_rules option reach 50 megs for blogs with loads of pages. If you get rid of the rules for "page's attachment's trackback's rss" and the likes, down to the only ones that are ever used, you can save a lot of disk I/O.
    - for god's sake cache the output of tinymce in a single js file, or at least make sure you add the header options that make the file cached on the client. I'm not sure if that's part of the releases now, it wasn't a while ago, and the huge 200k file was redownloaded over again. So check those http headers with a tool ala fiddler. Caching the file (server side) is a good idea too (otherwise it re-reads/re-packages the plugin files into a single one with every download).
    - Split the DB. If any of your DBs reaches 1000 blogs (30000 files), split it. And again.
    - If you have a dedicated box, make sure it uses NPTL and not linuxthreads. I had 2 boxes that were 20% slower than the others due to that. Once i figured it out and corrected it, they went to the same resource consumption level as the others (mysql is highly inefficient with linuxthreads).
    - disable http logs for images, or altogether if you don't care about stats.
    - hey, you should have some work to do already ;-).
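
    On the "Split the DB" item in the list above: the quoted figures imply about 30 files per blog, which would fit roughly 10 MyISAM tables at three files each (.frm, .MYD, .MYI); that breakdown is an inference from the numbers, not something stated in the post. A minimal sketch of one possible split, by fixed blog_id ranges, assuming a made-up database naming scheme; it is not necessarily the scheme anyone in this thread uses:

```python
BLOGS_PER_DB = 1000    # split threshold quoted in the list above
FILES_PER_BLOG = 30    # implied by "1000 blogs (30000 files)"


def db_name_for_blog(blog_id: int, prefix: str = "wpmu_") -> str:
    """Map a blog to a database by fixed ID ranges (hypothetical naming)."""
    if blog_id < 1:
        raise ValueError("blog_id starts at 1")
    shard = (blog_id - 1) // BLOGS_PER_DB
    return f"{prefix}{shard}"


def files_in_db(blog_count: int) -> int:
    """Rough count of table files one database directory would hold."""
    return blog_count * FILES_PER_BLOG


print(db_name_for_blog(1), db_name_for_blog(1000), db_name_for_blog(1001))
print(files_in_db(BLOGS_PER_DB))
```

    Range-based splitting keeps each database directory at a bounded number of files; a hash of the blog_id would spread load more evenly but makes it harder to tell at a glance where a given blog lives.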

  8. dsilverman
    Member
    Posted 16 years ago #

    Wow, excellent and numerous suggestions, thanks everyone! And quick, too. :)

    dsilverman: I'm finding that using ApacheBench I can get somewhere in the range of 20 req/s loading the home page, which is fairly static. On a popular blog I get closer to 3 requests a second.

    Yeah, I'm quoting myself. But only because with further searching it sounds like my AB benchmarks are in line with those of others.

    lunabyte: Mu-plugins is a big one that is frequently overlooked. Anything dropped in there will be loaded and parsed (not necessarily executed) on every single page load.

    This idea came to me this morning and I discovered through trial and error that the Z-Space plugin was having a major impact on performance. I don't want to dump it entirely, so if it's a fixable problem I'll release a patch (haven't looked at the code yet), or at least move it into an "admin-only" plugin area, as per your suggestion. Very good general suggestion -- all the plugins one sets enabled by default in mu-plugins can be silent killers...

    lunabyte: there is mysql.conf. That "calling all large hosts" thread has some good info on configurations for it, and if I recall correctly off the top of my head it was based on a 4G RAM availability.

    I don't think this is applicable to me in this instance, but let's incorporate that thread by reference for others. As well, this thread looks quite useful.

    SteveAtty: SQL Queries - Ensure you only bring back the columns and the number of rows you need. [etc.]

    A good idea in general but I'd rather not modify the WP base SQL stuff, as it's likely to change between versions and easy upgradability is important to making sure that upgrades actually happen and they don't break things. Same applies to "clean up PHP code" and "trim bloat" -- it would be best if possible (IMHO) to incorporate any changes of this nature further upstream.

    quenting: Installing APC in itself provided a 40% instant boost across my servers. Best 2 hours ever invested in solving performance problems. [...] Just be careful about kses.php, which you need to disable to avoid segfaults, and give APC enough ram to cache all files (about 50 megs per MU install for me). I'm having mixed results with object cache (ENABLE_CACHE). Fewer context switches and sql queries, but not much improvement in terms of overall cpu usage.

    Yep, eAccelerator also gives great performance (I'm seeing about 2x overall) provided you disable kses.php there as well (which I discovered in the db thread mentioned earlier).

    quenting: if you use SK2 (who's not ?), tweak the order of plugins

    I was planning to move solely to Akismet for spam filtering, in part to cut down on load. Bad idea?

    quenting: image cache. Uploaded images are a pain in MU because every time they're accessed it means php+sql queries. If some people start hotlinking your images on popular sites, that can really kill you.

    Not sure I understand this one. I just tried uploading an image and it put a direct file link into the blog post. Doesn't seem like it should be performing any queries or executing any code to let people see the image.

  9. lunabyte
    Member
    Posted 16 years ago #

    dsilverman: "This idea came to me this morning and I discovered through trial and error that the Z-Space plugin was having a major impact on performance. I don't want to dump it entirely, so if it's a fixable problem I'll release a patch (haven't looked at the code yet), or at least move it into an "admin-only" plugin area, as per your suggestion. Very good general suggestion -- all the plugins one sets enabled by default in mu-plugins can be silent killers..."

    See my reference above for taking plugins that are only used in the admin area, and hacking in (it's really less trouble than it sounds) a special plugin dir in wp-admin to use for just those admin-only (meaning wp-admin area, not user role). I use z-space on one of my sites, and I have it sitting in the admin plugin dir I'm referring to.

    Same goes for any other plugin that would "normally" be in mu-plugins, but is only used in the wp-admin area. This saves an assload of resources overall, if you're using a lot of "wp-admin only" relevant plugins.

    Again, along this line, is the little addition to wp-settings, right after it sets up the mu-plugins directory. For things like the site wide feed, and other plugins I only want for my main blog, I drop them in a special "blog_id 1 only" plugin directory, and set it up to work like mu-plugins.

    This keeps special stuff only relevant to the primary site from being parsed when it's only relevant to blog 1.

    When it comes to high traffic, every little thing you can do helps. Whether it's separation of plugins as mentioned above, taking out useless queries, caching the output of a high load function to a file, and including the output html instead, or whatever. Which, brings me to another thing I do quite a bit.

    Let's use tags as an example. Even though it's on a separate DB on my site (implementing the Doc's overall method), it still eats up its fair share of resources when the tag function is run. So, what I've done is I only run the actual tag function once every 15 minutes (right after the new articles are pulled in from my site wide feed).

    I take that output, and instead of echoing it, I write it to an HTML file instead. Then, when I pull in the tags (either in the tag site or in my main site), I call that HTML file instead. It's included, printed to the screen, and all is well. A definite load savings right there. I also do it for other "similar" type site wide plugins.

    Whether it's recent members, posts, whatever the case may be. If it's a site wide feature, and eats a lot with its queries, you can bet I'm writing the output to a file and only running the actual query no sooner than 15 minutes apart. Some only run once an hour, and I have a couple that only run once every 4 hours. Just depends on how I feel is often enough depending on the data being processed.
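
    The write-the-output-to-a-file approach described above can be sketched in a few lines. This is a Python illustration with made-up names, not the poster's code, and it differs in one respect: it regenerates lazily on the first request after the file expires, rather than on a fixed schedule.

```python
import os
import time
from pathlib import Path
from typing import Callable


def cached_output(path: Path, max_age_s: float, render: Callable[[], str]) -> str:
    """Return cached HTML if the file is fresh enough, else re-render and store it."""
    try:
        age = time.time() - path.stat().st_mtime
        if age < max_age_s:
            # Cheap path: one file read instead of the expensive sitewide query.
            return path.read_text(encoding="utf-8")
    except FileNotFoundError:
        pass  # no cache file yet, fall through and build it

    html = render()  # the expensive part (e.g. a sitewide tag query)
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(html, encoding="utf-8")
    os.replace(tmp, path)  # atomic rename, so readers never see a half-written file
    return html
```

    Something like cached_output(Path("tags.html"), 15 * 60, build_tag_cloud) would then run the hypothetical build_tag_cloud function at most once every 15 minutes, however many page views come in.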

    I'm in the process of building what will (at least it better) become a very high traffic application from scratch. Thanks to all the testing and evaluation I've done with MU, I'm able to implement some of the very principles we're discussing here into it. Like output caching, separation of loading various files for only the area needed, etc. Most of which I do anyway when coding from scratch, but if I have an idea, I have a decent traffic site that I can test behind the scenes with as well.

    In terms of traffic, I'm not really pushing a lot. My main blog draws most of it really, but that's how it goes. Even then, it isn't a whole lot compared to a lot of sites, but from where it was 24 months ago it's been quite a jump. I'd say that right now my biggest challenge is my user base. They're more of the "read and move on" type than the "participation" type. I'm changing it slowly, but damn it I'm in a hurry! Such is life though. :)

  10. quenting
    Member
    Posted 16 years ago #

    Yep, eAccelerator also gives great performance (I'm seeing about 2x overall) provided you disable kses.php there as well

    I use APC because i never found how to filter out a single file from EA, when it's just a line of config in APC. Is that now possible ? What's sure is that kses is the one causing the segfault trouble. Had to play with many accelerators to find this out, and *all* choke on that file, segfault and crash webservers. Not too sure why though :-p.

    I was planning to move solely to Akismet for spam filtering, in part to cut down on load. Bad idea?

    well, the first thing i dislike with akismet is the remote part of the thing, the second is the fact that it's mostly based on english filtering while my site's in french. Even with that, i might have been inclined to use it. But the one thing that really counts against akismet is this:
    http://akismet.com/buy/enterprise/
    These price ranges might be good for corporate blog hosting and such, but if your audience is general public, and if your blogs are free, the price of akismet for 1000 users is that of 5 good servers. Considering i have 100 times that many users, this is really not even an option to consider :-).

    Not sure I understand this one. I just tried uploading an image and it put a direct file link into the blog post. Doesn't seem like it should be performing any queries or executing any code to let people see the image.

    Well, look again. Image path on URL:
    http://blogname.domain.com/files/2007/06/image.jpg
    Image path on disk:
    /wp-content/blogs.dir/blog_id/files/2007/06/image.jpg
    This bloody blog_id thing in the file path is what's messing things up. There's no way for the web server to know the blog_id from the URL, hence to serve the file directly. It needs to go to DB, fetch the blog_id from the subdomain, then it's able to serve the file. In the meantime, all the php and queries that are run with every request are parsed/executed, including plugins.
    You can also see this in htaccess:
    RewriteRule ^(.*/)?files/(.*) wp-content/blogs.php?file=$2 [L]
    This means if your users upload lots of images, each page view with, like, 10 images is in terms of load like 10 concurrent page views. baaaad.
    If only the blogname was used instead of the blog_id, you could serve the file directly from apache. I wanted to change this in my MU, but finally the mod_cache solution is quite convenient, since it makes the calls be performed only once in a while, and doesn't involve core code hacking.
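
    The mapping problem described above can be made concrete with a short sketch. It is written in Python with hypothetical names and is not MU's blogs.php; it only shows why a lookup is needed to get from the public URL to the path on disk, and how remembering the answer avoids repeating it:

```python
from typing import Callable, Dict, Optional


def make_resolver(lookup_blog_id: Callable[[str], Optional[int]]):
    """Build a URL-to-disk-path resolver that remembers host -> blog_id lookups."""
    cache: Dict[str, Optional[int]] = {}

    def resolve(host: str, url_path: str) -> Optional[str]:
        prefix = "/files/"
        if not url_path.startswith(prefix):
            return None
        relative = url_path[len(prefix):]
        if ".." in relative.split("/"):
            return None  # refuse attempts to climb out of the upload directory
        if host not in cache:
            # The expensive step: in MU this means booting PHP and querying the DB.
            cache[host] = lookup_blog_id(host)
        blog_id = cache[host]
        if blog_id is None:
            return None
        return f"/wp-content/blogs.dir/{blog_id}/files/{relative}"

    return resolve


# Hypothetical lookup table standing in for the database query.
resolve = make_resolver({"blogname.domain.com": 42}.get)
print(resolve("blogname.domain.com", "/files/2007/06/image.jpg"))
```

    A reverse proxy or mod_cache in front of /files goes one step further than this in-process memo: it caches the served file itself, so repeat requests never reach PHP at all.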

  11. quenting
    Member
    Posted 16 years ago #

    by the way, in the mysql settings in addition to the traditional and mandatory key_cache and query_cache stuff, one of the most important settings is the table_cache. Since MU creates a separate set of tables for each blog, there are thousands of tables. Not caching them means that if you have lots of blogs, each access to a table that wasn't cached needs to re-push the index in memory. Lots of disk I/O, lots of RAM access. So push this cache up to whatever you can with the amount of ram you have access to.

  12. dsilverman
    Member
    Posted 16 years ago #

    quenting: I use APC because i never found how to filter out a single file from EA, when it's just a line of config in APC. Is that now possible ?

    Yep, as of 0.9.5 you can do:

    eaccelerator.filter="!*kses.php"

  13. quenting
    Member
    Posted 16 years ago #

    ok cool, that would have saved me some headaches a year ago. It's probably pretty much the same as APC performance wise anyway.

  14. dsilverman
    Member
    Posted 16 years ago #

    Want to once again thank everyone for their help -- launched the new site today and it's incredibly snappy compared to the old. Some of it is upgraded WPMU code, some of it is more RAM, and a lot of it is the tweaking.

  15. SteveAtty
    Member
    Posted 16 years ago #

    Re-reading this today I got to thinking about lunabyte's comment about wp-admin only plugins. It does seem silly to keep loading/parsing code for every page if it's only for the admin side.

    I also like the "blog 1" only concept for plugins.

    How easy would it be to have two new plugin directories to specifically handle these situations? Obviously the MU Plugins directory is MU specific code so could the core code be enhanced to support these two new plugin directories?

  16. lunabyte
    Member
    Posted 16 years ago #

    Um, yes. I do it myself.

    In both instances. Both the admin-only stuff, and plugins just for blog id 1.

    I posted up a hack for main site only plugins here, and searching for that phrase will bring it right up.

    But, yeah, to me it's a smart thing. It doesn't affect a single page load much, but when you start thinking about thousands or more, then every little bit helps.

  17. SteveAtty
    Member
    Posted 16 years ago #

    I know you do it yourself and it sounds like a very sensible thing to do - I was wondering if it was worth either writing it up or seeing if it could be pushed back into the trunk as part of the standard configuration.

  18. lunabyte
    Member
    Posted 16 years ago #

    I doubt it would make it into trunk, honestly.

    And there really isn't anything to write up. I already did.

    "Some" people say it can be done with a plugin, and it can.

    But that's still another file the server has to read into memory, and with it literally only being a few lines of code (less than 10), to me it makes sense to just pop it onto the end of the call to set-up mu-plugins.

  19. dsilverman
    Member
    Posted 16 years ago #

    Why do you think it wouldn't make it into trunk? I don't know about the blog_id 1 folder but I'm sure many installations have plugins that need only be executed on the admin screen, and it seems like a (comparatively) easy addition to the trunk code that would make many people's lives easier...

  20. drmike
    Member
    Posted 16 years ago #

    Donncha seems to be unwilling to add in modifications directly to the trunk code. A perfect example of this was the dropdown menu on the dashboard.

    In my opinion of course.

  21. lunabyte
    Member
    Posted 16 years ago #

    Just personal observation, really. Not saying it isn't a good idea, I'd love it to be available and make one less core edit for me.

    However, to half the folks out there it would probably be too confusing. They can barely keep the plugins and mu-plugins directories straight, let alone know what they are for and how they work. Adding in a 3rd one would probably make that worse, and in the end do more harm than good overall.

    To me, that's the sole reason I can go in and modify the core to suit my individual needs. Because MU (or WP for that matter) can't be everything to everyone, but I'm free to modify on top of the original tools I've been given so that I can make it sing and dance.

  22. lunabyte
    Member
    Posted 16 years ago #

    Hey D, I was reading back through this thread, and I have to agree with you.

    Your site is much, much improved in performance. Overall page weight is light too, and definitely needs a thumbs up for it.

    It took work to get it there, I can tell even without you mentioning it, but the overall effect should be worth it in user comments alone I would imagine.

  23. lunabyte
    Member
    Posted 16 years ago #

    @Doc, I agree.

    Not sure if it is purely not wanting to, or having his hands tied on what is acceptable to add into it.

    Keeping in mind that naturally the really good stuff is left to the end users to come up with, so that MU out of the box can't exactly compete with their bread and butter over in .com land.

  24. andrewbillits
    Member
    Posted 16 years ago #

    I get the impression that his hands are tied.

  25. lunabyte
    Member
    Posted 16 years ago #

    I couldn't agree more. Which, I mean, is understandable. I still think we're lucky to have MU at all, other than it provides a pretty solid testing ground for the code overall.

  26. andrewbillits
    Member
    Posted 16 years ago #

    Yeah, it's not really his fault. It's just the nature of the game. However, I do miss the old days where features were being added quite frequently.

  27. andrea_r
    Moderator
    Posted 16 years ago #

    hey, I'm nodding my head in agreement here too.

  28. lunabyte
    Member
    Posted 16 years ago #

    I'm sure you are.

About this Topic

  • Started 16 years ago by dsilverman
  • Latest reply from lunabyte