Being close to reaching the dreaded 32000-blog limit in WPMU, I've been studying an architecture that would allow an MU site to scale virtually indefinitely.
In this thread: http://mu.wordpress.org/forums/topic.php?id=1343 , Donncha explains some parts of the wp.com setup. Matt has also given some clues about their layout on his blog. Basically, from what I understand, it is a cluster of web servers, multiple rsync'ed NFS servers and database servers.
I wasn't exactly looking forward to implementing such a thing, because NFS means UDP, which means a lot of bandwidth, which means a VPN, which means expenses. The whole setup also meant quite a few modifications to the code, and it looks to me like there would be more and more problems as new servers get added.
So, I've been looking into implementing another solution. You're probably interested in reading the following (or at least I'm interested in you reading it ;-) ) if:
- you're a WP developer.
- you have sysadmin experience.
- you plan on having thousands, tens of thousands or more users in your service.
I'm looking for feedback on the architecture I plan, so that if someone spots a weakness I have overlooked, I can avoid wasting a lot of time implementing it.
Here is a diagram of the solution I plan to implement:
http://www.creerunblog.fr/scaling.gif
The idea is to receive all incoming requests on a single HTTP server.
This server, using mod_proxy and mod_rewrite, will route requests to X backend servers, acting as a reverse proxy.
This can be done very simply once mod_proxy is installed, by adding lines such as:
RewriteEngine on
RewriteRule ^/?t(.*)$ http://somewhere.com/ [P,L]
This would route all requests starting with a t to the site somewhere.com and present its contents to the user as if delivered by the front server.
In the following I'll give examples using my own site's URLs ( http://unblog.fr/ ).
So, with a set of directives like:
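RewriteEngine on
# proxies abc.unblog.fr to abc.unbloga.fr
# (the host is tested with RewriteCond, because a RewriteRule pattern only sees the URL path)
RewriteCond %{HTTP_HOST} ^(a[^.]*)\.unblog\.fr$ [NC]
RewriteRule ^/?(.*)$ http://%1.unbloga.fr/$1 [P,L]
# proxies bcd.unblog.fr to bcd.unblogb.fr
RewriteCond %{HTTP_HOST} ^(b[^.]*)\.unblog\.fr$ [NC]
RewriteRule ^/?(.*)$ http://%1.unblogb.fr/$1 [P,L]
RewriteCond %{HTTP_HOST} ^(c[^.]*)\.unblog\.fr$ [NC]
RewriteRule ^/?(.*)$ http://%1.unblogc.fr/$1 [P,L]
RewriteCond %{HTTP_HOST} ^(d[^.]*)\.unblog\.fr$ [NC]
RewriteRule ^/?(.*)$ http://%1.unblogd.fr/$1 [P,L]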
etc.
And combined with the appropriate entries for unblogX.fr in bind, I can route requests to different servers based on the blog name.
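To make the mapping concrete, here is a minimal Python sketch of what the front server is meant to compute for each request: take the blog name from the host, look at its first letter, and build the backend URL. The function name and the decision to reject hosts that don't start with a-z are assumptions made for illustration; the real routing would be done by mod_rewrite, not by Python.

```python
from typing import Optional
from urllib.parse import urlsplit

FRONT_DOMAIN = "unblog.fr"  # public domain served by the front proxy


def backend_url(url: str) -> Optional[str]:
    """Return the backend URL for a public blog URL, or None if it is not routable."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # urlsplit already lowercases the hostname
    suffix = "." + FRONT_DOMAIN
    if not host.endswith(suffix):
        return None
    blog = host[: -len(suffix)]
    # Assumption: only single-label blog names starting with a-z are routed.
    if not blog or "." in blog or not ("a" <= blog[0] <= "z"):
        return None
    target = "http://%s.unblog%s.fr%s" % (blog, blog[0], parts.path or "/")
    if parts.query:
        target += "?" + parts.query
    return target


print(backend_url("http://abc.unblog.fr/2008/01/?p=3"))
```

The same function could also tell a signup script which environment a new blog belongs to, since the environment is fully determined by the blog name.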
Then, on the backend servers, I have one instance of MU installed for each domain like "unblogX.fr", with the scripts, blog files and database located on the same server.
Only the main set of tables is located on a single server (the front server, for instance), and MU needs to be modified so that queries on the main tables go to this database instead of localhost.
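The split between main and per-blog tables could look roughly like the Python sketch below. The table names listed as global, the host names and the naive regex are all assumptions for illustration; in MU the equivalent decision would have to live in the PHP database layer.

```python
import re

# Assumption: these are the site-wide tables kept on the main database.
GLOBAL_TABLES = {"wp_blogs", "wp_site", "wp_sitemeta", "wp_users", "wp_usermeta"}

MAIN_DB_HOST = "maindb.example.internal"  # hypothetical host of the main tables
LOCAL_DB_HOST = "localhost"               # each backend's own database

# Naive table-name extraction; good enough for simple single-table queries.
_TABLE_RE = re.compile(r"\b(?:FROM|JOIN|INTO|UPDATE)\s+`?(\w+)`?", re.IGNORECASE)


def db_host_for(query: str) -> str:
    """Pick the database host a query should be sent to."""
    tables = _TABLE_RE.findall(query)
    if any(table.lower() in GLOBAL_TABLES for table in tables):
        return MAIN_DB_HOST
    return LOCAL_DB_HOST


print(db_host_for("SELECT * FROM wp_blogs WHERE domain = 'abc.unblog.fr'"))
print(db_host_for("SELECT * FROM wp_12_posts WHERE ID = 1"))
```

One thing this sketch makes visible: a query that joins a main table with a per-blog table cannot be answered by either host alone, so any such queries in MU would need to be split.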
Advantages of this solution over the wp.com one:
- Much less code needs modification regarding database access and uploaded file access. Only the requests to the main tables need to be redirected to the main database, otherwise all requests are treated locally.
- No need for NFS drives.
- No need for dynamic sync between any parts of the system. Every instance of MU acts as a normal one with its own set of bloggers, its own set of uploaded files, its own DB. This means optimal performance on the local machines, and very little overhead compared to a simple MU installation: only the data transfer between the proxy and the backend server is added.
- No problem with the 32000 bloggers limit anymore. If any letter reaches 32000 bloggers, you can split it with rewrite rules like:
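RewriteCond %{HTTP_HOST} ^(aa[^.]*)\.unblog\.fr$ [NC]
RewriteRule ^/?(.*)$ http://%1.unblogaa.fr/$1 [P,L]
RewriteCond %{HTTP_HOST} ^(ab[^.]*)\.unblog\.fr$ [NC]
RewriteRule ^/?(.*)$ http://%1.unblogab.fr/$1 [P,L]
RewriteCond %{HTTP_HOST} ^(ac[^.]*)\.unblog\.fr$ [NC]
RewriteRule ^/?(.*)$ http://%1.unblogac.fr/$1 [P,L]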
etc.
- One server gets overloaded? A disk gets full? Add a new server and move some of your users onto it. Easy!
- You can split into as many environments as needed. Smaller databases -> easier backups, easier script updates, less downtime for users (particularly when upgrading the DB).
- It's easy to beta-test new features, just create a test environment like:
RewriteCond %{HTTP_HOST} ^test\.unblog\.fr$ [NC]
RewriteRule ^/?(.*)$ http://test.unblogtest.fr/$1 [P,L]
and make your changes in the unblogtest environment. When you're happy, push the files to the other environments.
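Once letters start being split, the order of the rules matters: a two-letter rule such as "aa" has to be tried before the one-letter "a" rule, otherwise the shorter prefix catches everything. Below is a small Python sketch that generates the rule set with the longest prefixes first. The generator itself, its function name, and the choice to test the host with RewriteCond %{HTTP_HOST} (a RewriteRule pattern only sees the URL path) are assumptions for illustration, not something taken from the wp.com setup.

```python
import re


def rewrite_rules(prefixes, front="unblog.fr"):
    """Build mod_rewrite lines for the given blog-name prefixes, longest first."""
    lines = ["RewriteEngine on"]
    # Longest prefixes first so "aa" wins over "a"; ties sorted alphabetically.
    for prefix in sorted(prefixes, key=lambda p: (-len(p), p)):
        host_re = "^(" + re.escape(prefix) + "[^.]*)\\." + re.escape(front) + "$"
        lines.append("RewriteCond %{HTTP_HOST} " + host_re + " [NC]")
        lines.append("RewriteRule ^/?(.*)$ http://%1.unblog" + prefix + ".fr/$1 [P,L]")
    return lines


for line in rewrite_rules(["a", "aa", "ab", "b"]):
    print(line)
```

Generating the file from a list of prefixes also gives a single place to record which environments exist, which a blog-creation script or a user-moving script could reuse.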
Drawbacks of the solution:
- By default, each unblogX.fr environment expects to run on its own domain and will generate HTML pages with that domain. Most URLs in MU should be relative, but any absolute ones will need the domain fixed. This can probably be done by overriding the HOST variable in MU.
- The IP of incoming requests is always the proxy's. IP detection in MU needs to be changed to read the HTTP_X_FORWARDED_FOR variable, which mod_proxy sets on proxied requests (HTTP_VIA only identifies the proxies a request went through, not the client).
- wp-newblog.php needs to be modified to create the blog in the right environment.
- Many entries need to be created in BIND, and although you don't need to actually register the domains, I'm not too sure what happens if unblogX.fr already exists as a registered domain.
- A script to move users' tables and files around is needed (as it would be in any multi-database setup).
That's about all I see. Please tell me if you think I overlooked something, or if there is a big hidden drawback I didn't think about.
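For the IP detection drawback, the logic could be sketched as follows in Python (the real change would be in MU's PHP). The trusted proxy address and the function name are hypothetical. The sketch only believes the forwarded header when the request really comes from the front server, and it takes the last entry of the list, since mod_proxy appends the address it saw to any header value the client may have sent itself.

```python
import ipaddress
from typing import Optional

# Assumption: address of the front proxy as seen by the backends.
TRUSTED_PROXIES = {"10.0.0.1"}


def client_ip(remote_addr: str, forwarded_for: Optional[str]) -> str:
    """Return the best guess for the real client IP behind the reverse proxy."""
    # Requests that bypass the proxy, or carry no header, keep their socket address.
    if remote_addr not in TRUSTED_PROXIES or not forwarded_for:
        return remote_addr
    # The last entry is the one added by our own proxy; earlier ones can be forged.
    candidate = forwarded_for.split(",")[-1].strip()
    try:
        ipaddress.ip_address(candidate)
    except ValueError:
        return remote_addr  # malformed header: fall back to the proxy address
    return candidate


print(client_ip("10.0.0.1", "192.0.2.44, 203.0.113.7"))
```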