The MU forums have moved to WordPress.org

Thoughts on Scaling (9 posts)

  1. matt
    Key Master
    Posted 18 years ago #

    Just some thoughts about architecture and ideas:

    It would make sense to have a single global user DB with some minimal information, which could be easily replicated to MySQL slaves; even if it gets to a hundred million rows, that's fairly easy to scale. Why users and not blogs? Because this would allow a single user to have multiple blogs.

    It would also be nice to have a global blog lookup table which could be the key for flexible domain addressing. So a WPMU server gets a request for "matt.example.com" with a path of "/dog/archives/". It looks up matt.example.com in the database and sees that the blog is on DB cluster 4 and that /dog is the Dog Blog, which has an ID of 94105 and a table prefix of 'dogblog_'. Then all the heavy crunching can be done against the individual cluster.
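
    A minimal sketch of that lookup, in Python rather than WPMU's PHP. The in-memory dict stands in for the global lookup table, and the names BlogLocation, BLOG_LOOKUP, first_segment and locate_blog are hypothetical, not part of WPMU. Only the example values (cluster 4, ID 94105, prefix 'dogblog_') come from the post above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BlogLocation:
    """Where a blog lives (hypothetical record layout, not WPMU's schema)."""
    blog_id: int
    cluster: int
    table_prefix: str


# Stand-in for the global lookup table, keyed by (host, first path segment).
# A real deployment would keep this in a small, replicated database table.
BLOG_LOOKUP = {
    ("matt.example.com", "dog"): BlogLocation(94105, 4, "dogblog_"),
}


def first_segment(path: str) -> str:
    """Return the first non-empty path segment, or '' for the root path."""
    parts = [p for p in path.split("/") if p]
    return parts[0] if parts else ""


def locate_blog(host: str, path: str) -> Optional[BlogLocation]:
    """Resolve a request to its blog record, or None if nothing matches."""
    return BLOG_LOOKUP.get((host.lower(), first_segment(path)))
```

    Calling locate_blog("matt.example.com", "/dog/archives/") returns the Dog Blog record, so the server knows which cluster and table prefix to use before running any heavy queries.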

    It would also be possible to have some sort of domain-based key with random distribution. For example, you could take an MD5 hash of the incoming username, then use the first character of that hash to go to one of 16 clusters or databases. If you grow beyond that, start using the first two characters, which increases your number of machines by a factor of 16 (to 256). Then when you grow more, use the first 3 characters (4,096)... :)
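
    The hash-prefix idea can be sketched like this, in Python for illustration. The function names shard_for and shard_count are hypothetical; hashlib.md5 is from the standard library.

```python
import hashlib


def shard_for(username: str, prefix_len: int = 1) -> int:
    """Map a username to a shard number using the first hex digits of its MD5.

    prefix_len=1 gives 16 possible shards, prefix_len=2 gives 256, and so on.
    """
    digest = hashlib.md5(username.encode("utf-8")).hexdigest()
    return int(digest[:prefix_len], 16)


def shard_count(prefix_len: int) -> int:
    """Number of distinct shards addressable with this many hex digits."""
    return 16 ** prefix_len
```

    One consequence worth noting: going from one character to two splits each existing shard into 16 new ones, so a blog's shard number changes and its data would most likely need to be moved (or the old mapping kept for existing blogs).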

    If everything is on one machine, all the pointers can be for localhost, databases on the machine, and local table prefixes.

  2. andrewbillits
    Member
    Posted 18 years ago #

    But what if the need for several machines arises? If you have several hundred databases on one machine, everything will slow down. Wouldn't it be better to have the databases spread across several machines at that point? Or at least have the databases on a separate machine from the one serving up the site itself.

  3. NetAndif
    Inactive
    Posted 18 years ago #

    I like this idea.
    What about one machine for the frontend, one with all the user tables, and one for every blog-hosting server, each of which has its own copy of the user tables, kept in sync with the main DB? All this implemented in a big cluster, hosting hundreds of virtual machines... and so on... ;-)

  4. andrewbillits
    Member
    Posted 18 years ago #

    The main reason for my idea is that you have to think about downtime. WPMU is a very database-intensive setup, and with several hundred thousand blogs, the databases will have an error every now and then. I think it's better to have the actual site on one machine and the databases on another, because if a process goes haywire and MySQL sends the CPU skyrocketing, at least you will have decent load times on the actual site and be able to tell users there is a database problem. Otherwise the entire site would be inaccessible and users would flock to wordpress.org asking what the heck is wrong.

    Just my thoughts.

  5. matt
    Key Master
    Posted 18 years ago #

    Andrew, in this scenario you can have one or a few databases on a single machine or the blog databases spread across any number of machines. It's much easier to scale because you don't have to have complicated replication, you can just scale horizontally. (By adding more machines.)

  6. andrewbillits
    Member
    Posted 18 years ago #

    Basically my scenario is just yours, but slightly modified, with the site and databases served on different machines and no replication of any data. Replicating data would, in my opinion, be pointless for this application, as it would create a 5-second to 5-minute delay before all the data is in sync and waste a lot of resources.

  7. andrewbillits
    Member
    Posted 18 years ago #

    Basically, we need to develop WPMU so that the client has two options when installing it.

    One: Install on a single DB.
    Two: Install on scalable clusters. We would also need to build into the admin console a way of managing these clusters (stats, DB size, etc.), but only have this appear if the scalable option is chosen.

    While I'm on the subject of the admin console: instead of having the admin console as a plugin accessible only from the "main" blog, wouldn't it be better to create a standalone control panel that would allow for many admins with different access levels? I just believe that it would be more secure this way than having it as a plugin.

  8. amanzi
    Member
    Posted 17 years ago #

    Hey - just found this old thread, and as I've posted a few other questions about scalability, I wanted to bump this to see what thoughts have evolved regarding the scalability of WPMU.

    To me there are three areas of WPMU which need to be handled differently for scalability: the site content, user-uploaded content, and the database/s.

    • Site content would be the actual WPMU files, or, the "front-end". This can be scaled by adding several load-balanced webservers.
    • User-uploaded content is the stuff that users upload in the "write post" screens. This needs to be stored on a separate server if there are multiple frontend servers. The easiest way to do this seems to be to mount an NFS share at the "blogs.dir" directory. The NFS servers can then be scaled using replication or possibly even a SAN.
    • The databases are then stored on a dedicated database server or a cluster of servers for scalability. Unfortunately I'm no SQL guru, so I can't say whether it's better to have fewer, larger databases (like Lyceum) or more, smaller databases (like wordpress.com).

    A WPMU site should be fairly easy to scale, but anyone needing to scale the service past 10,000 blogs should have a pretty in-depth understanding of how the entire system works.

    In my mind this is the sequence that the scalability should follow:

    1. For up to a few thousand blogs, everything can run on a single, dedicated server.
    2. For more than a few thousand blogs up to around 10,000 blogs, move the database/s onto a dedicated server. You should also start looking at redundancy of the database server - probably a cluster.
    3. For tens of thousands of blogs, the user-uploaded files will need to move onto a dedicated file server. If you haven't implemented any redundancy yet, now would be the time to do it.

    Once you get to step three, everything can be scaled horizontally. More front-end servers can be added to handle the load. Multiple file servers can store the hundreds (or thousands) of gigs of user-uploaded files. And more database servers can be added to the cluster to spread the databases over more servers.

    Other technology can be added to the mix to improve performance too, e.g. Squid proxy servers can be added for improved caching of the web content; SANs can be used to store the data; and there's probably SQL technology that I don't even know about that can improve performance on that side.

    I'd love to hear more from others about their thoughts on scaling WPMU...

  9. drmike
    Member
    Posted 17 years ago #

    Old thread, but why wouldn't this work? When PHP hits db.php, take the username.mydomain.tld, do a lookup against a separate table that exists within its own database. Certain blogs would be in database one, some in DB #2, etc., and db.php would return the necessary information.
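
    Roughly what that could look like, sketched in Python with an in-memory SQLite database standing in for the separate MySQL lookup database. The table name blog_lookup, the database names wpmu_db1 and wpmu_db2, and both functions are made up for illustration; a real db.php would do the equivalent in PHP.

```python
import sqlite3
from typing import Optional


def build_lookup_db() -> sqlite3.Connection:
    """Create the separate lookup database (in-memory SQLite for this sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE blog_lookup (domain TEXT PRIMARY KEY, db_name TEXT NOT NULL)"
    )
    # Hypothetical rows: which blog lives in which database.
    conn.executemany(
        "INSERT INTO blog_lookup (domain, db_name) VALUES (?, ?)",
        [("alice.mydomain.tld", "wpmu_db1"), ("bob.mydomain.tld", "wpmu_db2")],
    )
    conn.commit()
    return conn


def database_for(conn: sqlite3.Connection, domain: str) -> Optional[str]:
    """Return the name of the database holding this blog, or None if unknown."""
    row = conn.execute(
        "SELECT db_name FROM blog_lookup WHERE domain = ?", (domain.lower(),)
    ).fetchone()
    return row[0] if row else None
```

    The lookup costs one extra query per request, so in practice the result would probably be cached rather than fetched every time.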
