Databases (22 posts)

  1. frozonecold
    Member
    Posted 15 years ago #

    I am working on a website that I need to make scalable and I remembered all the scalability threads here. I was wondering if you guys could answer a few simple questions.
    1. What is the advantage of separating the databases by a hash of the blog id instead of splitting the database by blog id range (ids 1-500, 501-1000)?
    2. What are the advantages of having 4096 databases instead of 16? I would think that managing 4k databases would be a pain in the ass.
    3. For those of you splitting your databases by the blog id hash, what are your database names (curious as to what those 16-4096 names are)?
    Thanks guys.

  2. tdjcbe
    Member
    Posted 15 years ago #

    1) Load balancing mostly. You can spread the load among the different servers if you ever get that large. If you wind up with 501 blogs with your method, you've got one server holding 500 blogs and another server holding 1. In theory, hashing spreads out the load evenly.

    2) Number of files. Some OSes moan after a bit if you've got a huge number of files within a subdirectory. Remember, with a standard blog making 8 database tables and MyISAM storing each table as 3 files, that's 24 files created on the hard drive per blog. Those add up quickly.

    It's also easier to optimize smaller databases, and if you ever have to go into phpMyAdmin and manually edit a database, the program won't choke as much.

    3) I name them after all the partners Gene Simmons has had. :) Actually they get names like accountname_db###, where ### is the hex number of the database in question (i.e. 000 to fff).
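
    A minimal Python sketch of the naming scheme described in point 3. The thread does not say which hash function is used, so MD5 is an assumption here, and `shard_db_name` and the `accountname_db` prefix are illustrative names only.

```python
import hashlib


def shard_db_name(blog_id: int, hex_chars: int = 3,
                  prefix: str = "accountname_db") -> str:
    """Map a blog id to a database name via the leading hex chars of its hash.

    MD5 is an assumption; any hash with evenly spread output would do.
    hex_chars=1 gives 16 databases, 2 gives 256, 3 gives 4096.
    """
    digest = hashlib.md5(str(blog_id).encode("utf-8")).hexdigest()
    return prefix + digest[:hex_chars]


if __name__ == "__main__":
    # Neighbouring ids usually land in unrelated databases.
    for blog_id in (1, 2, 500, 501):
        print(blog_id, shard_db_name(blog_id))
```

    The same blog id always maps to the same name, so the lookup needs no central table of which blog lives where.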

  3. lunabyte
    Member
    Posted 15 years ago #

    Another note about blog_id ranges versus hashing, carrying on the balance theory: if you delete blogs (like splogs, etc.), the deleted blog_ids are never reused and leave gaps.

    So, essentially, with the first method, you end up with an unbalanced db load, as the list of blog ids looks like Swiss cheese.
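
    The Swiss cheese effect can be sketched with a small simulation. Everything here is hypothetical: 1600 blogs, ranges of 100 ids per database, a burst of splogs at ids 101-400 that were later deleted, and MD5 as the hash.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 16
BLOGS_PER_RANGE = 100


def range_shard(blog_id: int) -> int:
    """Range split: ids 1-100 go to shard 0, 101-200 to shard 1, and so on."""
    return (blog_id - 1) // BLOGS_PER_RANGE


def hash_shard(blog_id: int) -> int:
    """Hash split: the first hex char of the MD5 digest picks one of 16 shards."""
    digest = hashlib.md5(str(blog_id).encode("utf-8")).hexdigest()
    return int(digest[0], 16)


def surviving_blogs() -> list:
    """1600 blogs were created; ids 101-400 were splogs and have been deleted."""
    return [i for i in range(1, 1601) if not 101 <= i <= 400]


def shard_counts(shard_fn) -> Counter:
    """Count the surviving blogs per shard for the given sharding function."""
    counts = Counter({shard: 0 for shard in range(NUM_SHARDS)})
    counts.update(shard_fn(blog_id) for blog_id in surviving_blogs())
    return counts


if __name__ == "__main__":
    by_range = shard_counts(range_shard)
    by_hash = shard_counts(hash_shard)
    for shard in range(NUM_SHARDS):
        print(f"shard {shard:2d}: range={by_range[shard]:3d} hash={by_hash[shard]:3d}")
```

    With the range split, shards 1 to 3 end up empty while every other shard still holds 100 blogs. The hash split spreads the 1300 survivors across all 16 shards.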

  4. frozonecold
    Member
    Posted 15 years ago #

    Thanks mate. I have a question about number 3. I know that WordPress uses the first three characters of the hashed blog id. Are those hex values? If so, where can I find a list of these values? Thanks again mate.

  5. frozonecold
    Member
    Posted 15 years ago #

    Thanks lunabyte, that didn't bother me so much as the unbalanced load.

  6. lunabyte
    Member
    Posted 15 years ago #

    Easiest thing out there: The Multiple DB plugin from the WPMU Dev Premium site.

    All the work and hassle done.

  7. frozonecold
    Member
    Posted 15 years ago #

    Sorry, I did not make this clear. I won't be using WPMU. I posted my questions here because this is the only forum I have found with scaling experts. I want to take what works with MU and bring it to a completely different codebase. I can code a database connection script; I just wanted to know how and why y'all do the things you do. I really don't want to pay $50 to find out what the 16 possible first letters of a hash are, though.
    BTW, thank you, lunabyte, you have helped me a lot.

  8. lunabyte
    Member
    Posted 15 years ago #

    Yeah, that won't help if you're not running MU.

    However, same theory applies.

    Hashes balance out pretty well, regardless of what the id actually is.

  9. frozonecold
    Member
    Posted 15 years ago #

    I know. I was hoping someone could tell me what the 16 possible first letters of a hash are, though.

  10. andrewbillits
    Member
    Posted 15 years ago #

    0-9 and a-f = 0,1,2,3,4,5,6,7,8,9,a,b,c,d,e,f

    Thanks,
    Andrew

  11. frozonecold
    Member
    Posted 15 years ago #

    Thanks. Is there some sort of code that will generate the possible first 2 digits, or even 3 digits?

  12. frozonecold
    Member
    Posted 15 years ago #

    It would be every possible combination of those 16, but how could I generate these numbers?

  13. lunabyte
    Member
    Posted 15 years ago #

    Make a hash from a consistent source, and grab the characters desired.

    1 character = 16 possibilities, 2 = 256, 3 = 4096.

    Since this isn't MU related, there are probably better forums for this discussion though.
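
    For what it's worth, listing the possible prefixes does not require hashing anything: they are simply every hex string of the chosen length, 16^n of them. A short Python sketch (the function name `hex_prefixes` is made up for illustration):

```python
from itertools import product

HEX_DIGITS = "0123456789abcdef"


def hex_prefixes(length: int) -> list:
    """Return every possible hex prefix of the given length, in ascending order."""
    return ["".join(chars) for chars in product(HEX_DIGITS, repeat=length)]


if __name__ == "__main__":
    two = hex_prefixes(2)
    print(len(two), two[:3], two[-3:])
    print(len(hex_prefixes(3)))
```

    An equivalent one-liner for two characters is `[format(i, "02x") for i in range(256)]`.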

  14. frozonecold
    Member
    Posted 15 years ago #

    I would need at least 256 hashes to find all the possible first 2 letters of a hash.

  15. lunabyte
    Member
    Posted 15 years ago #

    There ya go then.

    Make a hash, get the first 2 characters, call it a day.

  16. frozonecold
    Member
    Posted 15 years ago #

    That would take all day. Or rather a few hours. I'll start now. 00, 01, 02, 03, 04, 05, 06... you see how this would get annoying. Is there any sort of generator for this? Maybe Andrew has a list?

  17. frozonecold
    Member
    Posted 15 years ago #

    I did it, but now I'm wondering if I should use the first 2 digits. Either way, here are the possibilities for the first 2 letters of a hash (note I haven't checked all of them):
    00
    01
    02
    .....
    .....
    .....
    f8
    f9
    fa
    fb
    fc
    fd
    fe
    ff

  18. lunabyte
    Member
    Posted 15 years ago #

    Wow. That was pretty rude posting that entire string.

    Seriously, if you're going to write something like this, you should already know enough about it to accomplish this.

    Honestly, since this is completely not related to MU, I'd be willing to bet that getting help with such a thing will be overlooked.

    I'd suggest finding a more suitable forum for your topic.

  19. andrewbillits
    Member
    Posted 15 years ago #

      I'll start now. 00, 01, 02, 03, 04, 05, 06... you see how this would get annoying.

    That was just a touch uncalled for. If you're looking at using 4096 databases for *anything* you should be able to come up with a script to generate a list.

      I'd suggest finding a more suitable forum for your topic.

    +1

  20. frozonecold
    Member
    Posted 15 years ago #

    I think it is a little late to find a more suitable forum, as I have already received answers to all of my questions. My script failed, so I did it by hand and posted it, because I figured that someone trying to split their database would find it useful.

  21. cafespain
    Member
    Posted 15 years ago #

    Sorry, I lost track of this half way down :) and don't quite get why the OP wants every possible combination?

    Can't you just use hexdec on the first x characters and name the databases db_1 to db_4096? Seems a lot simpler to me :)

    (Look, I used OP :) ... Do you guys realise how long it took me to work out what that meant?)
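
    A rough Python sketch of the hexdec idea, using `int(prefix, 16)` as the equivalent of PHP's `hexdec`. MD5 as the hash and the function names are assumptions for illustration. Note that the conversion yields 0 to 4095 for three characters, so the sketch adds 1 to land on db_1 to db_4096.

```python
import hashlib


def db_name_from_prefix(prefix: str) -> str:
    """Convert a hex prefix to a numbered database name, e.g. '000' to 'db_1'."""
    index = int(prefix, 16)  # 0 .. 16**len(prefix) - 1
    return f"db_{index + 1}"


def numbered_db_name(blog_id: int, hex_chars: int = 3) -> str:
    """Hash the blog id (MD5 assumed) and name the database by number."""
    digest = hashlib.md5(str(blog_id).encode("utf-8")).hexdigest()
    return db_name_from_prefix(digest[:hex_chars])


if __name__ == "__main__":
    print(db_name_from_prefix("000"), db_name_from_prefix("fff"))
    print(numbered_db_name(42))
```

    Numbered names avoid having to enumerate the hex prefixes at all; creating the databases becomes a plain loop from 1 to 4096.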

  22. frozonecold
    Member
    Posted 15 years ago #

    I wanted every possible combination because those were every possible first 2 digits of the hash. Your way is admittedly much cleaner though. If this works with my calendar I will eventually move it to my intranet MU install.
    BTW: I had to spend 20 minutes of thinking to figure out what OP stood for, so don't feel bad.
    BTW again: Why did you trim the string? It is useless to others now. I only posted that to make it easier for others who want to implement such a structure.

About this Topic

  • Started 15 years ago by frozonecold
  • Latest reply from frozonecold