Setting up a community dedicated-monster-of-a-server

Discussion in 'Planetary Annihilation General Discussion' started by cola_colin, October 10, 2014.

  1. harrierx

    harrierx Member

    Messages:
    81
    Likes Received:
    46
    I'm not sure how much weight you should put on the numbers reported by top. But anyway, I'd speculate that an overclocked 6 core CPU might perform better for the following reasons: a) Since there are fewer cores, you'll probably achieve a higher stable frequency per core. b) Distributing the game engine workload efficiently over the CPUs is hard, so I'd guess that the game is bottlenecked by e.g. single-threaded AI code. c) The engine is probably tuned for 4-core CPUs.
  2. harrierx

    harrierx Member

    Somebody should ask the devs how well the engine is expected to scale beyond 4-6 cores, or do some experimenting before spending $$$s.

    Some hosters offer hourly billed bare-metal servers, which might be ideal for experiments. This one might be ideal for checking out what you can squeeze out of a 2 CPU configuration, assuming that you can rent spot instances hourly: https://www.ovh.com/us/hpc/
  3. cola_colin

    cola_colin Moderator Alumni

    Messages:
    12,074
    Likes Received:
    16,221
    Well there is *something* that does distribute pretty well over more than 6 cores. So having 20% more per-core speed won't help if you don't have enough cores to have "one task per core". The bottleneck is the sim speed, which, as far as we know, is limited by a single thread. We need to get that single thread on the best core we can get and make sure that core is not bothered with other work, basically.
    I don't see why the engine would be tuned for 4 cores, nothing points towards that conclusion.
    But sure that's all speculation.

    EDIT:
    Well we did experiments with 16 core Amazon servers, and if top is right, it used more than 8 cores. The server also logs something about a thread-pool that had 14 threads on the 16 core machine, so there is something that they think can be spread over a threadpool. I *suspect* it is related to managing per player data.

    EDIT 2:
    That suspicion is based on the fact that a 32 player game used 800-900% CPU on a 16 core server at low sim speed due to many units.
    The 24 core Opteron server used ~600% with a 20 player game at low sim speed due to many units.
    Me alone on a quad-core server never used more than a little over 200%.
    All according to top.
    Last edited: October 18, 2014
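To read the top figures quoted in this post: in its default per-process mode, top counts each fully busy core as 100%, so the percentages translate directly into "cores kept busy". The snippet below is only an illustrative sketch of that arithmetic. The function names are made up for this example, and 850 is simply the midpoint of the reported 800-900% range.

```python
def busy_cores(top_cpu_percent: float) -> float:
    """Convert top's per-process %CPU (100% == one fully busy core) to cores."""
    return top_cpu_percent / 100.0


def machine_share(top_cpu_percent: float, total_cores: int) -> float:
    """Fraction of the whole machine in use (0.0 to 1.0)."""
    return busy_cores(top_cpu_percent) / total_cores


# Figures taken from the post; 850 is an assumed midpoint of "800-900%".
observations = [
    ("32 players on 16 cores", 850.0, 16),
    ("20 players on 24 cores", 600.0, 24),
    ("1 player on 4 cores", 200.0, 4),
]

for label, percent, cores in observations:
    print(f"{label}: about {busy_cores(percent):.1f} cores busy, "
          f"{machine_share(percent, cores):.0%} of the machine")
```

None of this says which threads are responsible for the load. It only shows that the larger games kept well over one or two cores busy.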
  4. harrierx

    harrierx Member

    Because that's the most common configuration found in today's PCs? Which doesn't mean that the engine can't scale beyond 4 cores, just that it might not do so well. Also, I'd assume that the multithreading primitives and data structures used in the code won't be tuned for multi-CPU NUMA configurations, but without any knowledge of the codebase this really is just speculation.
  5. cola_colin

    cola_colin Moderator Alumni

    Why would the server be optimized for that kind of pc? It's expected to run on cloud servers for the most part.
    So far the server has only run on Amazon cloud servers in a "one core per server instance" kind of manner. Garat said the server has only 2 threads overall: one sim thread and one server thread. Based on the findings from actually running big games, I kinda think that Garat's information about that is outdated, as it definitely tends to use many more CPU cores if the player count is high, and it also logs "Started threadpool with X threads", where X is the number of CPU cores - 2.
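The "cores - 2" observation can be written down as a one-line rule. This is a sketch of the behaviour inferred from the log line, not the server's actual code. The function name and the floor of one thread are assumptions added for the example.

```python
import os


def assumed_pool_size(cpu_cores: int) -> int:
    """Pool size under the observed 'number of cores - 2' rule.

    The floor of 1 is an assumption; the thread does not say what the
    server does on machines with one or two cores.
    """
    return max(1, cpu_cores - 2)


for cores in (4, 16, 24):
    print(f"{cores} cores -> {assumed_pool_size(cores)} pool threads")

# What the same rule would give on the machine running this script.
local_cores = os.cpu_count() or 1
print(f"this machine: {local_cores} cores -> "
      f"{assumed_pool_size(local_cores)} pool threads")
```

For 16 cores the rule gives 14 threads, which matches the pool size logged on the 16 core machine mentioned earlier in the thread.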
  6. harrierx

    harrierx Member

    You're right I was thinking about desktop PCs.

    Cloud servers are pretty different beasts, though the most "affordable" options, which Uber likely uses and tunes for, won't give you more than 4 physical cores either, and in addition the performance suffers from the VM overhead and other users on the same physical machine.
  7. harrierx

    harrierx Member

    That just means that it started some threadpool with X threads, not that e.g. the sim work actually runs on the threadpool. For example the network IO might run on the thread pool, but probably not the core simulation code.
  8. cola_colin

    cola_colin Moderator Alumni

    Sure that does not mean the sim has more threads. I think the sim is a single thread, as I said before.
    But there are other things, probably related to the number of players, that are processed by that threadpool.
    Read my edits above ^^
    Basically based on a bunch of tests on cloud servers with mostly 16 cores and up to 32 players my current assumptions are:

    - There is one sim thread. This is the part that I'd love to see improved ;)
    - There are a bunch more threads, mostly from that threadpool that sizes itself based on the available cores. They do *something* that becomes more and more work as the player number rises. So if you don't have enough cores to put this something on, it will end up on the same core that does the sim work => bad.

    EDIT:
    Since the max sim speed is limited by a single core, it is probably "best" to go with a high-clocked hexa-core i7 and play with at most 20-30 players to get the best ratio between sim speed and player numbers.
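One way to act on the assumptions in this post would be to reserve one core for the sim thread and confine everything else to the remaining cores. The sketch below shows the idea on Linux using Python's `os.sched_setaffinity`. It is hypothetical: the thread does not say how to identify the sim thread, so the thread IDs and the helper names `plan_affinity` and `pin` are placeholders for illustration.

```python
import os


def plan_affinity(total_cores: int, sim_core: int = 0):
    """Split cores into a set for the sim thread and a set for everything else."""
    if not 0 <= sim_core < total_cores:
        raise ValueError("sim_core must be one of the available cores")
    other_cores = set(range(total_cores)) - {sim_core}
    return {sim_core}, other_cores


def pin(task_id: int, cores: set) -> bool:
    """Restrict a process or thread id to the given cores (Linux only).

    Returns False when there is nothing to pin to or the platform lacks
    sched_setaffinity.
    """
    if not cores or not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(task_id, cores)
    return True


sim_set, pool_set = plan_affinity(total_cores=6)
print("sim thread cores:", sorted(sim_set))
print("other thread cores:", sorted(pool_set))

# Hypothetical usage, once the thread ids are known (for example from
# /proc/<pid>/task/):
#   pin(sim_thread_id, sim_set)
#   for tid in other_thread_ids:
#       pin(tid, pool_set)
```

Whether this helps in practice is untested here. It only encodes the "keep other work off the sim core" idea from the list above.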
  9. cola_colin

    cola_colin Moderator Alumni

    The nodes are pretty similar to the amazon servers that I've run most big games on so far.
  10. harrierx

    harrierx Member

    If the sim thread is the bottleneck, which seems likely, I'd assume that an overclocked quad core with lots of memory, an SSD and a gigabit port (with full bandwidth) probably will be best for performance.

    Other than the simulation there isn't much to do on the server except IO which I'd guess should run fine on 3 modern Intel cores.
  11. harrierx

    harrierx Member

    The AWS servers are shared VM servers, which makes a big difference. (I assume that the OVH spot nodes are physical machines with the specified Xeon configuration.)
  12. cola_colin

    cola_colin Moderator Alumni

    I thought so as well, but that's the part my tests with big games heavily disagree with.
    Why would the server use more than 800% according to top on an Intel Xeon E5-2680v2 based 16 core server?

    EDIT:
    Also I've not seen any steal time in top on any of those aws instances.
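Steal time can also be checked without watching top, by sampling the aggregate `cpu` line of `/proc/stat` twice and comparing the counters. The sketch below assumes the standard Linux column order (user, nice, system, idle, iowait, irq, softirq, steal, ...). The sample lines are invented numbers for illustration, not measurements from the AWS instances discussed here.

```python
def steal_fraction(before: str, after: str) -> float:
    """Share of CPU time stolen by the hypervisor between two 'cpu' lines."""
    def counters(line: str):
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
        return [int(value) for value in parts[1:]]

    deltas = [b - a for a, b in zip(counters(before), counters(after))]
    if len(deltas) < 8:
        return 0.0  # very old kernels do not report a steal column
    # Guest time is already counted inside user/nice, so only the first
    # eight columns are summed to avoid counting it twice.
    total = sum(deltas[:8])
    return deltas[7] / total if total else 0.0


# Made-up sample lines, as if read from /proc/stat a few seconds apart.
sample_before = "cpu  1000 0 500 8000 100 0 50 0 0 0"
sample_after = "cpu  1600 0 700 8900 120 0 60 50 0 0"
print(f"steal: {steal_fraction(sample_before, sample_after):.1%}")
```

A value that stays at or near zero under load would be consistent with the observation in this post that top showed no steal time.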
  13. cola_colin

    cola_colin Moderator Alumni

    So after some more discussion on IRC I say:
    Let's do some tests :)

    I wonder if we can get 20-30 players for some test games tomorrow afternoon till evening? So basically try to set up the same game first on a bare-metal quad-core Xeon 1270 V3 from SoftLayer and then on a 16 core AWS instance? Or maybe the other way around, so I can measure the bandwidth usage on AWS to know whether we need more than 100 Mbit/s on the SoftLayer server, as that is a little more costly there.
    It would be especially cool if we could get people to select the same slots/spawns in those games and even try to play the same unit compositions.

    I'd use maybe $10 or so of the current donations for that. Basically the money I donated myself xD
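The bandwidth question in this post comes down to simple arithmetic on two readings of a network interface's transmitted-bytes counter. The sketch below shows that calculation. The function names, the sample counter values and the 80% headroom margin are all assumptions made for the example, not figures from the tests.

```python
def mbit_per_second(bytes_before: int, bytes_after: int, seconds: float) -> float:
    """Average throughput in Mbit/s between two byte-counter readings."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (bytes_after - bytes_before) * 8 / seconds / 1_000_000


def fits_link(peak_mbit: float, link_mbit: float = 100.0,
              headroom: float = 0.8) -> bool:
    """True if the peak stays within an assumed safety margin of the link."""
    return peak_mbit <= link_mbit * headroom


# Invented readings: 450 MB sent during a 60 second window.
rate = mbit_per_second(1_000_000_000, 1_450_000_000, 60.0)
print(f"average: {rate:.1f} Mbit/s, "
      f"fits a 100 Mbit/s port: {fits_link(rate)}")
```

An average over a long window can hide short bursts, so sampling at short intervals during the busiest part of a game would give a better picture of the peak.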
  14. tatsujb

    tatsujb Post Master General

    Messages:
    12,902
    Likes Received:
    5,385
    I'm going to build something off of a Core i7-5960X Extreme Edition instead
  15. carn1x

    carn1x Active Member

    Messages:
    389
    Likes Received:
    156
    Sorry slightly off topic. How are you exceeding 10 players? Is this a mod?
  16. Tontow

    Tontow Active Member

    Messages:
    459
    Likes Received:
    64
    Sorry for the confusion, I meant advertisement.

    Even though it may not generate that much money per view, it will stack up very fast. I don't think anyone would object to having a small bar at the top or bottom of the screen that has an advertisement in it.

  17. cola_colin

    cola_colin Moderator Alumni

    Yes. You need to modify the server and use the unlimited players ui mod for the host.
  18. Quitch

    Quitch Post Master General

    Messages:
    5,850
    Likes Received:
    6,045
    I found spectators also need the mod if you break the standard spectator limit, otherwise they don't get the option to join.
  19. cola_colin

    cola_colin Moderator Alumni

    Well that's new. Never tried to have that many spectators before.
  20. teddythebear

    teddythebear Member

    Messages:
    50
    Likes Received:
    21
    Do replays create the same server load? Could a replay be used to benchmark different server configurations? It would be nice if you didn't always need a group of players around to test and it would give more repeatability.
