ELO vs. TRUE SKILL

Discussion in 'Planetary Annihilation General Discussion' started by tatsujb, July 8, 2013.


ELO or TRUE SKILL?

  1. ELO: 18 vote(s), 29.0%
  2. TRUE SKILL: 44 vote(s), 71.0%
  1. tatsujb

    tatsujb Post Master General

    Messages:
    12,902
    Likes Received:
    5,385
    I don't even like that song :(
  2. tatsujb

    tatsujb Post Master General

    Messages:
    12,902
    Likes Received:
    5,385
    BOOOOOOOOOM baybe
  3. stormingkiwi

    stormingkiwi Post Master General

    Messages:
    3,266
    Likes Received:
    1,355
    I don't even understand anything at all
    Zoliru likes this.
  4. ghostflux

    ghostflux Active Member

    Messages:
    389
    Likes Received:
    108
    So how is a skill rating system not hard to do, when I've yet to see a truly good one? More often than not I get stuck in this ELO limbo, in pretty much every game that has it, and I end up either being way better than the people I get matched up with, or way worse.
  5. liquius

    liquius Well-Known Member

    Messages:
    731
    Likes Received:
    482
    Well, that just depends on a few small factors. If the pool of players looking for a game is too small, you either have to wait a long time or you get games with a larger skill gap.

    Also, if you are playing in multiple regions then it won't help. There is often a skill difference between regions due to people only playing on one region.

    You can make them as simple as this: winning gains you points and losing loses you points. If your opponent has a higher rating then you gain more/lose fewer points. If your opponent has a lower rating then you gain a few or lose lots depending on win/loss.

    In my experience, rating systems that go into more depth than that often fail to capture people's skill better than the simple way I described.
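
For reference, the scheme described in the post above (gain more for beating a higher-rated opponent, lose more for losing to a lower-rated one) is roughly what the standard Elo update does. Here is a minimal sketch, assuming the usual 400-point logistic scale and a K-factor of 32; the function names and the K value are illustrative choices, not anything a particular game is known to use.

```python
def expected_score(rating, opponent_rating):
    """Expected score (between 0 and 1) of `rating` against `opponent_rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def update(rating, opponent_rating, won, k=32):
    """Return the new rating after one game. `k` is an assumed K-factor."""
    score = 1.0 if won else 0.0
    return rating + k * (score - expected_score(rating, opponent_rating))


# Two equally rated players: the winner gains half of K.
print(update(1500, 1500, won=True))  # 1516.0

# Beating a stronger opponent is worth more than beating a weaker one.
print(update(1500, 1700, won=True) - 1500)
print(update(1500, 1300, won=True) - 1500)
```

The only tunable here is `k`: a larger value lets ratings move faster but also makes them noisier.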
  6. ghostflux

    ghostflux Active Member

    Messages:
    389
    Likes Received:
    108
    This is unfortunately very often the case. Many smaller games use this despite not having the playerbase to make the system work. Larger games, on the other hand, often offer separate leagues where it's easy to find games when you're new or relatively average, but it gets increasingly hard to find a game once you're above average.

    I am somewhat familiar with how the ELO system works, but the practical use of such a system is fairly limited by its constraints, such as requiring a large pool of players.

    If this is true, then that just means that making a proper rating system is complex beyond our current knowledge, which is why I questioned whether it was easy to make.
  7. cola_colin

    cola_colin Moderator Alumni

    Messages:
    12,074
    Likes Received:
    16,221
    Making a decent rating system is neither magic nor an overly complex task, I would say.

    It's not the fault of ELO/Trueskill/whatever if it fails for that reason. With a very small playerbase you basically have no way at all to constantly get equal matches. If there is no equal match available there just is none. No rating can fix that.
    stormingkiwi and kayonsmit101 like this.
  8. Dementiurge

    Dementiurge Post Master General

    Messages:
    1,094
    Likes Received:
    693
    The only way I can see rating systems being perceived as adequate in video games is if they can fairly bucket 90% of players in fewer than 5 games, and accurately place 50% of players in fewer than 20, while at the same time estimating conservatively (so a successful 1-game player doesn't get thrown into the top bucket). ELO needs hundreds of games to do this, while True Skill needs dozens.

    I'm not even sure if such a goal is possible.
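
On "estimating conservatively": TrueSkill-style systems track both an estimated skill (mu) and an uncertainty (sigma), and a common convention is to rank players by mu minus a few sigmas, so that a player with only one lucky game is not placed at the top. A rough sketch of that idea follows; the function name is hypothetical, the multiplier of 3 is the commonly cited convention, and the example numbers are made up for illustration.

```python
def conservative_rating(mu, sigma, k=3.0):
    """Lower-bound style rating: estimated skill minus k times its uncertainty."""
    return mu - k * sigma


# Illustrative (made-up) numbers: a newcomer after one win still has a large
# sigma, while a veteran with a lower mu has a small sigma.
newcomer = conservative_rating(mu=30.0, sigma=7.0)
veteran = conservative_rating(mu=28.0, sigma=1.5)
print(newcomer, veteran)  # 9.0 23.5
```

Under this convention the veteran ranks above the newcomer until the newcomer has played enough games for sigma to shrink.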
  9. sebovzeoueb

    sebovzeoueb Active Member

    Messages:
    110
    Likes Received:
    71
    I'm not sure what system the Starcraft 2 ladder uses, but I've found it to be pretty decent at putting you against a player of a roughly similar level most of the time, and adding/deducting fewer or more points based on the opponent's ranking. While your exact ranking within a league isn't super meaningful, I think the leagues themselves provide a pretty good idea of a player's skill.

    Obviously for team leagues things get a little more complicated, but I like the fact that when you play as a party your party gets a ranking rather than each individual player. I haven't really played much more than 1v1 on my own, so I can't speak for the accuracy of the ladder when playing, for example, 3v3 with randoms.
  10. tatsujb

    tatsujb Post Master General

    Messages:
    12,902
    Likes Received:
    5,385
    Would you say FAF's playerbase is big or small? Because either way its rating and matchmaking system works, and most people criticize FAF as having a "puny" playerbase.
  11. cola_colin

    cola_colin Moderator Alumni

    Messages:
    12,074
    Likes Received:
    16,221
    I've left FAF because the ladder, even though technically a pretty good implementation, did not feel very alive to me.
    I often searched without any results and just gave up in the end. So I'd say at least in the higher rating regions it did (does?) suffer from a lack of players.
    stormingkiwi likes this.
  12. tatsujb

    tatsujb Post Master General

    Messages:
    12,902
    Likes Received:
    5,385
    try it again, it's been tweaked and perfected, everybody has a blast now.

    StarCraft 2 uses Elo, and that's the entire question: PA is not the same game, and it requires a rating system adapted to it. In StarCraft you can get away with treating a 3v3 as a 1v1 and redistributing the scores equally. In PA you likely can't. There are just too many things in motion to call it chess, even if, when you dumb it down to the utmost, it is chess.
    Last edited: December 25, 2013
  13. Devak

    Devak Post Master General

    Messages:
    1,713
    Likes Received:
    1,080
    Well, if it's really just math then... yeah, it's not hard. I think it's a matter of checking what exactly Microsoft has patented for TrueSkill.
  14. jacob29

    jacob29 Member

    Messages:
    58
    Likes Received:
    8
    Why would we use the Electric Light Orchestra?

    oh you mean Elo... I see..
    stormingkiwi likes this.
  15. cola_colin

    cola_colin Moderator Alumni

    Messages:
    12,074
    Likes Received:
    16,221
    This whole "oh noes patents" is pretty weird imho. The basic idea behind elo, glicko, trueskill and whatever else is always roughly the same as far as I understand. So just reading into all of those and afterwards implementing your own thing is not an issue at all. Call it UberRating and be done with it.

    As I said, a lack of players at your skill level cannot be fixed by improving rating systems. And I am fairly certain the issue was that there were just too few active good players. In fact I just checked, and it still looks just as bad as it did maybe a year ago when I gave up: #1 currently has 2219 points, #50 has 1629 points. So if you happen to play at the top, 1800+ or 1900+, the ladder feels very dead.
  16. tatsujb

    tatsujb Post Master General

    Messages:
    12,902
    Likes Received:
    5,385
    This.
    Point taken, I guess it's only as enjoyable when, like me, you're in the bottom lot. Still, right now it's Christmas and there aren't that many people on, but FAF is now used to having 800 players in the evening regularly, and the community's growing. That's quite a few more 2000ers, and those guys tend to hit the ladder button more frequently than anyone else. Not only that, but now you have a notification at the bottom of FAF, visible at all times when a player of your ranking is searching, complete with a button for each race. You just click the race button and you're in.
  17. bradburning

    bradburning Active Member

    Messages:
    187
    Likes Received:
    102
    We are lucky that we already have games like StarCraft 2 and League of Legends to look at for ranking systems. They have very similar ranking systems, but they handle multiple people very differently.

    SC2 gives a separate rating for everything, so you have a 1v1 ranking, an XvX ranking for each person (or group) you queue with, and another for when you queue with randoms.

    League of Legends has a much harder time doing this because, well, it's a 5v5 game, which makes it tricky. So you just have one rating for you as a player, and you can only queue with one other person if you choose to.

    We can look at these two examples for ages, but the big thing they have over us is number of players, so making any system that they use viable for PA is going to be near impossible. So let's look at what may be the closest game to us in terms of the size of its player base: Company of Heroes 2.

    The CoH2 ladder system did not come to the public until about 6 months after release, and not even in game. Only after the community heads over at coh2.org were given access to the stats were they able to put something up through their site. When the ladder went up, the guy on top had not played since early beta and had only played about 30 games, and the way the rankings were done, no one could get above him with his pretty much perfect win rate.

    So the point I am making is that a game's ladder needs a half-life to some degree, so that, say, it gives less weighting to games over a month old, etc. The ladder should represent the current best players. Also, the number of games you have played should not impact your ranking nearly as much as who you played against, so someone does not need to play a lot to be at the top.
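
The "half life" idea in the post above can be sketched as an exponential weight on each game's age. This is only an illustration: the 30-day half-life, the function names, and the use of a weighted win rate (rather than a full rating that also accounts for opponent strength, as the post asks for) are all assumptions made to keep the example short.

```python
def game_weight(age_days, half_life_days=30.0):
    """Weight of a game that is `age_days` old; halves every `half_life_days`."""
    return 0.5 ** (age_days / half_life_days)


def weighted_win_rate(games, half_life_days=30.0):
    """`games` is an iterable of (age_days, won) pairs; recent games count more."""
    total = 0.0
    wins = 0.0
    for age_days, won in games:
        weight = game_weight(age_days, half_life_days)
        total += weight
        if won:
            wins += weight
    return wins / total if total else 0.0


# A player with 30 old wins from "early beta" and nothing recent...
inactive = [(200, True)] * 30
# ...still shows a perfect win rate, so decay alone is not enough:
print(weighted_win_rate(inactive))  # 1.0
# but each of those games now carries very little weight.
print(game_weight(200))
```

As the example suggests, decayed weights only help if the ladder also requires some minimum amount of recent weighted activity before a player is listed.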
  18. tatsujb

    tatsujb Post Master General

    Messages:
    12,902
    Likes Received:
    5,385
    ...SOo... exactly like FAF :) ?
  19. bradburning

    bradburning Active Member

    Messages:
    187
    Likes Received:
    102
    I assume so; I am not familiar with FAF's ranking system. What would be cool is to also have a best-of-all-time ladder that does take into account every game from release, which I guess should be a point where the game is reasonably balanced.
  20. stormingkiwi

    stormingkiwi Post Master General

    Messages:
    3,266
    Likes Received:
    1,355
    I still don't understand the jokes in this thread.

    I do know what ELO is now. So I feel happy.


    Theoretically, if I had an alt, and I was a real pro player, and I beat everyone as myself, and then I played myself as my alt (and only played noobs while playing my alt), regardless of the way the ladder is programmed, my alt should be better shouldn't it?

    Edit: Ignore the logic fail there that because I only play noobs as my alt, I must in fact be a noob because I'm playing myself with my alt.
