Perception vs. Statistics on the Hearthstone Ranked Ladder

When Hearthstone, Blizzard’s take on a collectible card game, first came out, I was hooked. It tapped into my nostalgia for all the Magic: The Gathering I played when I was younger, but offered a much smoother playing experience than what was available for other online “card” games at the time. I played for a few years, but eventually grew bored of Hearthstone. Now I’m back on the online CCG bandwagon with the release of Magic: The Gathering Arena, the new Hearthstone-ified version of the original TCG/CCG.

Playing an online CCG again reminded me of one thing that never really made sense to me about Hearthstone: the way its ranked ladder worked. Most competitive online games use some variant of the Elo rating system. In such a system, each player has a number which corresponds to their skill level – whenever they win a game, that number goes up, and whenever they lose a game, that number goes down. The size of the rating change is the same for both players in a game (e.g., if I lose to you and consequently lose 5 points, you gain 5 points) and depends on the difference in rating between the players. For example, if a player beats a much lower rated player, they will only get a small increase in rating, but if they lose to that much lower rated player, they will experience a large decrease in rating. Typically this Elo rating is then translated into a rank/tier/division, which broadly groups players by skill level. For example, in Rocket League, the lowest ranked players are Bronze, followed by (in ascending order): Silver, Gold, Platinum, Diamond, Champion, and Grand Champion.
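To make the Elo description concrete, here is a minimal sketch of a single rating update. It uses the standard logistic expected-score formula; the K-factor of 32 and the function names are my own assumptions for illustration, and real games tune (and usually hide) these details.

```python
def expected_score(rating_a, rating_b):
    """Chance that player A beats player B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(winner, loser, k=32):
    """Return the new (winner, loser) ratings after one game.

    k is an assumed K-factor; both players use the same k, which is
    what makes the exchange zero-sum.
    """
    gain = k * (1.0 - expected_score(winner, loser))
    return winner + gain, loser - gain


# Evenly matched players trade exactly half of k.
print(elo_update(1500, 1500))
# A heavy favorite gains little for winning...
print(elo_update(1800, 1400))
# ...but the underdog gains a lot for the upset.
print(elo_update(1400, 1800))
```

With these assumed numbers, a favorite rated 400 points higher gains roughly 3 points for a win, while an underdog who pulls off the upset gains roughly 29. Either way, whatever one player gains, the other loses.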

Hearthstone, Magic Arena, and most online “card” based games don’t seem to follow this method, at least not until you get to the very top tier of the ladder (e.g., “Legend” for Hearthstone, or “Mythic” for Magic). For the portion of the ladder relevant to most players, you start at (or near) the lowest rank at the beginning of every month, a win gets you one “point” (the exact terminology varies game to game), a loss costs you one point, and there are regular point cutoffs associated with each rank (e.g., each subsequent rank requires 5 more points than the last one). There are some specific caveats to each game, but that general system forms the foundation of the ranking system.

On the surface, it might seem like there’s not really a difference between these two systems. Both systems are zero-sum: any points lost by one player are gained by another. And assuming there are enough players that you’re always playing someone of roughly your own skill level, the importance of each game is about the same. What is different is the shape of the distribution of player skill ratings, which matters because in the card game system, this distribution is fundamentally connected to player ranks.

In the Elo system, the distribution tends to resemble a bell curve, since each game should have a roughly 50% chance of going either way, like a Galton board. However, this distribution isn’t particularly important, since the developers can set arbitrary cutoff points for each rank. Given that the Gold and Platinum ranks are in the middle of the possible Rocket League ranks, players expect that an average player would be Gold or Platinum, and the Rocket League developers can set the ratings corresponding to Gold and Platinum to be in the middle of the distribution. If they found that players were happier when they were classified as a higher rank, they could arbitrarily shift the ratings so that Diamond corresponded to the middle of the distribution, without changing how the Elo rating system works overall.

The card game system would also resemble a bell curve if it were truly zero-sum, but at the bottom of the ladder it is not actually zero-sum. Someone with zero points can’t lose points, so games at the bottom of the ladder result in a net positive production of points. Because players are prevented from going below zero points, the ranking distribution looks asymmetric and long-tailed, like a Pareto distribution, or a Chi-squared distribution with only one or two degrees of freedom. (Since this can be modeled as a diffusion process, I think of it as being like the temperature distribution in the heating of a semi-infinite solid.) The shape of this distribution wouldn’t matter if the developers set ranking cutoffs arbitrarily, but by keeping the point cutoffs regular or mostly regular, this long-tailed distribution in points leads to a long-tailed distribution in player ranks as well.
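A quick way to see this skew is to simulate it. The sketch below is a toy model under my own assumptions, not any game's actual matchmaking: every player is equally skilled (each game is a coin flip), everyone starts at zero points, players are paired with whoever is closest to them in points, and there are no win streaks or other bonuses. The player and round counts are arbitrary.

```python
import random
import statistics


def simulate_ladder(n_players=2000, n_rounds=200, seed=0):
    """Toy ladder: +1 for a win, -1 for a loss, but never below zero."""
    rng = random.Random(seed)
    points = [0] * n_players
    for _ in range(n_rounds):
        # Shuffle first so ties in points are broken randomly, then sort
        # (Python's sort is stable) so neighbors have similar point totals.
        order = list(range(n_players))
        rng.shuffle(order)
        order.sort(key=lambda i: points[i])
        for a, b in zip(order[0::2], order[1::2]):
            winner, loser = (a, b) if rng.random() < 0.5 else (b, a)
            points[winner] += 1
            # The floor at zero is where net points enter the system.
            points[loser] = max(0, points[loser] - 1)
    return points


points = simulate_ladder()
print("mean:  ", statistics.mean(points))
print("median:", statistics.median(points))
print("max:   ", max(points))
```

In this toy model the median player ends up well below the mean, and the best player's total is several times the median. That is the long tail: a few players climb very far while most stay bunched near the floor.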

I’m sure most players don’t spend as much time thinking about distributions as I do, and that is where the card game ranking system becomes problematic: players’ perception of the rankings differs greatly from the reality of the ranking distribution. Back when I played Hearthstone, the effective lowest rank (corresponding to zero points in my above description) was 20, and the highest rank was 1. Consequently, a player might reasonably assume that an average skill level would get them to rank 10 or 11, when in fact, due to the skewed distribution, the median player might be at only rank 17 or 18.

So if you’re committed to this “card” ranking system, with its skewed distribution, what can you do? There are some changes that were built into the original Hearthstone system, like not having completely regular point cutoffs. For Hearthstone, the lowest ranks require 3 points to advance, while the highest ranks require 5 points. This helps spread out the lower ranks a bit more, but is not aggressive enough to really make the ranks match player expectations, and doesn’t address the fundamental problem that points are only created at the bottom of the ladder, and must funnel all the way up the ladder for more players to reach the highest ranks.
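To illustrate how uneven cutoffs turn points into ranks, here is a small sketch. The 3-point and 5-point requirements at the two ends come from the description above; the 4-point middle band and the exact rank boundaries are assumptions I made for illustration, and both function names are hypothetical.

```python
def points_to_advance(rank):
    """Points needed to move up from the given rank (assumed schedule)."""
    if rank >= 16:   # ranks 20-16: 3 points each
        return 3
    if rank >= 11:   # ranks 15-11: 4 points each (assumed middle band)
        return 4
    return 5         # ranks 10-2: 5 points each


def rank_from_points(points, lowest_rank=20):
    """Convert a running point total into a rank, where rank 1 is best."""
    rank = lowest_rank
    while rank > 1 and points >= points_to_advance(rank):
        points -= points_to_advance(rank)
        rank -= 1
    return rank


for total in (0, 9, 35, 80):
    print(total, "points -> rank", rank_from_points(total))
```

Under this assumed schedule, a player sitting on 9 net wins is only rank 17, and it takes 35 net wins to reach the "middle" of the ladder at rank 10. Cheaper low ranks help a little, but the climb is still dominated by how many net points a player can accumulate.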

The big change I’ve noticed in Magic Arena is that there are way more instances of net point creation. Hearthstone originally only had two sources of net point increase: a player losing at the bottom of the ladder when they couldn’t actually lose a point (but their opponent still gained one) and “win streaks,” where 3+ wins in a row would lead to more than one point per win. (Technically a ladder player beating a Legend player also resulted in a net gain in points, but this would probably be cancelled out by ladder players losing to Legend players.) In Magic Arena, they added multiple “checkpoints” throughout the ladder, so that once you achieve a certain rank you can’t fall below it. These artificial floors add more net points to the system, since losing at the bottom of the ladder OR at any checkpoint will result in a net increase in points (note: I believe Hearthstone implemented a similar change after I stopped playing). The other change is that in lower ranks, every single game is net positive: the winner gets two points and the loser only loses one. Between these changes, I suspect that the ranking distribution for Magic Arena players is much closer to player perception than it was for the original Hearthstone ladder.
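The bookkeeping behind these net-positive games can be sketched in a few lines. This is a hypothetical model, not either game's real rules: the checkpoint values (0, 15, 35) are made up, `play_game` is a name I invented, and win streaks are ignored.

```python
def play_game(winner_points, loser_points, floors=(0,), win_gain=1):
    """Return (new_winner_points, new_loser_points, net_points_created).

    floors lists the checkpoint totals a player cannot drop below once
    reached; it must include 0. win_gain is the points awarded for a win.
    """
    # The loser's floor is the highest checkpoint they have already reached.
    floor = max(f for f in floors if f <= loser_points)
    new_winner = winner_points + win_gain
    new_loser = max(floor, loser_points - 1)
    net_created = (new_winner - winner_points) + (new_loser - loser_points)
    return new_winner, new_loser, net_created


checkpoints = (0, 15, 35)  # hypothetical checkpoint totals

print(play_game(10, 10))                      # ordinary game
print(play_game(3, 0))                        # loser at the bottom
print(play_game(16, 15, floors=checkpoints))  # loser on a checkpoint
print(play_game(5, 5, win_gain=2))            # two points for a win
```

An ordinary game creates no net points, while a loss at the bottom or on a checkpoint creates one. With two points for a win, every game creates at least one. The more often those last three cases occur, the faster points flow into the ladder and the further up the bulk of the player base can drift.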

All this discussion of how to specifically craft this card game ranked ladder system so that it matches player perceptions/expectations raises the question: why bother? Why not just adopt the Elo rating system used in other games? If you asked the developers of these games, I imagine that they would make an argument about transparency, since it’s much clearer what a player needs to do to advance: “win X more games than I lose”, compared to “gain X Elo rating points (which are often hidden from the player)” in other games. But I believe the real reason is to drive participation. Starting players at or near the bottom of the ladder every month forces them to play a decent number of games every month in order to achieve the rank they’re striving for. Free-to-play games (which basically all of these card games are) need high participation rates to survive, and these ladder systems allow playing to always be associated with advancement (if perhaps in a more Sisyphean way than the developers would admit).
