Perception vs. Statistics on the Hearthstone Ranked Ladder

When Hearthstone, Blizzard’s take on a collectible card game, first came out, I was hooked. It tapped into my nostalgia for all the Magic: The Gathering I played when I was younger, but offered a much smoother playing experience than what was available for other online “card” games at the time. I played for a few years, but eventually grew bored of Hearthstone. Now I’m back on the online CCG bandwagon with the release of Magic: The Gathering Arena, the new Hearthstone-ified version of the original TCG/CCG.

Playing an online CCG again reminded me of one thing that never really made sense to me about Hearthstone: the way its ranked ladder worked. Most competitive online games use some variant of the Elo rating system. In such a system, each player has a number which corresponds to their skill level; whenever they win a game, that number goes up, and whenever they lose a game, that number goes down. The size of the rating change is equal for the two players involved in the game (e.g., if I lose to you and consequently lose 5 points, you gain 5 points) and depends on the difference in rating between the players. For example, if a player beats a much lower rated player, they will only get a small increase in rating, but if they lose to that much lower rated player, they will experience a large decrease in rating. Typically this Elo rating is then translated into a rank/tier/division, which broadly groups players by skill level. For example, in Rocket League, the lowest ranked players are Bronze, followed by (in ascending order): Silver, Gold, Platinum, Diamond, Champion, and Grand Champion.
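
To make the mechanics concrete, here is a minimal sketch of the textbook Elo update. The K-factor of 32 and the 400-point scale are the conventional chess values, used here purely as assumptions; I don't know what constants (or what variant of the formula) any particular game uses.

```python
def expected_score(rating_a, rating_b):
    """Expected score (roughly, win probability) of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(winner_rating, loser_rating, k=32):
    """Return the new (winner, loser) ratings after one game.

    k is an assumed constant that caps how many points one game can move.
    The winner gains exactly what the loser loses, so the total is unchanged.
    """
    gain = k * (1.0 - expected_score(winner_rating, loser_rating))
    return winner_rating + gain, loser_rating - gain


# A favorite beating an underdog moves few points...
favorite_wins = elo_update(1800, 1400)
# ...while the underdog beating the favorite moves many.
underdog_wins = elo_update(1400, 1800)
```

With these assumed constants, the favorite's win is worth about 3 points and the upset is worth about 29, which is the asymmetry described above.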

Hearthstone, Magic Arena, and most online “card” based games don’t seem to follow this method, at least not until you get to the very top tier of the ladder (e.g., “Legendary” for Hearthstone, or “Mythic” for Magic). For the portion of the ladder relevant to most players, you start at (or near) the lowest rank at the beginning of every month, a win gets you one “point” (the exact terminology varies game to game), a loss loses you one point, and there are regular point cutoffs associated with each rank (e.g., each subsequent rank requires 5 more points than the last one). There are some specific caveats to each game, but that general system forms the foundation of the ranking system.
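
As a rough sketch of that general system (not any game's actual rules), the whole ladder boils down to two small functions. The 5 points per rank and the 20-to-1 rank numbering are illustrative assumptions, and the function names are my own.

```python
def play_game(points, won):
    """Apply one game to a player's point total.

    A win adds a point; a loss removes one, except that a player
    with zero points has nothing left to lose.
    """
    if won:
        return points + 1
    return max(points - 1, 0)


def rank_from_points(points, points_per_rank=5, lowest_rank=20):
    """Map a point total to a rank using regular cutoffs.

    Assumed numbering: rank 20 is the bottom of the ladder and rank 1 the top,
    with every rank costing the same number of points.
    """
    return max(lowest_rank - points // points_per_rank, 1)
```

Under these assumptions a player needs to win 95 more games than they lose to climb from rank 20 to rank 1.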

At the surface, it might seem like there’s not really a difference between these two systems. Both systems are zero-sum: any points lost by one player are gained by another. And assuming there are enough players so that you’re always playing someone of roughly your same skill level, then the importance of each game is about the same. What is different is the shape of the distribution of player skill ratings, which matters because in the card game system, this distribution is fundamentally connected to player ranks.

In the Elo system, the distribution tends to resemble a bell curve, since each game should have roughly a 50% chance of going either way, like a ball at each peg of a Galton board. However, this distribution isn’t particularly important, since the developers can set arbitrary cutoff points for each rank. Given that the Gold and Platinum ranks are in the middle of the possible Rocket League ranks, players expect that an average player would be Gold or Platinum, and the Rocket League developers can set the ratings corresponding to Gold and Platinum to be in the middle of the distribution. If they found that players were happier when they were classified as a higher rank, they could arbitrarily shift the ratings so that Diamond corresponded to the middle of the distribution, without changing how the Elo rating system works overall.
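
One way to picture "arbitrary cutoffs" is to derive them from the rating distribution itself. The sketch below puts an equal share of players in each tier, which is purely my assumption for illustration; I have no idea how Rocket League actually picks its thresholds.

```python
import bisect
import statistics

TIERS = ["Bronze", "Silver", "Gold", "Platinum",
         "Diamond", "Champion", "Grand Champion"]


def tier_cutoffs(ratings):
    """Rating thresholds that split the players into equal-sized tiers.

    Assumption: one quantile per tier boundary, so 7 tiers need 6 cutoffs.
    """
    return statistics.quantiles(ratings, n=len(TIERS))


def tier_for(rating, cutoffs):
    """Look up the tier whose rating band contains the given rating."""
    return TIERS[bisect.bisect_right(cutoffs, rating)]
```

Because the cutoffs are just numbers chosen by the developers, moving the median player from Platinum to Diamond only means picking different thresholds; the rating updates themselves never change.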

The card game system would also resemble a bell curve if it were truly zero-sum, but at the bottom of the ladder it is not actually zero-sum. Someone with zero points can’t lose points, so games at the bottom of the ladder result in a net positive production of points. Because players can’t go below zero points, the ranking distribution looks asymmetric and long-tailed, like a Pareto distribution, or a Chi-squared distribution with only one or two degrees of freedom. (Since this can be modeled as a diffusion process, I think of it as being like the temperature distribution in the heating of a semi-infinite solid.) The shape of this distribution wouldn’t matter if the developers set ranking cutoffs arbitrarily, but by keeping the point cutoffs regular or mostly regular, this long-tailed distribution in points leads to a long-tailed distribution in player ranks as well.
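
The skew is easy to see in a toy simulation. This is a sketch under strong assumptions: every game is a coin flip, players are simulated independently rather than paired against each other, and the player and game counts are arbitrary numbers I picked.

```python
import random
import statistics


def simulate_ladder(n_players=2000, n_games=200, seed=0):
    """Final point totals for players who win each game with probability 1/2.

    Assumption: a player at zero points stays at zero after a loss,
    which is the floor that breaks the zero-sum property.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_players):
        points = 0
        for _ in range(n_games):
            if rng.random() < 0.5:
                points += 1
            else:
                points = max(points - 1, 0)
        totals.append(points)
    return totals


def summarize(totals):
    """Median, mean, and maximum of the simulated point totals."""
    return {
        "median": statistics.median(totals),
        "mean": statistics.mean(totals),
        "max": max(totals),
    }
```

Calling `summarize(simulate_ladder())` should show the median sitting well below the halfway point between zero and the best total, with the mean pulled above the median by the long tail of lucky players.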

Since I’m sure most players don’t spend as much time thinking about distributions as I do, the card game ranking system becomes problematic because the perception of players’ rankings greatly differs from the reality of the ranking distribution. Back when I played Hearthstone, the effective lowest rank (corresponding to zero points in my above description) was 20, and the highest rank was 1. Consequently, a player might reasonably assume that an average skill level would get them to rank 10 or 11, when in fact, due to the skewed distribution, the median player might be at only rank 17 or 18.

So if you’re committed to this “card” ranking system, with its skewed distribution, what can you do? There are some changes that were built into the original Hearthstone system, like not having completely regular point cutoffs. For Hearthstone, the lowest ranks require 3 points to advance, while the highest ranks require 5 points. This helps spread out the lower ranks a bit more, but is not aggressive enough to really make the ranks match player expectations, and doesn’t address the fundamental problem that points are only created at the bottom of the ladder, and must funnel all the way up the ladder for more players to reach the highest ranks.

The big change I’ve noticed in Magic Arena is that there are way more instances of net positive point distribution. Hearthstone originally only had two sources of net point increase: a player losing at the bottom of the ladder when they couldn’t actually lose a point (but their opponent still gained one) and “win streaks,” where 3+ wins in a row would lead to more than one point per win. (Technically a ladder player beating a Legendary player also resulted in a net gain in points, but this would probably be cancelled out by ladder players losing to Legendary players.) In Magic Arena, they added multiple “checkpoints” throughout the ladder, so that once you achieve a certain rank you can’t fall below it. These artificial floors add more net points to the system, since losing at the bottom of the ladder OR at any checkpoint will result in a net increase in points (note: I believe Hearthstone implemented a similar change after I stopped playing). The other change is that in lower ranks, every single game is net positive: the winner gets two points and the loser only loses one. Between these changes, I suspect that the ranking distribution for Magic Arena players is much closer to player perception than it was for the original Hearthstone ladder.
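
The bookkeeping behind "net positive" games can be written down in a few lines. This is only a sketch: the checkpoint positions used below are made-up values, and win streaks are left out.

```python
def net_points(loser_points, floors=(0,), winner_gain=1):
    """Points created by a single game: the winner's gain minus the loser's loss.

    floors      -- point totals a player cannot drop below (hypothetical values)
    winner_gain -- points awarded for a win (1 normally, 2 in the lower ranks)
    """
    loser_loss = 0 if loser_points in floors else 1
    return winner_gain - loser_loss


# An ordinary mid-ladder game is zero-sum.
ordinary = net_points(loser_points=7)
# A loss at the very bottom creates a point.
at_bottom = net_points(loser_points=0)
# So does a loss at a checkpoint (positions here are invented for illustration).
at_checkpoint = net_points(loser_points=15, floors=(0, 15, 30))
# In the lower ranks, two points for a win makes every game net positive.
low_rank = net_points(loser_points=7, winner_gain=2)
```

Every extra floor and every two-point win is another place where points enter the system, which is what lets more players drift up the ladder.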

All this discussion of how to specifically craft this card game ranked ladder system so that it matches player perceptions/expectations raises the question: why bother? Why not just adopt the Elo rating system used in other games? If you asked the developers of these games, I imagine that they would make an argument about transparency, since it’s much clearer what a player needs to do to advance: “win X more games than I lose”, compared to “gain X Elo rating points (which are often hidden from the player)” in other games. But I believe the real reason is to drive participation. By starting players at or near the bottom of the ladder every month, it forces them to play a decent number of games every month in order to achieve the rank they’re striving for. Free-to-play games (which basically all of these card games are) need high participation rates to survive, and these ladder systems allow playing to always be associated with advancement (if perhaps in a more Sisyphean way than the developers would admit).

My progress during Season 3 of Rocket League

It should be readily evident from some of my other posts that I love Rocket League. I started playing ranked matches during Rocket League’s Season 3 (which ran from June 2016 through March 2017), and towards the end of August 2016 I started tracking my progress in a spreadsheet. I kept track of my skill ratings (similar to an Elo rating) in the different playlists using a stat tracking website, and once a week I would run through each of the default all-star trainings five times as an alternative metric for mechanical skills.

I originally had very ambitious plans for what I would do with the stats I collected, but since it’s been two months since Season 3 ended I wanted to post something before the project lost all momentum. Perhaps I’ll revisit this in more detail later, but for now the project has culminated in a simple page posted here which lets you explore different ratings/rankings over time or number of games played.

There are a few points I found interesting from exploring the data:

  • My unranked rating (rating from playing in casual playlists) stayed more or less constant over all of Season 3, even though I undoubtedly improved over that time period.
  • Looking at a playlist rating vs. number of games played (e.g., Standard rating vs. Ranked standard games played) I hit a plateau in Duel and Solo standard, but not in Doubles or Standard. Perhaps through the latter half of Season 3, my teamwork improved more than other mechanical skills, which doesn’t show up as much in Duel or (unfortunately) in Solo standard.
  • I plateaued fairly quickly in Keeper and Striker training, while I continuously improved in Aerial training. Keeper plateaued because I quickly approached 100% completion, whereas Striker seemed to always hover around 70%. I’m not sure why I’m not better at Striker even after spending so much time on it.

Rocket League Halloween Costumes

This Halloween my girlfriend Jaimie and I dressed up as Rocket League cars. She doesn’t play, but I was set on the costume and she wanted her costume to match mine:

[Image: the two Rocket League car costumes]

We made them ourselves, so ideally I would’ve thoroughly documented the process, but we were a bit rushed for time, since we just worked on them during evenings of the week leading up to Halloween. Instead I just took pictures intermittently, which you can see below. As you can see, the cars are primarily made of taped-together cardboard that we painted. The wheels and tires are attached to the main body with toothpicks and superglue.

[Images: six in-progress photos of the costumes]

Rocket League: the most authentic soccer video game ever made

A few weeks ago, at the recommendation of a friend, I started playing Rocket League. In Rocket League you control a rocket-powered car in a game resembling indoor soccer: you score points by knocking a ball into the opposing team’s goal, and you try to prevent the other team from knocking the ball into yours. While I’m late to the party (Rocket League has been out for a bit over a year now), in these past few weeks it’s completely won me over: right now I’d say I’m spending the majority of my free time playing Rocket League.

I’ve also been playing some actual soccer this summer, and after playing soccer one day I realized why Rocket League is so compelling to me. Rocket League is the most authentic soccer video game I’ve ever played. A natural response might be: “That’s ridiculous! How could a game about cars playing soccer be more authentic than games like FIFA, which are about the actual sport of soccer?” I would not contest that watching a game played out in FIFA is more like watching an actual soccer game, but playing Rocket League captures the feeling of actually playing soccer better than any other video game I’ve played. I am not the first person to share this opinion, but nonetheless I thought I’d take this entry to explain my reasoning in a bit more depth.

The first reason is that the mechanics of Rocket League make the experience much closer to playing soccer. In an actual soccer game, when you want to shoot, pass, or clear the ball, all of these are accomplished by kicking the ball with your foot (well, usually contact is with the foot, but occasionally it could be with the leg, head, etc.). In FIFA-like video games, you perform the different kicks by pressing different buttons, e.g., [x] to pass or [y] to shoot, even though an actual soccer player kicks the ball in both cases, just in a different way. In Rocket League, the “kicks” are collisions, and all the contact between the car and the ball is physically simulated in the game. So whether you want to pass or shoot, in both cases you need to drive your car into the ball. Whether it’s a pass or a shot depends on how fast you strike the ball and at what angle; there’s no pass button you can press that will cause your car to knock the ball towards a teammate.

The result of these mechanics is that when you’re bearing down on a ball in front of the opponent’s goal in Rocket League, it feels like running up to shoot the ball in actual soccer. In both cases you’re acutely focused on how to strike the ball to maximize the chance of it going into the goal. In FIFA, on the other hand, you just press the “shoot” button and hope for the best (I’m probably underselling the control in FIFA a bit here, but you get the idea). Likewise, in Rocket League when you’re trying to clear a ball in front of your goal, you’re trying to get some part of the car between the goal and ball (similar to real soccer, but replace car with leg), not just mashing the “clear” button in the proximity of the ball.

The other big reason that Rocket League recreates an authentic soccer experience is the perspective. While not technically played in a 1st person view, Rocket League is 1st person in the sense that you only control one car, and your field of vision is limited to originating from around where that car is. This is different from FIFA-like games, which let you swap control of players so you’re typically in control of the player with the ball (on offense) or the player challenging the ball (on defense), and are viewed from a perspective similar to how professional soccer games are broadcast, so you can see all the relevant action around the ball.

In a real soccer game, what you do off the ball is very important: your position is critical to creating scoring opportunities for your team and shutting down chances for your opponent. This aspect of team play is largely out of your control in FIFA-like games: what the other players do on your team is automated. In Rocket League, since you only control your own car, you need to decide where to position yourself for the entire game, not just when you’re on the ball. And similar to real soccer, this positioning is very important: if you play like a 5-year-old plays soccer and just chase the ball the whole match, your team won’t have the structure to both attack and defend, and you’ll doom yourself to many losses.

The view that Rocket League is played in (pseudo 1st person) also makes the experience more authentic. In real soccer, situational awareness is an important skill to have, which is recreated in Rocket League by necessitating that you pan the camera up and down the field if you want to know where your teammates and opponents are. In FIFA-like games, everyone’s position is simply laid out for you by the high viewing angle of the field. The Rocket League view also means you need to actually be able to judge distances and trajectories to know where you should head to intercept a ball or player (as in real soccer), whereas FIFA-like games put a glyph on the ground where the ball is going to land, since it would be intractable to assess a ball trajectory from the high viewing angle they use.

To me, this combination of mechanics and perspective ends up making Rocket League so much more fun because it feels like actually playing soccer. In Rocket League it feels like you scored the goal, whereas in FIFA it feels like the character in the game scored; you just happened to tell him to shoot.