A huge amount of discussion about the game XCOM: Enemy Unknown pertains to the random number generation. Many people claim — either seriously, or in jest because they are so frustrated with their luck — that it is broken.
For science (and definitely not because I’m completely hooked on the game), I’ve been playing a lot of XCOM, and I have been recording my shots as I played. For every shot I actively took, I recorded the displayed chance of it hitting, and whether it actually hit or missed. (I ignored overwatch shots because I couldn’t see their probabilities, and also didn’t bother with rockets and other later-game non-gun weapons.) I’ve recorded over 1200 shots, and in this post I’ll examine the data to see if XCOM is fair.
The Psychology of XCOM
Keeping this record of hits and misses in XCOM taught me a lot about the psychology of playing the game. The longest streaks of hits that I noticed in the data were one with an incredible 18 hits in a row and another with 19 hits in a row, with the following percentage chances to hit:
Streak 1: 65, 93, 85, 97, 100, 100, 73, 100, 95, 73, 57, 73, 86, 89, 94, 96, 81, 82
Streak 2: 95, 63, 94, 95, 73, 58, 100, 100, 100, 86, 84, 95, 98, 85, 84, 73, 100, 90, 95
What’s interesting is how I felt while playing the missions. I wasn’t sitting there shouting “amazing!” as hit after hit piled on. Both of the above streaks came in “very difficult” terror missions. In the first streak, all the aliens appeared on one turn, and I couldn’t kill the Chrysalids faster than they were turning the civilians into zombies. I downed 20 enemies in all, and lost 3 of my 6 men because I got so overwhelmed. So even when I had amazing positive luck with my shots, I didn’t even notice until I took stock of the spreadsheet after the mission.
On the other hand, here’s a streak from the first mission of the third game I started, where I missed my first six shots (about a 2% chance at the stated percentages, assuming independent shots), with percentages:
45, 45, 54, 45, 45, 45
I ended up losing two of the four men, and since it was the first mission, I restarted in disgust. You really do notice only the negative streaks, and never the positive streaks.
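For anyone who wants to put numbers on streaks like these, here is a minimal Python sketch. It assumes the displayed percentages are accurate and that shots are independent (which is exactly what is under test), and the function names are made up for illustration. Note that it gives the chance of one particular run of shots going all one way, not the chance of finding such a streak somewhere in 1200 shots.

```python
from math import prod

# Stated to-hit percentages, copied from the streaks listed above.
streak_1 = [65, 93, 85, 97, 100, 100, 73, 100, 95, 73, 57, 73, 86, 89, 94, 96, 81, 82]
streak_2 = [95, 63, 94, 95, 73, 58, 100, 100, 100, 86, 84, 95, 98, 85, 84, 73, 100, 90, 95]
miss_streak = [45, 45, 54, 45, 45, 45]


def all_hit_probability(chances):
    """Chance that every shot hits, assuming independent shots (an assumption)."""
    return prod(c / 100 for c in chances)


def all_miss_probability(chances):
    """Chance that every shot misses, assuming independent shots (an assumption)."""
    return prod(1 - c / 100 for c in chances)


if __name__ == "__main__":
    print(f"Streak 1, all hits:  {all_hit_probability(streak_1):.4f}")
    print(f"Streak 2, all hits:  {all_hit_probability(streak_2):.4f}")
    print(f"Six misses in a row: {all_miss_probability(miss_streak):.4f}")
```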
Below is the best graph I could think of to represent the fairness of the random number generator. On the X axis is the stated chance to hit: the number that popped up in the box when I took the shot. I’ve grouped these into 5% bins, and then plotted a bar for each bin, showing what proportion of those shots actually hit. In an ideal world, with infinite data, the red line on the graph would pass through the tops of all the bars. It would actually be surprising if this happened exactly: it’s called random for a reason, and with the small number of points I have in each bin (around 60), it’s likely that the proportions I observed are not perfectly in line with expectation.
That looks reasonably fair to me. I think that graph is the most comprehensible output you’ll get, but “looks about right” is not very scientific! Read on for a more precise methodology.
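The binning behind that graph can be sketched in a few lines of Python. This is only an illustration: it assumes the shots are stored as (stated chance, hit) pairs, that each bin covers its lower edge up to but not including the next one, and that 100% shots go into the top bin. The function name and the bin edges are assumptions, not necessarily what the spreadsheet does.

```python
from collections import defaultdict


def calibration_table(shots, bin_width=5):
    """Group shots into bins of stated chance and report the observed hit rate.

    shots: iterable of (stated_chance_percent, hit) pairs, where hit is a bool.
    Returns {bin_start: (hits, total_shots, observed_hit_rate)}.
    """
    totals = defaultdict(lambda: [0, 0])  # bin_start -> [hits, total_shots]
    for chance, hit in shots:
        # Assumption: a 100% shot is counted in the top bin (95-100).
        start = min(int(chance) // bin_width * bin_width, 100 - bin_width)
        totals[start][0] += int(hit)
        totals[start][1] += 1
    return {
        start: (hits, total, hits / total)
        for start, (hits, total) in sorted(totals.items())
    }


if __name__ == "__main__":
    # Made-up shots, purely to show the shape of the output.
    example = [(45, False), (47, True), (50, False), (96, True), (100, True)]
    for start, (hits, total, rate) in calibration_table(example).items():
        print(f"{start}-{start + 5}%: {hits}/{total} hit ({rate:.0%})")
```

A fair generator should give an observed rate in each bin that sits close to the stated chances in that bin, with more wobble the fewer shots the bin contains.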
The problem with determining whether something is truly random is that you can never be sure. Theoretically, any string of hits and misses is possible in XCOM (except 100% shots missing), so you can never know for sure if it was a broken random number generator or bad luck. The best you can do is collect a lot of data, and see if it’s an unlikely result, and then conclude whether you’re confident that the data came from a random generator.
Here’s the idea, then, behind testing for random generation. We pick the individual to-hit percentages, e.g. 65%, for which we have the most data (at least 20 shots). For each one, we work out the chance of getting a result at least as extreme as the one we observed, assuming the generator is fair. If this chance is low (conventionally, 5% or less), the data is unlikely to have come from a fair random generator. For example, let’s say that we had fired 25 shots at 85%, and all of them had hit. The chance of this happening is only 1.7%, so it would be unlikely if the generator was truly random.
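The 25-shots-at-85% example can be checked with an exact binomial calculation. The sketch below uses only the Python standard library, and the function names are hypothetical. It shows a one-sided tail (which reproduces the 1.7% figure) and one common two-sided variant that sums every outcome no more likely than the observed one. Which variant the spreadsheet uses is not something this sketch claims to know.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k hits in n independent shots with hit chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail_p_value(hits, shots, stated_chance):
    """Chance of at least `hits` hits if the stated percentage is accurate."""
    p = stated_chance / 100
    return sum(binom_pmf(k, shots, p) for k in range(hits, shots + 1))


def two_sided_p_value(hits, shots, stated_chance):
    """Sum the probabilities of all outcomes no more likely than the observed one."""
    p = stated_chance / 100
    observed = binom_pmf(hits, shots, p)
    total = sum(
        pmf
        for pmf in (binom_pmf(k, shots, p) for k in range(shots + 1))
        if pmf <= observed * (1 + 1e-9)  # small tolerance for floating-point ties
    )
    return min(1.0, total)


if __name__ == "__main__":
    # 25 shots at 85%, all hitting: the example from the text.
    print(f"One-sided: {upper_tail_p_value(25, 25, 85):.4f}")
    print(f"Two-sided: {two_sided_p_value(25, 25, 85):.4f}")
```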
However, one complication with this method is that if we check several percentages, we are likely to find one that looks extreme purely by chance. On average, if we check 20 different percentages against a perfectly fair generator, one of them will still cross the 5% threshold. Such a false alarm is known as a type I error (an awful name!). To control for this, we can use a procedure that limits the false discovery rate (FDR), and your eyes are probably glazing over right now, so let’s get to the result.
The result is that, based on my data, there was no evidence (at the 95% confidence level) to suggest that the random generator is unfair. If you want to see the data and the working, it’s all available in this spreadsheet.
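For readers whose eyes have not glazed over, the most widely used way to control the false discovery rate is the Benjamini-Hochberg step-up procedure. The sketch below assumes that procedure purely for illustration (it is not necessarily the exact calculation in the spreadsheet), and the function name and example p-values are made up.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Benjamini-Hochberg step-up procedure (assumed here for illustration).

    Returns a list of booleans, one per input p-value, that is True where the
    test is declared significant at the given false discovery rate.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k (1-based) whose p-value is at most (k / m) * fdr.
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * fdr:
            cutoff_rank = rank

    # Every test ranked at or below the cutoff is declared significant.
    significant = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[index] = True
    return significant


if __name__ == "__main__":
    # Made-up p-values, one per tested to-hit percentage.
    example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example))
```

If no percentage survives the procedure, that matches the "no evidence of unfairness" conclusion above.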
There are a few caveats to note. One is that my significance testing is very underpowered: despite recording a lot of shots, I don’t have enough shots at any specific percentage to be likely to spot deviations that aren’t large. (Specifically: at 80% power, for 50 hits, I could only have spotted 20% deviations in the shot percentage.) More data would solve this problem! One alternative would be to use Bayesian methods, where I could express my prior belief that the generator is fair.
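As a rough illustration of the Bayesian idea, one simple option is a Beta prior centred on the stated chance, updated with the observed hits and misses. This is a sketch under assumptions, not an analysis I have run on the data: the function name and the prior_strength parameter (how many "imaginary shots" the prior is worth) are hypothetical choices.

```python
def posterior_hit_rate(hits, shots, stated_chance, prior_strength=20):
    """Posterior mean hit rate under a Beta prior centred on the stated chance.

    prior_strength is a hypothetical knob: the prior counts as that many
    imaginary shots that hit at exactly the stated rate.
    """
    p = stated_chance / 100
    prior_hits = p * prior_strength
    prior_misses = (1 - p) * prior_strength
    post_hits = prior_hits + hits
    post_misses = prior_misses + (shots - hits)
    return post_hits / (post_hits + post_misses)


if __name__ == "__main__":
    # 25 hits from 25 shots at a stated 85%, as in the earlier example.
    print(f"Posterior mean hit rate: {posterior_hit_rate(25, 25, 85):.3f}")
```

The stronger the prior, the more data it takes to pull the estimate away from the stated percentage, which is one way of encoding "I believe the game until shown otherwise".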
The other caveat is a potential problem with the data, caused by my own lack of XCOM ability! In the spreadsheet, Game 1 through Game 5 are “Classic”-difficulty Iron Man games. I’d completed Classic Iron Man quite smoothly once already, and figured I’d be fine. Five failures later, I was left with the problem that I didn’t have enough data for high-percentage shots, which tend to occur later in the game when your soldiers are high-rank (but I got all mine killed!). So Game 6 was non-Iron Man, so that I could use a little reloading to carry me safely through to the later game. Shameful.
Reloading causes two problems with the data. One is that if you record some shots, reload, then record again, you’re effectively recording the same thing (due to the way the state of the random generator is saved in the game). So I was careful to wipe out any data back to the point I was reloading from. The other problem is that this can still introduce a bias to the data, because you’re more likely to have missed when you trigger a reload, and more likely to have hit when you don’t then choose to reload. I did not reload too often (and usually it was an alien’s hit that caused a reload, not my own miss), but there is that potential small bias in the data for Game 6. I also haven’t tested autocorrelation (“streakiness”), mainly because I’m not sure how to do so on this kind of data. But overall, I’m fairly confident that XCOM is indeed random.
Comments on: "Is XCOM Truly Random?" (10)
Just wanted to thank you for your work here. I was curious about this myself, but too lazy to go deep enough and do the research. It was very fascinating and entertaining to read this article.
ps.: You got a very nice writing style, keep up the good work!
Your XCOM posts are great work! I’ve got you bookmarked now, keep it up!
Thank you for your science. Dr. Vahlen would be proud of you.
I wrote down a classic ironman playthrough with all mission outcomes (kills/wounds/promotions) and of course the final statistics. If you need it for more science, let me know.
I post the playthrough daily at http://www.punchwood.com/index.php?/blog/857-fencenswitschens-blog/
you need an account there, however. (PaulSoaresjr makes funny videos, by the way.)
I was fairly confident of the random aspect for two reasons. Firstly, probably because I come under the Bayesian Statistician header and secondly because implementing the AI to provide an experience which does not use randomisation (or overrides it) is a) probably too costly to implement b) would require more processing power than the average PC or console has. i.e. the AI would have to be of equivalent intelligence of that of a “Dungeon Master” in a game of D&D – namely a human being. 🙂
You are officially my XCOM Math Hero. Thank you.
[…] I’ll be honest. I thought about writing such an article a few times, but I’m too lazy. Fortunately for all of us, this gentleman blogger I just discovered is not only not-the-least-bit-lazy…he’s also damn smart. Mr. Brown over on the Sinepost has a series of articles (four at the time I’m writing this) discussing randomness and probability in XCOM. If you have any shred of doubt in you that the game’s not being truly random, dig into this article where he recorded the results for over 1000 shots and graphed out observed vs….. […]
Did you ever try changing your plan when reloading? Through savescumming, you can clearly see that even though the RNG data is predetermined and unaffected by reloading, those numbers can be exploited to min-max your turn. For example, if the RNG data were something like “75, 50, 20” and you previously attacked with to-hit percentages of “40, 85, 80”, that would result in MISS, HIT, HIT. If you instead re-order those attacks to 80, 85, then 40, you will HIT all three.
I’ve written about savescumming here: https://sinepost.wordpress.com/2012/10/29/randomness-vs-canniness/ Yes, you can minmax your turn, and in fact it gets a bit intricate: if the RNG data is 75, 50, 20, and the 75 is a hit, I believe the 50 will get used to decide the critical, and then you’re left with 20 for the next shot. But if the 75 is a miss, the 50 will be used for the next shot (assuming there are no other random events involved in between, like arc throwers, or AI decisions or whatever).
Interesting article. You said more data would be good, so if I may offer my contribution? I started to keep track of shots and how often they hit and missed, globally and per game.
Awesomely crazy post Neil. By the way, the statistics jargon for this is calibration. You are trying to determine whether the XCOM hit probability is “well calibrated”.