Friday, July 16, 2010

Cribbage Pro Shuffling & The Deck

So lately it seems popular to dig on the shuffling and claim that we are stacking the deck in the game to favor the computer player.  I'm not going to go into the "Hey, your game cheats" thing again and why people like to think the computer is cheating when they lose, and I'm not even going to go into how the game does it's dealing "every other" and so it could not favor one player over the other.  However, I do want to give everyone some actual data points for the discussion since it seems lots of people are excited about it.  So, for this post, here is some information you may find interesting.

As was mentioned in our last post about the latest release, we use a random number generator in single player mode to randomly pick the location of a card in the deck - so as to shuffle it.  In multiplayer it is from a truly random source, but in single player we use a "tried and true" pseudo-random source based on the "Mersenne-Twister" algorithm.  So what does that really mean though, what is the result of doing that and what does it look like?  How fair is it really?

Let's define some terms first.  We will call "fair" something that in all measurements possible seems to be as truly random as possible.  You may have a different opinion (like it should match a real world shuffle as strictly as possible), but we will not go there for this as there is so much debate about that single point we will simply not solve it here.  The second definition is around a "shuffle".  For this argument, let's agree to define it as the process used to randomize the deck.

So, without further delay, here is the code that does our "shuffle" in the game today:

            rand = new MersenneTwisterRNG(new SecureRandomSeedGenerator());
            //Loop through each card giving it a random value
            for (int n = 0; n < cards.length; n++) {
                cardList.add(new Card(cards[n], rand.nextDouble()));
            }
            //Sort by that random value
            Collections.sort(cardList, new CardSortByKey());
            //Store back into the byte array
            for (int n = 0; n < cards.length; n++) {
                cards[n] = (byte)cardList.get(n).getOrdinal();
            }

There you have, it.  That is all there is to it and it is a direct copy/paste out of the game with comments and all.  If you can't read it, what it is doing is first getting that random number generator I mentioned earlier, the MersenneTwister.  It "seeds" it with the same kind of source that is used for security on your phone (that SecureRandomSeedGenerator makes that happen).  It then takes random numbers from that and assigns one to each card, and then it simply sorts them by that key and stores each card in that location.

So that is the shuffle, but what does the result look like?  Well, one of the best ways we have found to show this is in what is called a scatter/plot graph.  Now we could show you every single card and where it landed, but trust me, you could not read that graph/chart.  So what we will show you is what we will call "marker cards" in the deck and where they ended up after the shuffle.  We chose the "ace" card from each suite for that purpose, but we could just as easily choose anything else, it was just convenient to do that.  The data shown here comes from thousands of real games played - not just us running this in some test area - these are real decks used in real games that some of you may have played.  We got the data from a set time period and stopped at 3,000 decks.  Going anything beyond that and it gets very hard to visualize, but we could of course go to many, many more.

OK, you have been waiting patiently.  So here is the first graph:
This graph is for the ace of hearts (AH) and it is showing for 3,000 decks, where that AH was in the deck from position 0 (top of the deck) to position 52 (bottom of the deck).  As we progress this forward, what you end up getting is what really is a very random distribution.  If there was an inequality, it would stand out as a cluster surrounded by a lot of blank space - and you really just don't see that in the data.  Sure, there are some spots that are darker the others and some not as much, but in the end if you zoom in even, you will find that distribution is quite random from one deck to the next.  Just for informational purposes and to complete this thought, here are the other cards as mentioned:




So that is the distribution of cards in the deck as given by sample marker cards in the deck.  You will see that none of them look the same, this is a good thing, but at the same time don't get too into these and try to read much into them either.  I show it here to say simply this - it is clearly not favoring any deck position over any other.

There is one other way that we have found somewhat helpful in determining not just how distributed a card might be from one deck to the next, but generally how "well shuffled" a given deck might be.  To do that, you can consider looking at what the relative position of a card is now against the card that is positioned next to it in the deck and it's original position.  In other words, two cards may have started as being 1 position away from the other card, but how far apart are they now?  So if you give each card a starting point form 0 to 52 and then you shuffle and see how far apart they are from where they started relative to each other you get an interesting picture.  Of course the theory here is that the more that the results show some random separation, then it could be considered a good shuffle.  There is no "gold standard" we have found in this regard, and maybe we are the only ones looking at this.  Anyway, here is our data of a deck we grabbed from one at random:



Here it is averaged over 5 decks:



Now over 10 decks:



Do  you see a trend?  Here is at the 3,000 mark:



Yeah, we think it looks cool too.  It is almost smiling at you.  In fact, I think it looks like a really good distribution in a good random shuffle.  Things settling towards being equally distributed in the deck on average - meaning each has a decent chance of showing up just about anywhere.  If you consider the possible combinations of a deck, the more you approach that factorial, the more this graph will get flat.  That is exactly what should happen.  I think that is cool.

Feel free to add your interpretations in the comments or ask for other views of the data.  If you have some cool graph you would like to see (and you can tell us how to construct it somewhat), we will do what we can to give it to you.

In summary, we think this shows two very good perspectives on the shuffle and distribution in the game.  We hope you found it interesting and informative.  We think it fairly clearly shows the unbiased nature of the game and the random nature of it's results.  I'm sure some will disagree, but if you do disagree, please let us know why and what you may find more compelling or if there are parts we did not discuss that you would be interested in knowing more about.

-- Josh