Mathematically Transforming a Biased 6-Sided Dice Into an Unbiased 20-Sided Dice

2024-04-05 - [53] 1:16

Back in 2017, I learned how to convert coin flips from a biased coin into fair coin flips using a process created by John Von Neumann to debias random binary data. For about 7 years, I tried to figure out if there was a way to expand that to biased dice of sizes larger than 2. I didn't end up figuring out a direct method, but while trying to figure out how to program a shuffler for a deck of cards on the Nintendo Entertainment System in 6502 Assembly, I learned how to convert random bits to random numbers within arbitrary ranges. This allowed me to find a solution to convert rolls of a biased dice of one base to an unbiased dice of another base. This probably isn't the optimal solution, but it is one that is simple enough for me to calculate with pencil and paper or even often times in my head.

Scenario

You are given a weirdly shaped and weighted 6-sided dice. All rolls from that dice are independent from each other, as you would expect from a dice, but the results are incredibly biased towards particular values and against other values. You don't know exactly how biased the dice is, but you can roll the dice as many times as you want. Using only results of the biased 6-sided dice, simulate a roll from a fair 20-sided dice.

Technical Definition of Problem

We have an N-sided dice that is able to roll at least 2 results, although we don't know what at what rate. These results MUST be independent of each other. The same N-sided dice MUST be rolled each time. Dice rolls MUST NOT influence future dice rolls. We need to emulate an X-sided dice where X is at least 2.

Solution

Step 1: Convert your dice into a coin

First we take the input dice and convert it into a potentially biased coin. How do we do that? Roll the dice until you get 2 different results, then pick one of the results you got. That can be your "Heads" result. The reason we wait until you get 2 different results is so you know more than 1 result is possible on the input dice. You now have a likely biased coin. For instance, if you rolled the 6-sided dice 4 times and got the results of 1, 1, 1, 4, then you can pick either 1 or 4 as your "Heads" result. For this example, let's pick 1 as the "Heads" result. Now any time you roll a 1, think "Heads" and any time you roll something else, think "Tails".

Step 2: Debias the coin flip results

For this step, we use the Von Neumann debiasing step to "coin flips". Do 2 "coin flips" with the dice as explained above. You now have 4 possible results, which are "Heads Heads", "Heads Tails", "Tails Heads", and "Tails Tails". If your 2 results are the same, throw out the results and do 2 more "coin flips". Keep doing the 2 "coin flips" until you get 2 different results. At this point, you'll have either "Heads Tails" or "Tails Heads". The result for the debiased coin flip will be whatever flip happened first of the 2 flip set. If you got "Heads Tails", then "Heads" is the debiased result, otherwise if you got "Tails Heads", then "Tails" is the debiased result. The important thing is these 2 "coin flips" have to be seperate from every other 2 "coin flips", so no sharing any flip results among the sets of 2 flips. If you don't get 2 different flip results you MUST discard the 2 flip results and do another 2 flip results.

Step 3: Dice Rolling Algorithm

Before explaining this step, I wanted to mention that I am currently unable to find the source document with this algorithm, but I believe it was published by Donald Knuth and Andrew Yao in 1976. Some variables will be different than the source algorithm, but the logic will be functionally the same.

For this algorithm, you will need 2 variables, A and B. You will also need to know the amount of sides for your output dice roll, which in the scenario near the top of this post is 20. Set A to 1 and B to 0.

A = 1
B = 0

We will also have a function called "flip()" that will perform a debiased coin flip from the biased input dice, as explained in steps 1 and 2. If flip() results in "Heads", then flip() will equal 0, otherwise if flip() results in "Tails", then flip() will equal 1. In summary, a fair heads = 0 and a fair tails = 1.

while true {
  A = 2 * A;
  B = 2 * B + flip();
  if (A >= sides) {
    if (B < sides) {
      break;
    }
    A = A - sides;
    B = B - sides;
  }
}
return B;

B will ALWAYS be a smaller value than A throughout this loop. B will also be a random value that has equal chances of being any value from 0 to (but not including) A throughout this loop. When A becomes the same size or larger than the amount of sides of the output dice, B will either be smaller than the amount of sides or it won't be. If B is smaller than the amount of sides, then it has an equal chance of being any value from 0 to (but not including) the amount of sides. If B is not smaller than the amount of sides, then B overshot the amount of sides by a value from 0 to (but not including) A minus the amount of sides, where each value of that is equally as likely. That overshoot is applied to B by subtracting the amount of sides from B and also by subtracting the amount of sides from A. This loop ends when A is greater than or equal to the aount of sides AND B is less than the amount of sides. The end result is the value B, which as explained a few sentences ago, has an equal chance of being any value from 0 to (but not including) the amount of sides. This algorithm is most efficient with the amount of sides is a power of 2, like 2, 4, 8, 16, 32, 64, etc...

This algorithm will generate a uniform number from 0 up to (but not including) the number of sides of the fair output dice, so to get a dice roll from 1 to (and including) the number of sides, just add 1 to the result. That means if you get a 0 from the algorithm, the 20-sided dice roll would be 1, and if you get 17 from the algorithm, the 20-sided dice roll would be 18.

Put together, the dice rolling algorithm is as follows:

function roll(sides int) {
  A = 1;
  B = 0;
  while true {
    A = 2 * A;
    B = 2 * B + flip();
    if (A >= sides) {
      if (B < sides) {
        break;
      }
      A = A - sides;
      B = B - sides;
    }
  }
  return B;
}

Ending Thoughts

There are likely better ways to convert a biased dice into an unbiased dice than what I wrote here. Part of a better solution is to pick the "Heads" and "Tails" values of step 2 more equally by finding combinations of dice rolls that are just as likely as each other, for instance, deciding that 1 or 5 is "Heads" and 2, 3, 4, and 6 is "Tails" for a biased dice that rolls either 1 or 5 about half of the time compared to any other number. The important part is that you can convert a dice of size N to a simple "True" or "False" statement that is independent from previous or future rolls and is mutually exclusive. In this example, a dice roll can't be "1 or 5" AND "2, 3, 4, or 6" at the same time and due to the rules of the input dice needing to not be influenced by previous rolls, we are able to use this "True" or "False" statement as a means to get a biased coin flip result.

If the input dice is fair (as in the results are not influenced by previous rolls and all values are just as likely as all other values for any roll), you can just skip step 2 entirely.

I appreciate that this method is simple enough computationally do to with pencil and paper and without a computer. When I go out in general, I tend to bring a coin with me to use as a 2-sided dice for whenever I need to generate random numbers for whatever reason.

Something important to mention is that my definition for the input dice is flawed. I mention that the same input dice MUST be rolled each time but each roll MUST NOT be influenced by previous rolls. Unfortunately we live in the physical world where matter and chemistry impact other matter over time, so each dice roll will end up shaping the dice (flattening, chipping, etc...) in incredibly small ways that will change the result rates of the dice over time. We just ignore that and assume the dice is consistent every time. We basically handwave away the complications and assume the input dice is a "sphearicle cow in a vacuum".

[HTTPS] Sphearical Cow - Wikipedia

Because the dice changes in the physical world over time, I have decided that a little bit of wiggle room is fine in the case where I need to generate a random number but I don't have a coin or dice. The method I use if I don't have a coin or dice is looking at a poster or thing with words on it, then starting from the first word and counting the amount of vowels in a word as a coin flip. If the word has an even number of vowels, the result is "Heads", and if the result has an odd number of vowels, the result is "Tails". I assume the even/odd parity of vowels is correlated to the length of the word and that the length of the word is probably correlated to the length of the previous word or words, but I also assume it doesn't make enough of an impact to matter much, especially when I do the Von Neumann debiasing step. I'm essentially doing a manual hashing algorithm to the text and using the resulting hash as a random number within a given range.