Finally Found My Human Calculatable Hash Function

2024-06-12 - [53] 3:24

In my journal post "Hosts (Resolutions?) for 2024" from back in the end of December of last year, I wrote about how I was trying to find a human calculatable hash function to create a secure pencil and paper authentication scheme. Below is that journal post.

Hopes (Resolutions?) for 2024

After searching for quite a while, I found a system that should work well for me in a paper titled "Towards Human Computable Passwords" by Jeremia Blocki, Manuel Blum, Anupam Datta, and Santosh Vempala and published in 2017. The paper is linked below.

[HTTPS] Towards Human Computable Passwords

Motivation

I want to be able to update my gemini capsule and web server from another computer, for instance, a computer at a public library or a friend's computer if I don't have access to my own computers. At some point, I will probably add Titan support back to my gemini capsule server software "Bergelmir", which is fine if I'm using my own computers that have a gemini client like Lagrange, but I can't just download a "Small Web" browser on other people's computers. I also can't trust that there isn't some sort of malware or keylogger on other people's computers, so I don't want to be put in the situation where some malicious actor now has the ability to edit my website as they please.

Once my gemini client / web server has Titan support added to it, I could easily and pretty securely edit my site using a client TLS certificate to prove my identity, but I already have ssh access to my server, so I don't really need titan support.

I really would like to edit or add pages to my site from the public library computers though. This means I need to add some sort of authentication scheme to my website that only I can answer correctly. A shared secret a client TLS certificate to prove my identity, but I already have ssh access to my server, so I don't really need titan support.

I really would like to edit or add pages to my site from the public library computers though. This means I need to add some sort of authentication scheme to my website that only I can answer correctly. A shared password isn't enough for a secure authentication scheme though, as someone could look over my shoulder while I type in the password or the computer I am using could have a keylogger.

What I need is a human calculatable hash function. My server and I would both have a shared secret, then my server would give me random data (a challenge) that I would be able to manipulate in my head to produce a hash. The server would produce the same hash, and if my result and the server's result match, the server will know that I have the shared secret. The end result is that I have proven to my server that I am in fact me, and anyone snooping on the computer I use to edit my website won't know the shared secret or be able to successfully solve other random challenges.

Method

In the "Towards Human Computable Passwords" paper, they use base 10 numbers and a list of n images, all of which are associated with a random number and memorized by the human. It's not required to use base 10 or images though, so I used base 6 and a shuffled mapping of all 36 characters in my "bonewords" password scheme, so a random mapping of "abcdefghikmnoprswzAEFHLY234678+,-/?^" where each character is used one time each and each character is used, for example "i/c8-AEebgLsz+H2YFr6ph73^mdfk4,a?own".

{bonewords}

The details of the method I will be giving will be my particular method, as opposed to the base 10 method with images shown in the paper.

Generate Shared Secret

I personally want to be able to generate that random mapping without needing to use a computer. This mapping is equivalent to shuffling the 36 characters, so I am able to just use a deck of cards. Because there are 36 characters, I use the Ace through 10 of Clubs, Hearts, and Spades, along with the Ace through 6 of Diamonds. I then map Clubs to 1-10, Hearts to 11-20, Spades to 21-30, and Diamonds to 31-36. After I have those 36 cards, I thoroughly shuffle them by doing a bunch of riffle shuffled intersperced with overhand shuffles to change up the top and bottom cards, as riffle shuffles have a tendency of keeping the top and bottom cards near the top and bottom. How many shuffles should you do? Well, you could smush the cards around for a minute or a few minutes and square up the cards, or you could do like I do and do 2-4 riffle shuffles, followed by an overhand shuffle, followed by 2-4 riffle shuffles again, followed by another overhand shuffle, repeating that pattern until you have riffle shuffled 7-10 or more times. The goal is to just make sure the cards are good and shuffled without you being able to know where any of the 36 cards might be. Once you have the pack shuffled, map the cards so the Ace of Clubs = "a", 2 of Clubs = "b", 3 of Clubs = "c", ..., 4 of Diamonds = "/", 5 of Diamonds = "?", and 6 of Diamonds = "^", so the 36 bonewords characters in order. Now take the shuffled deck and write the characters in the order of the shuffled deck, for instance, in our example of "i/c8-AEebgLsz+H2YFr6ph73^mdfk4,a?own", the deck order would be 9 of Clubs, 4 of Diamonds, 3 of Clubs, ... 3 of Hearts, 7 of Hearts, 2 of Hearts. Write down that character order somewhere and memorize it. Give your server that information as well in a trusted computing setting.

Random Challenge

Now that you and your server have a shared secret, whenever you want to authenticate, the server will give you a randomly generated challenge. The challenge will be a random string of a particular size of characters that are found in the shared secret. This string is fully random, as in, any of the characters are allowed in any position of the string and repeats are also allowed. That means a challenge of all "a" characters is valid, although incredibly unlikely.

The random challenge string has 3 parts to it, 6 (or whatever base is being used) random characters as an index, then a set of random characters called k1, then a set of random characters called k2. The length of characters in k1 and k2 are important for security. The "Towards Human Computable Passwords" paper specifies that it will take approximately n^(min(len(k1)+1, (len(k2)+1)/2) correctly answered challenges before a polynomial time attacker would determine your shared secret. For a case where the length of k1 is 3 characters long and k2 is 7 characters long, it would take approximately 36^(min(3+1, (7+1)/2)) correctly answered challenges, which is equivalent to 36^(min(4,4)), which is equivalent to 36^4, which is 1679616. These numbers of course assume that this problem is an NP problem and that P != NP.

Let's use the challenge "36EmFpfpdp2-8fY4", where the 6 index characters are "36EmFp", k1 is "fpd", and k2 is "p2-8fY4". As a reminder, the shared secret is "i/c8-AEebgLsz+H2YFr6ph73^mdfk4,a?own", which we can split into groups of 6 to make our lives as people a little easier, so "i/c8-A", "EebgLs", "z+H2YF", "r6ph73", "^mdfk4", and ",a?own"

Step 1: The first step of solving the random challenge is to look at the values of k1 ("fpd") one at a time and find their position in the shared secret, then add those positions together and take the modulo 6 of the sum (use the answer 6 instead of 0 if the answer modulo 6 is 0). It's this modulo 6 that is the reason why we can split the shared secret into groups of 6 to make life easier. In this example, from k1, "f" is 4, because it's the 4th symbol in its group in the shared secret, "p" is 3, and "d" is 3. (4 + 3 + 3) mod 6 is 4, so our current value in solving the random challenge is 4.

Step 2: The second step of solving the random challenge is to take the result we got from the first step, and find that number character in the index characters ("36EmFp"). Our result from the first step is 4, so we take the 4th character of the 6 index characters, which is "m", and find its position in the shared secret modulo 6 (use the answer 6 instead of 0 if the value modulo 6 is 0). "m" is the 2nd character in its group in the shared secret, so our current value in solving the random challenge is 2.

Step 3: The third step of solving the random challenge is to look at the values of k2 ("p2-8fY4")one at a time and find their positions in the shared secret, then add those positions together along with the value of step 2 and take the modulo 6 of the sum (use the answer 6 instead of 0 if the answer modulo 6 is 0). The value of step 2 was 2. "p" is 3, "2" is 4, "-" is 5, "8" is 4, "f" is 4, "Y" is 5, and "4" is 6. (2 + 3 + 4 + 5 + 4 + 4 + 5 + 6) mod 6 is 3, so the final result is 3.

The answer to the random challenge is the result of step 3, so the answer is 3.

Counteract Brute Forcing

All of this work provides a single value from 1-6. That's trivial to brute force (just try 1, then 2, then 3, then 4, then 5, then 6... you're guaranteed to get one of those guesses right and by default have a 1/6 chance of guessing correctly). To prevent brute forcing, multiple random challenges are needed. The strength against brute forcing can be calculates as d^(challenges), where d is the base, and challenges are the number of challenges. For example, using base 6, as I have so far, and a 12 character authentication code (so 12 challenges) means a polynomial time attacker would need to guess 6^12, or 2176782336, authentication codes to guarantee they authenticate. There would be a 50% chance of guessing correctly within 1088391168 guesses in this case.

As long as the server changes up the challenges often enough, it would not be practical for a polynomial time attacker to guess the full authentication code.

Below is an example of 12 random challenges and their answers to produce a 12 character authentication code with the shared secret of "i/c8-AEebgLsz+H2YFr6ph73^mdfk4,a?ow":

36EmFpfpdp2-8fY4 = 3

pHp4Y2-4a6/,4/e, = 1

-E/kg8o7g2FLLH3z = 5

7niFH,Ygo^Ez4rz6 = 6

dYFdhd-sw6dnk-w^ = 6

/gkLshkzb,Hbpm+h = 5

4b4g-6egH+/4/-Fd = 2

p4cdk3eiFEYHYo6g = 3

^End^wiLi/Hrm2gh = 3

?HiFdararg7mnb^L = 2

gw,7L7zb,mY-pY/3 = 3

684^h-ezn-4H6bFd = 4

For this example, the authentication code would be "315665233234". I could either use "315665233234" or convert it via bonewords to "o7?ips", although converting it would likely take more mental capacity for me.

Initial Analysis

While I can't guarantee there isn't some weakness in this scheme, I can say that while running a few million random simulations, I get each result from 1 to 6 at virtually equal rates. The results seem heavily dependent on the randomness of the challenges.

As long as the paper is correct in the scheme's security and as long as I have a randomly generated 36 character shared secret, the amount of correct challenge answers needed to derive the shared secret will be dependent on the lengths of k1 and k2. In the case of the length of k1 being 3 and k2 being 7, I would need to correctly answer 1679616 challenges before my shared secret is essentially spent. With 12 character authentication codes, it would take approximately 1088391168 guesses to brute force the 12 character authentication code. Also with 12 character authentication codes, I divide the 167916 challenges value by 12, which means in my case I would be able to edit or add to my site 139968 times before my shared secret is essentially spent.

These numbers are plenty big enough for my needs.

Code Example

Here is a quick python script I wrote to generate 12 random challenges with the shared secret "i/c8-AEebgLsz+H2YFr6ph73^mdfk4,a?own":

import secrets

shared_secret = "i/c8-AEebgLsz+H2YFr6ph73^mdfk4,a?own"
auth_len = 12
auth = ""
k1_len = 3
k2_len = 7

for _ in range(auth_len):
    challenge = ""
    # Create challenge
    for _ in range(6 + k1_len + k2_len):
        challenge += shared_secret[secrets.randbelow(36)]

    step_1_val = 0
    for k1_val in challenge[6: 6 + k1_len]:
        # We add 1 to make value 1-indexed
        step_1_val += shared_secret.index(k1_val) + 1
    # Step 1 answer mod 6
    step_1_val %= 6
    # if step 1 value is 0, use 6 instead
    if step_1_val == 0:
        step_1_val = 6

    # Python is 0-indexed, so we subtract 1 to not overflow index
    step_2_char = challenge[step_1_val - 1]
    # We add 1 to make value 1-indexed
    step_2_val = shared_secret.index(step_2_char) + 1
    # Step 2 answer mod 6
    step_2_val %= 6
    # if step 2 value is 0, use 6 instead
    if step_2_val == 0:
        step_2_val = 6

    step_3_val = step_2_val
    for k2_val in challenge[6 + k1_len:]:
        # We add 1 to make value 1-indexed
        step_3_val += shared_secret.index(k2_val) + 1
    # Step 1 answer mod 6
    step_3_val %= 6
    # if step 3 value is 0, use 6 instead
    if step_3_val == 0:
        step_3_val = 6

    answer = step_3_val

    # Print challenge and answer
    print("%s = %d" % (challenge, step_3_val))
    auth += str(step_3_val)

# Print final authentication code
print("Auth Code: %s" % auth)