
> he can trivially test combinations of dictionary words in very short amount of time.

Explain the reasoning behind this, please.

Start with: You don't know the dictionary I used, but have to use one that seems 'good enough' (i.e. a superset of mine, if possible).

How many words are in there?

How many combinations can you create for 'two word phrases'? (You don't know the length of my phrase)

How many for three?

How many for four words?



In fact, even if the attacker:

1) Knows this scheme was used.

2) Knows the list of words you used.

3) Knows how many words you used.

You will still have that amount of entropy. Diceware entropy values are calculated with all three of these assumptions already made.
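The arithmetic behind those Diceware entropy values is just a logarithm. A quick sketch in Python (7,776 words and a five-word phrase are the standard Diceware defaults; the 2048/4 case is the XKCD comic's):

```python
import math

def passphrase_entropy(dict_size: int, num_words: int) -> float:
    """Entropy in bits, assuming each word is drawn uniformly and
    independently from the dictionary (the Diceware threat model)."""
    return num_words * math.log2(dict_size)

# Standard Diceware list: 7776 words (five dice rolls per word).
print(round(passphrase_entropy(7776, 5), 1))  # 64.6 bits
print(round(passphrase_entropy(2048, 4), 1))  # 44.0 bits (the XKCD figure)
```

Note that the attacker's knowledge of the dictionary, the word count, and the scheme is already priced into this number: the only secret is which words were rolled.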


> Start with: You don't know the dictionary I used, but have to use one that seems 'good enough' (i.e. a superset of mine, if possible).

People are likely to use a standard English dictionary. In my experience (which is exactly within this field) people use a fairly tight subset of the English vocabulary.

So I would be quite happy to test for a dictionary of, say, 100,000 words and be hopeful of a good hit rate (note that XKCD says common words, which is easily missed)

Our software has a test (which runs about third in its list of tests) which does dictionary combinations of up to three words (two words is quite a commonly used password, based on our statistics) with a dictionary size of 17,500. (Edit: sorry, I originally wrote 175,000; apparently it is an order of magnitude smaller, I checked with one of the engineers :)) This includes English words plus a few commonly used foreign/slang terms. The hit rate on this is fairly high.

(we crack document/windows passwords mainly)

You could of course choose deliberately obscure words to invalidate this - but they aren't so easy to remember (so people will tend not to).

If someone is going out of their way to secure a password, sure, you're going to hit a brick wall. But what every password scheme tends to forget is the "human factor" whereby people not concentrating on being secure will introduce attack vectors.


If the dictionary really has 100 000 words, you're looking down the barrel of about 50 bits of entropy for a three-word phrase.

In a more likely dictionary of the 5000 most commonly used words in the English language, you still get a three-word passphrase of about 37 bits of entropy. Make that a four-word passphrase, and you're back up around 49 bits.
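Those figures are easy to check with a couple of lines of Python (a sketch, using the dictionary sizes assumed above):

```python
import math

# Entropy of an n-word phrase drawn uniformly from a dictionary.
for dict_size, words in [(100_000, 3), (5_000, 3), (5_000, 4)]:
    bits = words * math.log2(dict_size)
    print(f"{dict_size:>7}-word dictionary, {words} words: {bits:.1f} bits")
```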


This is simply incorrect. If you really did have 100 000 arbitrary "characters" in your alphabet, the calculation would be correct. However, your alphabet follows a certain pattern: it's English text.

At that point it's easier to brute force the individual characters. English text has about 1 to 2 bits of entropy per character. Let's assume 1.5 bits per character on average. That means that to really get 52 bits of entropy for a 3 word phrase you need at least 35 characters in your 3 word phrase, or about 12 per word.

Your password is only as strong as the weakest link. In this case English is easier to brute force than your 100 000 character alphabet.

It's simply false to assume that "yes no one three" has 44 bits of entropy because it got randomly selected out of a 2048 word dictionary.


Yeah, no. If I have a dictionary of 100000 words, then each word represents about 17 bits of entropy. If I have three words, that makes 3 x 17 = 51 bits of entropy.


You're ignoring the fact that English text can be attacked separately from your dictionary. English text is very predictable: for example, e is much more common than other letters, and q is almost always followed by u.

I'm not making this up on my own either. Please check out http://en.wikipedia.org/wiki/Entropy_%28information_theory%2.... Let me quote the important part: The entropy rate of English text is between 1.0 and 1.5 bits per letter,[1] or as low as 0.6 to 1.3 bits per letter, according to estimates by Shannon based on human experiments.[2]

Consider the following example: You have a wordlist of 100 000 words. It seems only natural that log(100 000)/log(2) equals 16.6 bits of entropy per word. Now consider that you take three words out of that list completely at random. You get the words "a no we". Assuming 16.6 bits of entropy per word, you do indeed have to search through a space of 49.8 bits, but only if you attack it via the dictionary.

It is clear that in this case you can do a different attack. Instead of brute forcing the words you can brute force the characters on their own with a search space of a-z and space. This equals log(27^7)/log(2) or 33.2 bits. A lot less than 49.8 bits estimated when only considering a dictionary approach. In reality English text is so predictable that you don't have to search even close to 33.2 bits of entropy if you brute force it with an algorithm that is aware of English text. Assuming Shannon's 1.3 bits per character estimate this password has 9.1 bits of entropy.
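A short sketch of the three estimates for that example (the dictionary view, a naive character brute force over a-z plus space, and Shannon's low-end rate):

```python
import math

phrase = "a no we"  # three short words picked from a 100,000-word list
dict_bits = 3 * math.log2(100_000)       # dictionary-attack view
char_bits = len(phrase) * math.log2(27)  # brute force: a-z plus space
shannon_bits = len(phrase) * 1.3         # Shannon's low-end estimate
print(f"{dict_bits:.1f} {char_bits:.1f} {shannon_bits:.1f}")  # 49.8 33.3 9.1
```

The 7 characters (including the two spaces) are what make the character-level attack so much cheaper here.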

I understand that this is an edge case with very short words. But I chose it to try and show that there are other ways to attack the password, for instance by using a 27-character alphabet. This is cold hard math and therefore much easier to accept than the magic entropy estimation of English text. Once you see that this approach can reduce your entropy calculation, it's not that hard to accept that there might be more ways to reduce the entropy even further.


1) The predictability of the distribution of characters in the English language has nothing to do with this type of password: the symbols aren't characters, but words.

2) That figure of 1.3 bits of entropy per character only applies to English text, and the figure is low because there are a bunch of small words, like "and" and "the", that are regularly repeated. The entropy per character for words of 6 letters or more, not arranged in sentences, is a lot higher: about double, if I recall correctly.

So sure, just as I can expect to get brute-forced if I choose a PIN of 0000, I can get brute-forced if I choose a passphrase of 'and the in'. Good luck forcing "queens examine faulty charges" though.


I am merely suggesting that the entropy can be less than what is estimated by looking only at the dictionary.

Re 1: words still consist of characters.

Re 2: Certainly correct, but to ignore the possibility of English words having less entropy than it appears at first is odd, given the patterns English words often follow.

I'm interested in reading more about those entropy estimations, can you recall where you read about it? According to Applied Cryptography Shannon states that entropy per letter decreases as the text grows. Shannon estimates 2.3 bits per letter for chunks of 8 letters but it drops down to between 1.3 and 1.5 bits per character for 16 character chunks.

Applied Cryptography cites a paper by Shannon called "Prediction and Entropy of Printed English" in the Bell System Technical Journal from 1951. I have not personally read it yet but will try to find it in the near future.


The mistake you are making is that you keep on wanting to treat a passphrase made up of words as being a 'text' in the sense that Shannon was using, but it is not a text in the Shannon sense of the word.

Shannon was analysing real world messages to arrive at that figure, not a string of random words. Here's a way of thinking it through that should help you see the problem clearly.

Let's imagine that Alice has just used a dictionary to generate a passphrase. Furthermore, let's imagine that the dictionary in question is a collection of 6 character or longer words pulled from "Pride and Prejudice". I'm pretty sure Jane Austen tops the 5000 words needed for such a dictionary.

Now, in the simplest example, let's imagine that the passphrase is one word long. Bob is an attacker who knows the dictionary. He will guess the word in a maximum of 5000 tries; after 2500 he will have a 50% chance of having found it.

Charlie, a second attacker, isn't going to use words as symbols, he's going to try and brute-force the word just by throwing random characters at the problem. He doesn't know the length of the word, so he's going to have to try all lengths of the word starting from one letter and working up. I trust that you can see that Charlie is going to need a lot more than 5000 guesses to find the passphrase.

David is a bit smarter than Charlie. He decides to use a Markov chain with a 3-character context to generate his guesses, so that the generated passphrases start resembling English words. The Markov chain was trained on text from the Sydney Morning Herald. David is going to do a lot better than Charlie, but that Markov chain has to be capable of generating all of the words in Alice's dictionary, plus a bunch of other words, plus a bunch of gibberish non-words. He's clearly going to take more than 5000 attempts to find Alice's passphrase.
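For illustration only, a toy version of David's generator might look like the sketch below: an order-3 character Markov chain, trained here on a tiny three-word list rather than a newspaper corpus. The point is visible even at this scale: the chain can emit training words but also blends and non-words, so its candidate space is larger than the dictionary it was trained on.

```python
import random
from collections import defaultdict

def train(words, order=3):
    """Order-n character Markov model: map each n-char context to the
    characters observed to follow it ('^' pads the start, '$' ends a word)."""
    model = defaultdict(list)
    for word in words:
        padded = "^" * order + word + "$"
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model

def generate(model, order=3):
    """Emit one candidate word by walking the chain from the start state."""
    context, out = "^" * order, []
    while True:
        ch = random.choice(model[context])
        if ch == "$":
            return "".join(out)
        out.append(ch)
        context = context[1:] + ch

random.seed(1)
model = train(["banana", "bandana", "cabana"])
print(generate(model))  # may be a training word or a plausible blend
```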

Evan is smarter still. He trains his Markov chain using only words that are 6 characters or longer, and furthermore, he increases the context to 6 characters. He also teaches his Markov chain that word separators exist, and the generator is reset when a new word is started. Now we're talking: Evan's system will produce very few strings that are not actual words, but it will generate a bunch of words that were in the SMH but not in "Pride and Prejudice", so he's still going to need more than 5000 guesses to be sure that he's found Alice's passphrase. He still hasn't done better than treating the words as symbols.

There is one Markov chain attack that gets nearly equal performance. Take Alice's dictionary, use it to train a Markov chain generator that knows about word separators, and that knows that the words don't have any probability links between them. But now your Markov chain generator has just become a fancy way of picking words out of the dictionary. In other words, it has degenerated into a dictionary attack, not a brute force attack, and a dictionary attack is just another way of saying "treat the words as symbols", which means we're back to my original entropy calculations as being the optimal way of determining the entropy of the passphrase.

As I said at the start, your error is that you're trying to use randomly picked dictionary words as an English text in the Shannon sense of the word. Hopefully the example scenario that I just ran through will help you see why the distinction is important.


I understand randomly selected words are not the same as the English text Shannon is talking about. However, my point is that the entropy may be lower than what it appears to be. I'm not saying it is 0.6 bits per character (or any other number). I'm saying that unless your words are long enough, it is very likely that the entropy is lower than simply taking log(dictionary_size^words_per_passphrase)/log(2). The references to Shannon and Applied Cryptography were only made as supporting evidence that the entropy of English text is lower than it appears. I'm not claiming their exact numbers apply in this situation.

If we can't agree on that, then we must simply agree to disagree.


> I'm not claiming their exact numbers apply in this situation.

You see, the thing is, you kind of are making that claim. If entropy goes out to 3 bits per character, the one feeble point that you did have gets blown out of the water. The fact that you don't seem to understand that indicates that you really need to go back and reread a few books on Information Theory.

Look, for your own edification, try and come up with a scheme that will reliably beat a dictionary attack in terms of the number of attempts needed before finding a password taken from the dictionary. You could even write a simple program to test the idea. Take a dictionary of 5000 words with a length of 6 characters or longer (much shorter words than what your theory suggests should be secure). Select 100 words from the dictionary at random, and then run any attack of your devising against those words. If you can reliably get a better average number of attempts than 2500, I'll concede the point.
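That experiment is easy to sketch. Here is a minimal version in Python, using a synthetic stand-in for the 5000-word dictionary; any rival attack would replace `attempts_to_crack` and try to beat its average:

```python
import random

# Synthetic stand-in for a 5000-word dictionary of 6+ letter words.
dictionary = [f"word{i:06d}" for i in range(5000)]

def attempts_to_crack(secret: str) -> int:
    """A straight dictionary attack: try each word in a random order."""
    order = random.sample(dictionary, len(dictionary))
    return order.index(secret) + 1

random.seed(0)
trials = [attempts_to_crack(random.choice(dictionary)) for _ in range(100)]
print(sum(trials) / len(trials))  # averages near 2500, i.e. dict_size / 2
```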

Until then, I'm here to tell you that you don't understand information theory as well as you seem to think you do.


I actually didn't ignore the 'common' limitation (and didn't downvote you; I'm actually interested in how you came up with that).

Follow-up questions:

- What are the first tests, before this 3rd that tests for words? I assume tests for passwords of the first/left variety in the comic? Aren't they cheaper?

- 'Up to three words' reduces the exponent of possible combinations by one. Length/number of words is relevant.

Edit: Another issue. You say 'people forget the human factor', while you yourself propose something like '4 times Hacker News with substitutions' as better. How is that including the 'human factor'?


<scrubbed answer>

You know what; it's been so long since I played around with this stuff (it's even a separate company now, that we just consult for) that I'm way out of touch with my thought process :)

You're right; there is nothing particularly wrong with the suggestion that would make it intrinsically very weak for most uses.

I'd best stop commenting before I make a total mess :)

Sorry.


You can't test 175000^4 = 937890625000000000000 passwords.


Ahem. This is why I should check my numbers: I'm told it is a 17,500-word dictionary (and we check combinations of up to 3 words). Sorry about that :S
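For concreteness, a quick check of the corrected search space against the miscalculated one quoted upthread:

```python
# Three-word combinations from the corrected 17,500-word dictionary,
# vs. four words from the originally claimed 175,000-word one.
three_words = 17_500 ** 3
four_words = 175_000 ** 4
print(f"{three_words:.2e}")  # ~5.36e12: plausible for an offline attack
print(f"{four_words:.2e}")   # ~9.38e20: not feasible
```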


Ah! Now, I agree that if everyone follows this advice, it's going to be fairly easy to crack every password based on a small set of simple words.


100,000 words x 4 words per password ~ 66 bits of entropy

XKCD suggests 2048 words x 4 words per password ~ 44 bits of entropy

It could be good enough depending on the use case.


The XKCD criticism of the 'bad' password has the same problem. How does the hacker know I have this kind of password (starting with a real English word, for example)?


No, it doesn't have the same problem.

It criticizes a particular way to choose passwords, leading to a result that seems 'secure' and even quite good to lots of people. One that easily satisfies braindead corporate password rules.

The entropy given there is based on that way to choose a password and even explained, graphically.

If you choose your password totally differently and from different sets of characters, then the number will be off. That's not a surprise, though?



