
To start with, only 7 of the 26 letters in the alphabet (J, K, L, M, N, O and Z) have an A-F in one of their hex digits. (Uppercase and lowercase letters differ only in the first digit, which is never higher than 7 in ASCII, so I'm ignoring case.) So in a text composed only of letters, with all letters equally likely, the frequency of bytes with an A-F would be 26.9%. Additionally, A-F occurs disproportionately in infrequently used letters. Adjusting for the frequency of letters in English[1], the expected frequency of bytes with an A-F is 21.7%.
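The 7-of-26 count is easy to check by brute force. Here is a minimal sketch; the helper name `has_hex_letter` is my own, and it only covers the uniform case, not the English letter-frequency adjustment.

```python
import string

def has_hex_letter(ch: str) -> bool:
    """True if the two-digit hex form of the character's ASCII code contains A-F."""
    return any(d in "ABCDEF" for d in format(ord(ch), "02X"))

# Uppercase only: lowercase differs just in the first hex digit (6/7 vs 4/5).
hits = [c for c in string.ascii_uppercase if has_hex_letter(c)]
uniform_share = len(hits) / len(string.ascii_uppercase)

print(hits)                    # letters whose hex code contains A-F
print(f"{uniform_share:.1%}")  # share if all 26 letters were equally likely
```

Running it over `string.ascii_lowercase` instead should pick out the same seven letters, which is why case can be ignored.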

The space character (0x20) also doesn't have an A-F, so that lowers the frequency further. If we assume that the frequency of spaces is the same as in "America Can Code " (17.6%), the expected frequency of A-F comes out to 17.9%. That's very close to the actual number in "America Can Code " (17.6%).
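For anyone who wants to reproduce the arithmetic, here is a rough sketch. The function name `af_byte_share` is made up for illustration, and the 21.7% letters-only figure is taken from the estimate above rather than recomputed from a frequency table.

```python
def af_byte_share(text: str) -> float:
    """Fraction of ASCII bytes whose two-digit hex form contains a-f."""
    data = text.encode("ascii")
    hits = sum(1 for b in data if any(d in "abcdef" for d in format(b, "02x")))
    return hits / len(data)

sample = "America Can Code "
space_share = sample.count(" ") / len(sample)  # 3 spaces out of 17 bytes

letters_only = 0.217  # assumed: the letter-frequency-adjusted estimate quoted above
expected = letters_only * (1 - space_share)    # spaces never contribute an A-F

print(f"spaces:   {space_share:.1%}")
print(f"expected: {expected:.1%}")
print(f"observed: {af_byte_share(sample):.1%}")
```

The observed value counts bytes, not individual hex digits, which matches how the percentages above are defined.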

[1] http://en.wikipedia.org/wiki/Letter_frequency#Relative_frequ...


