Human generated semi-random binary stream (zeroone.io)
32 points by levlaz on March 8, 2015 | 42 comments


It would be neat to add some randomness-measuring statistics of the last N bits: like counts of 1s and 0s; digraph counts for 00, 01, 10, and 11; length of longest 0/1 repetitions; or more sophisticated tests.
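
A rough sketch of what those statistics could look like, assuming the last N bits are available as a string of '0' and '1' characters. `bitStats` is a made-up helper name, not something from the site's code, and the digraph counts here use overlapping pairs, which is only one of several reasonable conventions.

```javascript
// Hypothetical helper (not part of zeroone.io): summary statistics over a
// string that is assumed to contain only '0' and '1' characters.
function bitStats(bits) {
  const counts = { "0": 0, "1": 0 };
  const digraphs = { "00": 0, "01": 0, "10": 0, "11": 0 };
  let longestRun = 0;
  let currentRun = 0;

  for (let i = 0; i < bits.length; i++) {
    const b = bits[i];
    counts[b]++;
    // Overlapping digraphs: every adjacent pair is counted once.
    if (i > 0) digraphs[bits[i - 1] + b]++;
    // Extend the run if this bit repeats the previous one, else restart it.
    currentRun = i > 0 && bits[i - 1] === b ? currentRun + 1 : 1;
    if (currentRun > longestRun) longestRun = currentRun;
  }
  return { counts, digraphs, longestRun };
}

console.log(bitStats("0011101"));
```

For "0011101" this gives three 0s, four 1s, digraph counts of 1/2/1/2 for 00/01/10/11, and a longest run of 3.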


My first thought along these lines is to also dump a whitened version of the provided entropy that might be more useful than the raw bits.


My thought also: pass it through a Von Neumann extractor and run the Diehard test suite on it.


I will try this! Someone sent me an email proposing to run a carver on the stream.


So, this…

  3V›ÿÿª§ò…·wudÕLÜØE(‰@À‚ï¦ì€µQ$"`$´´$¤€ª @€az«hÀ€€@( @€ @ ¢KB€‚Bˆ@ Œ/
…isn't an "ascii stream". You can look at the source code for "covertBinaryToAscii", and it's really converting 8 bits of random data at a time into one of the first 256 Unicode code points.

(Also, you can avoid the need to escape special characters by using textContent[1] over innerHTML, although that comes with the catch of only being supported in IE 9 and later…)

[1]: https://developer.mozilla.org/en-US/docs/Web/API/Node/textCo...
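
To illustrate the conversion described above, here is a minimal sketch of turning 8 bits at a time into one of the first 256 Unicode code points. `bitsToText` is a hypothetical re-implementation for illustration; the site's own function may well differ in its details.

```javascript
// Hypothetical re-implementation for illustration only.
// Input is assumed to be a string of '0' and '1' characters.
function bitsToText(bits) {
  let out = "";
  // Consume complete 8-bit groups only; a trailing partial group is dropped.
  for (let i = 0; i + 8 <= bits.length; i += 8) {
    const value = parseInt(bits.slice(i, i + 8), 2); // 0..255
    out += String.fromCharCode(value); // code points U+0000..U+00FF
  }
  return out;
}

console.log(bitsToText("0100100001101001")); // "Hi"
console.log(bitsToText("11111111").codePointAt(0)); // 255, i.e. U+00FF
```

Values from 128 to 255 land on code points like U+00FF, which is why the output contains characters that are not in the 7-bit ASCII table.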


It is ASCII (edit: ok, as per the replies, more like poorly-defined Extended ASCII since it uses 8 bits instead of 7). For every character that has an ASCII value, the Unicode code point and the ASCII value of that character are the same. Unicode was designed with backward compatibility in mind.

https://en.wikipedia.org/wiki/Unicode


What you say about Unicode and ASCII sounds correct. But many of those characters are not actually ASCII. See this table[0]; none of "ÿª§ò" are on it

[0] https://en.m.wikipedia.org/wiki/ASCII


I was going to say that you've pointed to the 7-bit ASCII chart and you want the 8-bit Extended ASCII chart. But the Wikipedia chart for Extended ASCII doesn't have those characters either, so maybe it's extended ASCII and code pages?

https://en.wikipedia.org/wiki/Extended_ASCII

Weirdly, because neither chart has the smiley face characters I don't find them trustworthy charts. I am obviously wrong!


As the Wikipedia article states, there is no single thing called "the" extended ASCII. It is a nonstandard term that can refer to any of the multitude of 8-bit encodings that contain ASCII as a subset.


Heck, even UTF-8 could be considered "extended ASCII".
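
A quick way to see this, sketched with the standard `TextEncoder` (which always produces UTF-8): ASCII characters come out as a single byte equal to their ASCII value, while anything above 127 takes more than one byte.

```javascript
const encoder = new TextEncoder(); // always encodes to UTF-8

// An ASCII character is a single byte equal to its ASCII value.
console.log(Array.from(encoder.encode("A"))); // [ 65 ]

// U+00FF is outside ASCII, so UTF-8 uses two bytes for it.
console.log(Array.from(encoder.encode("\u00ff"))); // [ 195, 191 ]
```

So UTF-8 agrees with ASCII on the first 128 values, but unlike the 8-bit code pages it does not map one byte to one character above that.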


I played with this a few days ago when it was submitted as "Show HN" (but only got 3 points).

It was originally vulnerable to JavaScript injection (with potential for XSS), since the "ascii stream" had no protection against arbitrary HTML, which could be injected by binary-encoding it and sending it with a little script. I only had time for a PoC that injected goat pictures: the next day, I tried making a more fleshed-out potential-XSS demonstration, and the author fixed the vulnerability while I was playing with it.

There's still a (disputable) glitch that I'd like to point out: the ASCII stream is different for everyone, since it's rendered client-side and simply uses the first available bit on page load as point of reference to know when a byte starts.

This "frameshift" glitch obviously marred my injection demonstration, but it's still somewhat annoying for anyone who wants to demo their l33t skills, as other viewers only have 1/8 chance of seeing any binary-encoded text that we send. So currently, the ugly way of forcing a message to be seen is by sending it 7 times, once for every possible alignment. (But why would anyone do that, right?)

Edit: It also looks like nothing is working anymore today.
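
The frameshift effect can be sketched like this, assuming each viewer decodes the stream in 8-bit groups starting from whatever bit they happened to receive first. Both helpers are hypothetical and written for illustration, not taken from the site's source, and `textToBits` assumes characters below U+0100.

```javascript
// Hypothetical helpers for illustration; not taken from the site's source.
// Assumes every character has a code below 256 so it fits in 8 bits.
function textToBits(text) {
  return Array.from(text, (ch) =>
    ch.charCodeAt(0).toString(2).padStart(8, "0")
  ).join("");
}

// Decode 8-bit groups starting at the given bit offset.
function decodeAtOffset(bits, offset) {
  let out = "";
  for (let i = offset; i + 8 <= bits.length; i += 8) {
    out += String.fromCharCode(parseInt(bits.slice(i, i + 8), 2));
  }
  return out;
}

// A message that starts 3 bits into the stream, decoded by viewers whose
// byte boundaries fall at each of the eight possible bit offsets.
const stream = "101" + textToBits("hi");
for (let offset = 0; offset < 8; offset++) {
  console.log(offset, JSON.stringify(decodeAtOffset(stream, offset)));
}
```

Only the viewer at offset 3 sees "hi"; the other offsets decode to unrelated characters.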


Creator here -

There are lots of tiny bugs in this experiment. I actually started it from the regular Node.js chat tutorial, and figured it would be fun to create a femto-blogging platform, as you can see on the about page.

I'm still working on fixing bugs and improving rate limiting.


Neat. What would the output look like with some Von Neumann whitening in the mix? Also,

    <troll>
    in b4 someone writes a JS app that uses this to generate crypto keys
    </troll>


Someone must have written a script because it is almost all 0 now.


"almost all 0" is still random. :)


No it isn't. If you can predict with more than 50% what the next bit will be - it's not random.


Or you're lucky. One time a friend of mine was describing his fool-proof plan to win at Roulette. I jokingly asked, "What, double your bet when you lose?". He replied—in all seriousness—"No, triple it!"

I then argued with him about the money he would likely lose implementing his plan. He said, "what are the odds the ball will land on red five times in a row?". (We were ignoring the existence of the green 0 and 00). I took out a quarter, flipped it seven times, and it landed heads every time. This happened straight-away.

That was a random sequence, it was all 0s, and I'd like to think he was lucky that it happened that way, and convinced him to abandon his plan.

But I was also lucky. I had intended to demonstrate this, and was prepared to be flipping the coin hundreds of times until the run of 0s came up. You could say I was predicting the next result correctly 100% of the time on those first 7 flips. But my ability to predict the results didn't show their non-randomness, instead it showed my "luckiness". Which really means they weren't predictions at all, I guess.


That's called the Martingale betting system, and yeah, it requires a gambler with infinite wealth.


And a table with no maximum betting limit.
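
For a feel of the arithmetic, here is a small sketch that ignores the green pockets, as the story above does. Both function names are made up for illustration. It adds up what has been staked after a losing streak when the bet is multiplied after each loss, and how likely such a streak is on a fair 50/50 bet.

```javascript
// Hypothetical helper: total staked after `losses` consecutive losing bets,
// starting at `base` and multiplying the bet by `factor` after each loss.
function totalStaked(base, factor, losses) {
  let bet = base;
  let total = 0;
  for (let i = 0; i < losses; i++) {
    total += bet;
    bet *= factor;
  }
  return { total, nextBet: bet };
}

// Probability of `n` identical outcomes in a row on a fair 50/50 bet.
function streakProbability(n) {
  return Math.pow(0.5, n);
}

console.log(totalStaked(1, 2, 5)); // { total: 31, nextBet: 32 }
console.log(totalStaked(1, 3, 5)); // { total: 121, nextBet: 243 }
console.log(streakProbability(5)); // 0.03125
console.log(streakProbability(7)); // 0.0078125
```

A five-loss streak happens about 3% of the time, and with tripling it already puts 121 units on the table with a 243-unit bet due next, which is where finite wealth and table limits bite.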


Just because it's biased towards 0 it doesn't mean there's no entropy in it. Even the raw output of an entropy source based on radioactive decay or thermal noise is biased.

To generate a highly random output that appears independent of the source and uniformly distributed, a randomness extractor [1] has to be applied. The best known is the Von Neumann extractor.

[1] http://en.wikipedia.org/wiki/Randomness_extractor
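
A minimal sketch of the Von Neumann extractor, assuming the input is a string of '0' and '1' characters. The function name is made up for illustration. Note that the unbiased-output guarantee only holds when the input bits are independent with a fixed bias, which clicks from people and scripts probably are not.

```javascript
// Von Neumann extractor over a string of '0'/'1' characters.
// Assumption: input bits are independent with a constant bias; otherwise
// the output is not guaranteed to be unbiased.
function vonNeumannExtract(bits) {
  let out = "";
  // Look at non-overlapping pairs; a trailing unpaired bit is ignored.
  for (let i = 0; i + 1 < bits.length; i += 2) {
    const a = bits[i];
    const b = bits[i + 1];
    // "01" -> 0 and "10" -> 1; "00" and "11" are discarded.
    if (a !== b) out += a;
  }
  return out;
}

console.log(vonNeumannExtract("0010110111")); // "10"
```

A stream that is almost all 0s mostly yields discarded "00" pairs, so the output is much shorter than the input.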


You're talking about cryptographically secure randomness. An ordinary random source can really produce any sequence, including a repeating string of 1's or 0's. The infinite monkey theorem proves it :)


I'm of the opinion that any definition of random string that says the same string S is random or not depending on how we got it is just silly.

IMHO random string = string that has Kolmogorov complexity = length of this string.

Almost all strings we get from uniformly distributed random variables are random, but not all.


I was curious if there was any rate limiting/filtering being used... Since someone else noticed I guess there isn't!


I started implementing one, but apparently it's not behaving properly. Working on it right now...


I was wondering how long it would take for someone to do that :)


this is the least random "random" binary stream I've ever seen in my life. at the moment it's being gamed and people are submitting 99% 1's. (it looks like 111111111111111111111111111111110111111111111111111 with just a few 0's thrown in.)

it would work a lot better if the server made a pseudorandom bit that it sent to the user for the user to xor their choice by on client-side, and then out of spite undid that bit in half the cases on server-side.

edit: screenshot - http://imgur.com/TLUN8AT


I wonder how close to true random we would approach if several people had automated scripts sending 1's or 0's to the script?

What about if they were sending a random string themselves?

What if they were sending a script that analyzed the last so many entries and tried to provide the opposite, to have some particular desired ratio?

I wonder, given latency and other network features, could this ever achieve true randomness, despite being generated from actors with an agenda?


Make sure you scroll down to view the ASCII and hex streams too. Hopefully someone will be encoding something there.

28kB around 9:30PM MT.


coming up next - Twitch plays Semi-Random-Binary-Stream


What about the inverse.. Generate a semi-random-binary-stream from Twitch Plays $RANDOM?

Guaranteed randomness! /s


In FF, every click draws a weird selection rectangle around the button.


setInterval(function(){$(".zero.button").click()}, 10);


  var b = io.connect();
  setInterval(function() {
    // Send the byte 00001001 one bit at a time.
    "00001001".split('').forEach(function(n) { b.emit('input', n); });
  }, 10);

I found it interesting to see how hard I had to hit the server to (a) get all my requests in before another person's request landed in the middle, and (b) have my 8 bits (or however many I was sending, didn't have much luck past 40 bits) fall at the start of an ASCII character, as opposed to being off by n bits that were already on the stack.

Thankfully websockets/socket.io ensures ordering so I didn't have to worry about them becoming unordered like I would if it was using basic http requests.


setInterval(function(){Math.random() > .5 ? $(".zero.button").click() : $(".one.button").click()}, 80);

ヽ༼ຈل͜ຈ༽ノ


var s = io.connect(); var loop = function() { s.emit('input', '0'); setTimeout(loop, 0); }; loop();


I see mostly ones. Reminds me of http://dilbert.com/strip/2001-10-25


To follow current trends, we should call this "a curated, hand-picked random binary stream", no?


I can click the buttons, but the number of generated bits does not increase. It does not increase at all.


  b = $(".zero");
  for (i = 0; i < 100000; i++) {
      b.click();
  }


Please provide the user-generated file for download.


It will be done! Here is the roadmap for this project:

* switch to Redis instead of MySQL

* keyboard input

* better rate limiting

* stats: 0/1 proportions

* stats: geolocalized data

* stats: hourly distribution (heatmap)

* entropy calculation

* API to get json data of latest n bits

* bits -> grey scale image generated in realtime ...
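
For the "0/1 proportions" and "entropy calculation" items, one possible starting point is the Shannon entropy of the bit frequencies. This is only a sketch with a made-up function name. It looks at the proportion of 1s alone, so a perfectly predictable stream like "0101..." still scores a full 1 bit per symbol.

```javascript
// Hypothetical helper: first-order Shannon entropy, in bits per symbol,
// of a string of '0'/'1' characters. Only the proportion of 1s is used,
// so correlations between neighbouring bits are not detected.
function bitEntropy(bits) {
  if (bits.length === 0) return 0;
  let ones = 0;
  for (const b of bits) if (b === "1") ones++;
  const p = ones / bits.length;
  // A stream of all 0s or all 1s carries no entropy by this measure.
  if (p === 0 || p === 1) return 0;
  return -(p * Math.log2(p) + (1 - p) * Math.log2(1 - p));
}

console.log(bitEntropy("0101")); // 1
console.log(bitEntropy("1111")); // 0
console.log(bitEntropy("0001").toFixed(3)); // "0.811"
```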


It breaks down rather ungracefully when loaded without JS on.



