1) Fair point. The justification is 'laziness'. The data is from 2010-10-05 04:00-05:00 GMT (late night PDT).
2) It uses a Python script to save the Twitter stream to bz2-compressed JSON files, one per hour, and another script to analyze one of those files and find the top values for any JSON field (in this case, 'source'). Pretty simple, really.
3) Sure, that would be better.
4) Imperfect != Useless. I haven't seen any other recent data on Twitter client usage, so I thought this might be interesting. For instance, I was surprised by how popular the website is compared with 3rd party clients.
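For illustration, the analysis step from point 2 could look roughly like the sketch below. This is not the actual script: it assumes the bz2 file holds one JSON object per line, and the function name `top_values` and the file name in the usage note are made up for the example.

```python
import bz2
import json
from collections import Counter


def top_values(path, field, n=10):
    """Return the n most common values of `field` in a bz2 file of JSON lines."""
    counts = Counter()
    # bz2.open in text mode decompresses on the fly, line by line.
    with bz2.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank keep-alive lines
            try:
                obj = json.loads(line)
            except ValueError:
                continue  # skip truncated or malformed records
            if not isinstance(obj, dict) or field not in obj:
                continue  # skip records that lack the field
            value = obj[field]
            # Counter keys must be hashable, so serialize nested values.
            if isinstance(value, (dict, list)):
                value = json.dumps(value, sort_keys=True)
            counts[value] += 1
    return counts.most_common(n)
```

Usage would be something like `top_values("2010-10-05-04.json.bz2", "source", 20)` (hypothetical file name), which returns a list of `(value, count)` pairs sorted by count.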
If you need better/more results, contact me and maybe I can provide them commercially.
Always Be Closing,
rphlx