Most downloads of the entire Wikipedia per country (kiwix.org)
45 points by WithinReason on March 22, 2022 | 8 comments


I'm doing my part o7

It's seriously a very interesting and useful dataset that you can do a lot of fun stuff with. If you grab one of the ZIMs without pictures, it's a very manageable size too, just a few dozen gigabytes compressed, and there is reasonably good library support in many languages.

That last point doesn't hold for Java. The only library I could find for that was this: <https://github.com/openzim/libzim>. It's antique, extremely poorly optimized, and lacks support for newer compression schemes. I have fixed the performance and added support for zstd compression, but I haven't published the code, as it's far from finished and major features in the original codebase are very broken. I'll get around to sharing the code some day, but right now it's basically permanently mid-surgery, as I've only patched it far enough to extract all or specific files. If anyone wants a copy of this code regardless of its state, give me a holler.
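For anyone curious what reading a ZIM involves at the lowest level, here is a minimal Python sketch that parses only the fixed 80-byte file header. It assumes the header layout as I understand it from the openZIM format description (little-endian, magic number 72173914); the names `ZimHeader` and `parse_zim_header` are made up for illustration, and this is not the code mentioned above. It does not touch clusters or decompression.

```python
import struct
from typing import NamedTuple

# Assumed layout of the 80-byte ZIM header (little-endian):
# magic u32, major u16, minor u16, uuid 16 bytes, entry count u32,
# cluster count u32, four u64 pointers (URL list, title list,
# cluster list, MIME list), main page u32, layout page u32,
# checksum position u64.
ZIM_MAGIC = 72173914  # 0x044D495A
HEADER = struct.Struct("<IHH16sIIQQQQIIQ")


class ZimHeader(NamedTuple):
    """Hypothetical container for a few header fields."""
    major_version: int
    minor_version: int
    entry_count: int
    cluster_count: int
    mime_list_pos: int
    main_page: int


def parse_zim_header(raw: bytes) -> ZimHeader:
    """Parse the first 80 bytes of a ZIM file into a ZimHeader."""
    if len(raw) < HEADER.size:
        raise ValueError("need at least 80 bytes for a ZIM header")
    (magic, major, minor, _uuid, entries, clusters,
     _url_ptr, _title_ptr, _cluster_ptr, mime_pos,
     main_page, _layout_page, _checksum_pos) = HEADER.unpack_from(raw)
    if magic != ZIM_MAGIC:
        raise ValueError("not a ZIM file (bad magic number)")
    return ZimHeader(major, minor, entries, clusters, mime_pos, main_page)
```

Usage would be something like `parse_zim_header(open("wikipedia.zim", "rb").read(80))`, where the filename is a placeholder for whichever ZIM you downloaded.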


Interesting that Russia is at almost 2x the next country (USA).


I'd do exactly this if I were worried about losing connectivity


Russia has already written to the Wikimedia Foundation demanding that they take down Russian Wikipedia's well-sourced and factual article on the *cough* special operation. Wikimedia said "lol no," of course.


There have been worries that Russia might soon ban Wikipedia, so people have been downloading it.


Looking at the stats, they are getting a lot of traffic from the tjournal.ru domain, which seems to host a Russian article explaining how to download Wikipedia in the expectation that it will soon be blocked. You can see the HTTP referrers here: https://stats.kiwix.org/index.php?module=CoreHome&action=ind...


Curiously, that's the relationship between the first and second highest frequencies for the Zipfian distribution. However, third place and beyond are much smaller than they should be under that distribution.
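To make the comparison concrete, here is a small Python sketch of what a Zipfian distribution predicts. It assumes the classic form with exponent s = 1, where the frequency of rank k is proportional to 1/k; the function name `zipf_frequencies` is my own, and no actual download numbers are used.

```python
def zipf_frequencies(n: int, s: float = 1.0) -> list[float]:
    """Return normalized Zipf frequencies for ranks 1..n (exponent s)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    weights = [1.0 / rank ** s for rank in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


freqs = zipf_frequencies(5)

# Ratio of rank 1 to each later rank; with s = 1 this is just the rank
# itself, so first place is 2x second place and 3x third place.
ratios = [freqs[0] / f for f in freqs]
print(ratios)
```

So under this assumption, third place would sit at about one third of first place, fourth at one quarter, and so on; a list where those later places fall well below that is steeper than Zipf with s = 1.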




