Hacker News

Thanks for making this open source. What about the index? Are you open sourcing that also?


If I can solve the logistics of publishing that data, then sure. Even in its most compressed form it's still on the order of 100 GB.

The intermediate goal is to have some standardized testing dataset of a couple of hundred megabytes to a gigabyte or so.


Like another commenter suggested, torrents might be a good solution once it's seeded.


Cool. Looking forward to seeing the intermediate dataset.

I think you should post a ToDo list on the git repo. People can then contribute their skills.


Yeah, that's a good idea. I'm looking at a bunch of ideas for reducing the friction to contributing; there's still a bit of work that needs doing in that area.



