Hacker News

I agree. I wonder if it is possible to get hired as a data scientist as easily if you haven't worked on big data before.

Or, I guess programmers and engineers could start using the big data tools even when they aren't needed. Has anyone run Hadoop on a single (multi-core) machine for this purpose?



I doubt it, simply because it's so easy to find big datasets to work on. The work doesn't have to be professional; that's the nice thing about a data-driven profession.

Check out http://www.kdnuggets.com/ for links to large datasets to work on. There are also some on Amazon.

Also, yes, you can certainly run Hadoop on a single instance, but once you get into "real big" sizes you'll need a cluster to demonstrate expertise, whether that's machines at your house, a set of VPSes, EC2, or whatever.
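
If you just want a feel for the programming model before installing anything, here's a rough single-process sketch of the map / shuffle / reduce steps as a word count. To be clear, this is plain Python and not Hadoop's API: the names mapper, shuffle, reducer and word_count are made up for illustration, and a real Hadoop job would distribute these steps across workers.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def mapper(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (word, 1) for every whitespace-separated word."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group every emitted value under its key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse all values for one key into a total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence, all in one process."""
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, only here to show the call.
    print(word_count(["big data", "Big tools"]))
```

Once that shape makes sense, moving the same logic onto a single-node Hadoop install is mostly a matter of packaging it the way Hadoop expects.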


The Cloudera packages make it extremely easy to get Hadoop up and running and to start processing sample data available on the web.



