
Sure -- you can level multiple complaints.

What about the failed pages? How about shoving those on a queue and retrying n times with exponential backoff between attempts? What about the total number of failed pages? What about failed pages by site? And so on.
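
For what it's worth, the retry queue is only a few lines. A rough sketch, not a finished crawler: `fetch` is a hypothetical callable that raises on failure, and the function name, parameters, and defaults are all made up for illustration.

```python
import collections
import time
from urllib.parse import urlparse


def crawl_with_retries(urls, fetch, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Fetch each URL, requeueing failures with exponential backoff.

    `fetch` is an assumed callable that returns page content or raises.
    Returns (results, failed_by_site).
    """
    # Each queue entry is (url, number of retries already used).
    queue = collections.deque((url, 0) for url in urls)
    results = {}
    failed_by_site = collections.Counter()

    while queue:
        url, attempt = queue.popleft()
        if attempt:
            # Retry 1 waits base_delay, retry 2 waits 2x, retry 3 waits 4x, ...
            # Sleeping inline blocks the whole queue; fine for a sketch.
            sleep(base_delay * 2 ** (attempt - 1))
        try:
            results[url] = fetch(url)
        except Exception:
            if attempt < max_retries:
                queue.append((url, attempt + 1))
            else:
                # Out of retries: count the failure against the page's host.
                failed_by_site[urlparse(url).netloc] += 1

    return results, failed_by_site
```

The total number of failed pages is then `sum(failed_by_site.values())`, and the per-site breakdown is the counter itself.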

But so what -- the principle is still sound. All I described is still a 100 line Python script, written in an afternoon, instead of 3 weeks of work bringing up EMR, installing and configuring Nutch, figuring out network issues around EMR nodes talking to the commodity internet, installing a persistent queue, performing remote debugging, building a task DAG in either code or (god help you) Oozie/XML, and on and on.



Anybody can throw some crap together and make it stick. And it's a perfectly valid solution.

My issue is when criticism is laid against solutions that are actually engineered for supportability and extensibility. Those qualities are arguably far more important than execution time.


I can't figure out why you're arguing against standard unix tools/idioms in the name of supportability and extensibility. It defies logic.


I think, in many people's minds, extensibility == pain: either lots of code configuration (hello java, ejb), or xml (hello hadoop, java, spring, ejb), or tons of code (hello java, c++), etc. When nice languages don't make things painful, it sometimes feels like it's wrong, or not really enough work, or in some other way insufficient. But people can mistake the rituals of programming for getting actual work accomplished.

:shrug: just .02


I imagine what would have happened to Linux if Linus designed it with supportability and extensibility in mind.


Simple: because std utils are programs that do what they're supposed to do. If a problem's bounds are well within the domain a std util was defined for, then it's all good. Supportability and extensibility are way too generic for you to draw a line saying std utils can handle them all. After all, they are programs, not programming languages.



