
This will get you penalized for having a website that takes forever to load. This is what happens:

Googlebot requests page -> your webapp detects Googlebot -> you call the remote service and request that they crawl your website -> they request the page from you -> you return the regular page, with JS that modifies its look and feel -> the remote service returns the final HTML and CSS to your webapp -> your webapp returns the final HTML and CSS to Googlebot. That's gonna be just murder on your load times.

If this must be done, for static pages it should be done by Grunt at build time, not by a remote service. For dynamic content, it's best to do the PhantomJS rendering locally, on an hourly (or so) schedule, since it doesn't really matter whether Googlebot has the very latest version of your content.
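
A rough sketch of the "render locally on a schedule" idea, assuming a hypothetical `render` callable that wraps a local PhantomJS run. The class and method names are made up for illustration and are not part of any prerender module:

```python
import time
from typing import Callable, Dict, Iterable, Optional, Tuple


class SnapshotStore:
    """Holds pre-rendered HTML so that crawler requests never trigger a render."""

    def __init__(self, render: Callable[[str], str],
                 clock: Callable[[], float] = time.time) -> None:
        self._render = render  # hypothetical: e.g. shells out to a local PhantomJS
        self._clock = clock
        self._snapshots: Dict[str, Tuple[float, str]] = {}

    def refresh(self, paths: Iterable[str]) -> None:
        """Re-render every path; meant to be called from an hourly job."""
        for path in paths:
            self._snapshots[path] = (self._clock(), self._render(path))

    def get(self, path: str) -> Optional[str]:
        """Return the stored snapshot, or None if the path was never rendered."""
        entry = self._snapshots.get(path)
        return entry[1] if entry else None

    def age(self, path: str) -> Optional[float]:
        """Seconds since the snapshot was rendered, or None if there is none."""
        entry = self._snapshots.get(path)
        return self._clock() - entry[0] if entry else None
```

A cron job or timer would call `refresh`, and the request handler would only ever call `get`, so a Googlebot request never waits on a render or a network round trip.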

Or perhaps I'm mistaken and the Node module actually calls the service hourly or so and caches the results in the app, so it doesn't call the service during Googlebot crawls. If that's the case, I take back my objections, but I'd recommend updating the website to say as much.



If it doesn't cache, then besides latency, someone could send fake Googlebot requests and overload the prerender service, which is unlikely to be able to handle a lot of traffic.
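
One way to blunt the fake-Googlebot problem is to not trust the User-Agent header alone. Google documents a reverse DNS lookup followed by a forward lookup for verifying its crawler. Below is a minimal sketch of that check; the function name and the injectable `reverse`/`forward` parameters are my own choices for illustration, and the default forward lookup only handles IPv4:

```python
import socket
from typing import Callable, List

# Hostname suffixes that a genuine Googlebot reverse lookup should end in.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse_dns(ip: str) -> str:
    return socket.gethostbyaddr(ip)[0]


def _forward_dns(host: str) -> List[str]:
    # gethostbyname_ex is IPv4 only; IPv6 would need getaddrinfo instead.
    return socket.gethostbyname_ex(host)[2]


def is_real_googlebot(ip: str,
                      reverse: Callable[[str], str] = _reverse_dns,
                      forward: Callable[[str], List[str]] = _forward_dns) -> bool:
    """Reverse-resolve the IP, check the domain, then confirm with a forward lookup."""
    try:
        host = reverse(ip)
    except OSError:
        return False
    if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False
    try:
        # The hostname must resolve back to the same IP, otherwise the PTR is spoofed.
        return ip in forward(host)
    except OSError:
        return False
```

The lookups are slow themselves, so the result would need to be cached per IP. It only narrows who can trigger a render; it doesn't remove the need for caching.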


Pretty sure the load time problem can be mitigated by caching.


Best case scenario, you still have network trips going out to the service, so it's still not a great solution UNLESS the caching is done by your webapp, which is what I talked about at the end of my comment above.

Unless this works without adding network round trips on each request, it's not a great idea.


I think the Unix philosophy of "do one thing and do it well" applies here. There are already off-the-shelf caching solutions that do what you describe: for example, with Varnish you can serve cached pages immediately and update the cache contents in the background.

It would probably be better to use those than reimplement them in an uber-webapp.
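
For anyone unfamiliar with the "serve cached pages immediately and update in the background" behavior, here is a toy sketch of the idea. It only illustrates the pattern and is not a suggestion to replace Varnish with it; the class name and the `fetch` callable are hypothetical:

```python
import threading
import time
from typing import Callable, Dict, Set, Tuple


class StaleWhileRevalidate:
    """Serve the cached value at once; refresh expired entries in the background."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch  # hypothetical slow call, e.g. rendering a page
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._entries: Dict[str, Tuple[float, str]] = {}
        self._refreshing: Set[str] = set()

    def get(self, key: str) -> str:
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None:
                stored_at, value = entry
                stale = self._clock() - stored_at >= self._ttl
                # Start at most one background refresh per key.
                if stale and key not in self._refreshing:
                    self._refreshing.add(key)
                    threading.Thread(target=self._refresh, args=(key,),
                                     daemon=True).start()
                return value  # possibly stale, but returned without waiting
        # Cold miss: nothing to serve yet, so this one request pays for the fetch.
        value = self._fetch(key)
        with self._lock:
            self._entries[key] = (self._clock(), value)
        return value

    def _refresh(self, key: str) -> None:
        try:
            value = self._fetch(key)
            with self._lock:
                self._entries[key] = (self._clock(), value)
        finally:
            with self._lock:
                self._refreshing.discard(key)
```

Only the very first request for a page waits on the slow fetch; every later request gets whatever is in the cache, even if it is a little out of date.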


It's not a remote service. It's PhantomJS, which is WebKit rendering on your own server. Where did they say it was going to call a remote service?



