Hacker News

For a long time I assumed that Google indexed pretty much everything, and it was only a question of providing a specific enough set of search terms to drag up older content.

But what you hint at might be more correct these days. They are running a reverse wayback machine in that anything not changed in the last year gets removed. If you click the advanced search, it's "updated within" and the max timeframe is a year.

In fact it seems the date range example doesn't even work: https://developers.google.com/custom-search/docs/structured_...

If I fiddle with it, it returns a result, but I see a hit from just a few days ago at the top...



> They are running a reverse wayback machine in that anything not changed in the last year gets removed.

Sometimes I wish that were true! Try Googling for, say, PostgreSQL documentation and the top result will often be for a 10-year-old version of the software.


Nitpick: that's kind of a non-example, because the official Postgres docs let you swap versions more or less seamlessly. IME I will click the top result, then click the correct version for my Postgres SE queries. 2 clicks, 0 scrolling.


You're right. I tend to click back, then revise my search to include the version, and then click the result, so Google gets the message that when I search for Postgres docs, I want the most recent version. I have no idea if this actually works, but I heard Google uses bounces to determine relevancy, so I thought it was worth a shot.


Might as well piss in the wind. The number of back-links from different sources probably has a much larger effect.


Sure, but 1 click and 0 scrolling would be better!


Try the "I'm Feeling Lucky" button!


Before the omnibar became popular, I used to use Google as my homepage, and I'd type website keywords then hit "I'm Feeling Lucky" to get to my frequent websites. I think bookmarks took up too much space vertically, so this was my solution /shrug


Well, it might be easy to switch, but it is a good example.

Why is it that Google is thinking the older page is more relevant? Does PageRank outrank content (and Google is oblivious to similar pages that have different versions?)


One would expect that with their resources they could figure out for which topics recency matters and for which it doesn't.


that's why I always filter by last year


> For a long time I assumed that Google indexed pretty much everything,

They did that for a long time, but some years ago the index grew so big that they started restricting it. I think the general timeframe is 10 years or less since the last update.

> If you click the advanced search, it's "updated within" and the max timeframe is a year.

Because it makes no sense to go further. For older content you can define individual date ranges. And yes, it works fine for me. I tested a search for 2015 just now, and the first page had entries all from 2015.

> In fact it seems the date range example doesn't even work: https://developers.google.com/custom-search/docs/structured_....

All those examples are not working. Wasn't custom search retired some years ago?
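
For anyone who wants to script the individual date range mentioned above, here is a minimal sketch of building such a search URL. It assumes the `tbs=cdr:1,cd_min:...,cd_max:...` query parameter that the regular web search UI appears to use for custom ranges; as far as I know that parameter is undocumented, so treat it as an assumption that may change. The helper name `date_range_search_url` is made up for illustration.

```python
from datetime import date
from urllib.parse import urlencode


def date_range_search_url(query: str, start: date, end: date) -> str:
    """Build a Google web-search URL restricted to a custom date range.

    Assumption: uses the undocumented `tbs=cdr:1,cd_min:M/D/YYYY,cd_max:M/D/YYYY`
    parameter seen in the search UI; it is not an official API and may change.
    """
    if start > end:
        raise ValueError("start must not be after end")

    def fmt(d: date) -> str:
        # M/D/YYYY without zero padding, as the UI-generated URLs appear to use
        return f"{d.month}/{d.day}/{d.year}"

    tbs = f"cdr:1,cd_min:{fmt(start)},cd_max:{fmt(end)}"
    # urlencode percent-escapes the colons, commas, and slashes in tbs
    return "https://www.google.com/search?" + urlencode({"q": query, "tbs": tbs})


if __name__ == "__main__":
    # Restrict a search to calendar year 2015
    print(date_range_search_url("postgresql documentation",
                                date(2015, 1, 1), date(2015, 12, 31)))
```

This only builds the URL; whether the results actually honor the range is up to Google, which is the whole point of this thread.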


This is really interesting, I never thought of it this way. Thanks for sharing.



