We still need search engines to crawl the web and discover new content.
If you rely on an LLM as an all-knowing Oracle then you're dependent on what data it was trained on, how often it gets updated, and whether the web content you were hoping to find, or some close approximation to it, is statistically regenerated by the LLM or not.
I think LLM-augmented search such as Bing or perplexity.ai is the way forward.
Perplexity.ai is interesting as it's very capable, and despite being based on one of the GPT-N models seems to generate output quite a bit different to Bing/chat.
In a sense traditional search is just a really, really dumb LLM. They both ingest a bunch of text and build an "index" based on the text.
The traditional search model spits out links and short summaries, based on crude-ish matching of the search terms, whereas the LLM spits out actual answers (derived from the content corpus) based on sophisticated understanding of the query (though, and this is currently a big caveat, it can also just lie to you!).
But everything the LLM needs to do traditional search is already in there. The main problem to be solved is just getting it to lie less; another problem is getting very frequent content updates, like search indexes do.
Even so, LLMs are already better than traditional search engines for many queries, as search engines qua search engines.
> In a sense traditional search is just a really, really dumb LLM. They both ingest a bunch of text and build an "index" based on the text.
Not really. The significant difference is that a search engine is backed by a web crawler that runs continuously, discovering new content, whereas the LLM has a fixed training set that will only be updated infrequently (retraining is very slow and expensive). Also, once a new web page, or a new version of a web page, has been indexed by the crawler, you should be able to find it if your search terms match, while with an LLM all bets are off as to whether you can coax it into generating something based on a given item in the training set.
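To make the "once it's indexed, you can find it" point concrete, here is a toy sketch of an inverted index in Python. It is nowhere near how a real search engine works (no ranking, no stemming, no crawler), and the `TinyIndex` class and the example.com URLs are made up purely for illustration. The point is only that adding or re-crawling a page is a cheap incremental update, and a matching query finds it deterministically right afterwards.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index (hypothetical): term -> set of URLs containing it."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> URLs
        self.pages = {}                   # URL -> terms currently indexed for it

    def add_page(self, url, text):
        # Re-indexing a URL first drops its old terms, like a re-crawl would.
        for term in self.pages.get(url, set()):
            self.postings[term].discard(url)
        terms = set(text.lower().split())
        self.pages[url] = terms
        for term in terms:
            self.postings[term].add(url)

    def search(self, query):
        # AND-match: return the URLs that contain every query term.
        terms = query.lower().split()
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


index = TinyIndex()
index.add_page("https://example.com/a", "crawlers discover new content")
print(index.search("new content"))

# A page "crawled" a moment ago is searchable immediately, no retraining step.
index.add_page("https://example.com/b", "brand new page about llm search")
print(index.search("llm search"))
```

With an LLM there is no equivalent of `add_page`: new text only gets in through another training run (or by bolting retrieval on the side, which is the Bing/perplexity.ai approach mentioned upthread).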
Totally agree getting fast updates of new content is a problem that still needs to be solved. In terms of finding matches, I think LLMs are already better than Google, for content that is there.
(A super-traditional search that's just looking for raw text, like the HN search, could claim an advantage there, but something like Google or Bing is already far far off in the land of guessing intention rather than raw text matching, in my experience; in any case, 99% of the time raw-text matching is not what you want, anyway.)
Maybe time will prove me wrong, but my prediction is LLMs will improve on their weak points and traditional search will mostly fade away in relevance.