I think there's a very important nugget here unrelated to agents: Kagi as a search engine is a higher-signal source of information than Google's PageRank and AdSense-funded model. Primarily because Google as it is today includes a massive amount of noise and has suffered from blowback/cross-contamination as more LLM-generated content pollutes information truth.
> We found many, many examples of benchmark tasks where the same model using Kagi Search as a backend outperformed other search engines, simply because Kagi Search either returned the relevant Wikipedia page higher, or because the other results were not polluting the model’s context window with more irrelevant data.
> This benchmark unwittingly showed us that Kagi Search is a better backend for LLM-based search than Google/Bing because we filter out the noise that confuses other models.
> Maybe if Google hears this they will finally lift a finger towards removing garbage from search results.
It's likely they can filter the results for their own agents, but will leave other results as they are. Half the issue with normal results is their ads, and those aren't going away.
> Maybe if Google hears this they will finally lift a finger towards removing garbage from search results.
Unlikely. There are very few people willing to pay for Kagi. The HN audience is not at all representative of the overall population.
Google can have really miserable search results and people will still use it. It's not enough to be as good as Google; you have to be 30% better than Google and still free in order to convert users.
I use Kagi and it's one of the few services I am OK with a recurring charge from because I trust the brand for whatever reason. Until they find a way to make it free, though, it can't replace Google.
They are transparent about the growth of their paying customers. Do you feel that this fairly consistent, linear rate of growth will never be enough to be meaningful?
> Primarily because Google as it is today includes a massive amount of noise and has suffered from blowback/cross-contamination as more LLM-generated content pollutes information truth.
I'm not convinced about this. If the strategy is "let's return wikipedia.org as the most relevant result", that's not sophisticated at all. In fact, it only works for a very narrow subset of queries. If I search for 'top luggages for solo travel', I don't want to see Wikipedia, and I don't know how Kagi will be any better.
They wrote "returned the relevant Wikipedia page higher" and not "wikipedia.org as the most relevant result" - that's an important distinction. There are many irrelevant Wikipedia pages.
Generally we do particularly better on product research queries [1] than on other categories, because most poor review sites are full of trackers and other stuff we downrank.
However, there aren't public benchmarks for us to brag about on product search, and frankly the SimpleQA digression in this post made it long enough that it was almost cut.
1. (Except hyperlocal search like local restaurants)
> We found many, many examples of benchmark tasks where the same model using Kagi Search as a backend outperformed other search engines, simply because Kagi Search either returned the relevant Wikipedia page higher, or because the other results were not polluting the model’s context window with more irrelevant data.
> This benchmark unwittingly showed us that Kagi Search is a better backend for LLM-based search than Google/Bing because we filter out the noise that confuses other models.