I'm curious, in your benchmark, what's the difference between BM25+Embedding and...

s-macke · on June 4, 2024

BM25+Embedding and Embedding+BM25 are exactly the same. The two entries just show that the combination is commutative: it doesn't matter whether you start from keyword search or semantic search.
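To illustrate why the order shouldn't matter, here is a minimal sketch. It assumes the two result lists are merged with reciprocal rank fusion (RRF), which is only an assumption for illustration and not necessarily how the benchmark combines them. The function name `rrf`, the constant `k=60`, and the document ids are all made up for the example.

```python
from collections import defaultdict


def rrf(rankings, k=60):
    """Fuse ranked lists of document ids with reciprocal rank fusion.

    Assumption: RRF stands in for whatever fusion the benchmark uses.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it contains.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by id for a deterministic result.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical top results from each retriever for the same query.
bm25_ranking = ["a", "b", "c", "d"]
embedding_ranking = ["c", "a", "e", "b"]

bm25_then_embedding = rrf([bm25_ranking, embedding_ranking])
embedding_then_bm25 = rrf([embedding_ranking, bm25_ranking])

print(bm25_then_embedding)
print(embedding_then_bm25)
```

Because the fused score is a plain sum over both lists, swapping the input order gives the same final ranking. A pipeline where the first retriever filters candidates and the second only reranks them would not be symmetric in general.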

For my tests, I used Ada-002 for the embeddings. The data was short news articles, with no chunking and no preprocessing. The query is embedded directly and compared against the article embeddings.

Of course, both approaches can be improved. The benchmark is only meant to illustrate what you might expect from hybrid search.