I was wondering about this. I was hesitant to add embedding-based search to my app because I didn't want every search to block on a round trip to the embedding API provider before the initial render. Granted, you can cache the embeddings for common searches. OTOH, I also don't want to render something without them, compute the embedding async, and then have to update the results list once it arrives. That seems hard to do sensibly from a UX perspective.
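For what it's worth, here's a minimal sketch of the cache-the-common-searches idea. The provider client, model name, and SQLite cache are just illustrative assumptions, not how any particular app does it:

    # Rough sketch of "cache embeddings for common searches".
    # The OpenAI client and model name below are placeholders for whatever
    # embedding provider you actually use.
    import hashlib, json, sqlite3
    from openai import OpenAI

    client = OpenAI()
    db = sqlite3.connect("embedding_cache.db")
    db.execute("CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, vec TEXT)")

    def embed_query(query: str) -> list[float]:
        key = hashlib.sha256(query.strip().lower().encode()).hexdigest()
        row = db.execute("SELECT vec FROM cache WHERE key = ?", (key,)).fetchone()
        if row:
            return json.loads(row[0])  # cache hit: no network call blocking the render
        resp = client.embeddings.create(model="text-embedding-3-small", input=query)
        vec = resp.data[0].embedding
        db.execute("INSERT OR REPLACE INTO cache VALUES (?, ?)", (key, json.dumps(vec)))
        db.commit()
        return vec

Common queries hit the cache and render instantly; only novel queries pay the provider round trip.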

To do this locally you need access to the model, right? I just wonder how good those embeddings will be compared to the ones from OpenAI/Google/etc. for semantic search. I do like the free/instant aspect, though.



Check out MTEB (https://huggingface.co/spaces/mteb/leaderboard); many of the open-source ones are actually _better_.

I've had particularly good experiences with nomic, bge, gte, and all-MiniLM-L6-v2. All are hundreds of MB (except all-MiniLM, which is around 87MB).


I love all-MiniLM-L6-v2: 87MB is tiny enough that you can just load it into RAM in a web application process on a small VM. From my experiments with it, the results are Good Enough for a lot of purposes. https://simonwillison.net/2023/Sep/4/llm-embeddings/#embeddi...
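A rough sketch of that setup with sentence-transformers; the corpus strings are made up, and the model id is the usual Hugging Face alias:

    # Minimal sketch: load all-MiniLM-L6-v2 once at process start, embed in-process.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")  # ~87MB, fits in RAM on a small VM

    corpus = ["how to reset a password", "billing and invoices", "export data as CSV"]
    corpus_emb = model.encode(corpus, normalize_embeddings=True)

    query_emb = model.encode("I forgot my login", normalize_embeddings=True)
    scores = util.cos_sim(query_emb, corpus_emb)[0]
    print(sorted(zip(corpus, scores.tolist()), key=lambda pair: -pair[1]))

With normalize_embeddings=True, cosine similarity reduces to a dot product, so you can precompute the corpus matrix once and keep it in memory alongside the model.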


87MB is still quite big, though. Think of all the comments here on HN where people were appalled at a certain site loading 10-50 MB of images. Hopefully browser vendors will figure out a secure way to download a model once and re-use that single model on any website that requests it, rather than potentially downloading a separate instance of all-MiniLM-L6-v2 for each site. I know Chrome has an AI initiative, but I didn't see any docs addressing this particular problem: https://developer.chrome.com/docs/ai


It's crazy, because Chrome already ships an embedding model; it's just not accessible to users or developers (AFAIK).

https://dejan.ai/blog/chromes-new-embedding-model/


Personally, I hate it because it has a very short context length and *silently* crops the input after roughly a tweet's worth of text. I've been on a crusade about this on GitHub and nobody seems to be aware of it.

My go-to right now is on Ollama: snowflake-arctic-embed2
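If anyone wants to check whether a given embedding backend silently crops long input, a quick sanity test looks something like the sketch below. It's shown against a local Ollama endpoint since that's the setup here; the endpoint and field names follow Ollama's REST API docs, so adjust for your own backend:

    # Embed a long text and just its head, then compare. A cosine similarity
    # of ~1.0 suggests the tail was silently dropped by the model.
    import math
    import requests

    def embed(text: str) -> list[float]:
        resp = requests.post(
            "http://localhost:11434/api/embed",
            json={"model": "snowflake-arctic-embed2", "input": text},
        )
        resp.raise_for_status()
        return resp.json()["embeddings"][0]

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
        return dot / norm

    head = "The quick brown fox jumps over the lazy dog. " * 20      # ~tweet-scale-ish
    tail = "Completely unrelated text about quantum chromodynamics. " * 300
    print(cosine(embed(head + tail), embed(head)))  # ~1.0 => the tail was ignored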



