Not here to shill for AWS, but S3 Vectors is hands down the SOTA here. Combined with a Bedrock Knowledge Base to handle discovery/rebalance tasks, it makes for the simplest implementation on the market.
Once Bedrock KB backed by S3 Vectors is out of beta, it'll eat everybody's lunch.
S3 Vectors is great in terms of cost. But its median query latency is around 500 ms at 1M vectors, which is slower than other vector stores, and it supports neither keyword search nor sparse vectors. So I think it's better to choose a vector store based on your requirements.
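For what it's worth, here's a rough Python sketch of checking a store against your requirements. The `StoreProfile` and `Requirements` types and the `unmet` helper are made up for illustration, and the S3 Vectors numbers are just the ones quoted above, not a benchmark:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreProfile:
    """Hypothetical summary of one vector store's characteristics."""
    name: str
    median_latency_ms: float
    keyword_search: bool
    sparse_vectors: bool


@dataclass(frozen=True)
class Requirements:
    """Hypothetical description of what an application needs."""
    max_median_latency_ms: float
    needs_keyword_search: bool = False
    needs_sparse_vectors: bool = False


def unmet(store: StoreProfile, req: Requirements) -> list[str]:
    """Return the requirements this store fails; an empty list means it fits."""
    problems = []
    if store.median_latency_ms > req.max_median_latency_ms:
        problems.append("latency")
    if req.needs_keyword_search and not store.keyword_search:
        problems.append("keyword search")
    if req.needs_sparse_vectors and not store.sparse_vectors:
        problems.append("sparse vectors")
    return problems


# Figures taken from the comment above (about 500 ms median at 1M vectors,
# no keyword search, no sparse vectors); they are not measured here.
S3_VECTORS = StoreProfile("S3 Vectors", 500.0, False, False)

# Two illustrative workloads with assumed thresholds.
batch_job = Requirements(max_median_latency_ms=2000.0)
hybrid_search = Requirements(max_median_latency_ms=100.0, needs_keyword_search=True)

print(unmet(S3_VECTORS, batch_job))
print(unmet(S3_VECTORS, hybrid_search))
```

With those assumed thresholds, the batch job has nothing unmet, while the hybrid search case gets flagged on latency and keyword search.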
Assuming that's what he meant, why would it be considered baseline versus anything else? I am genuinely curious because I'd like to know more about issues people face with this or that vector store in general.
People focus on the wrong issue, so most quotes about evolution are highly misleading: the keyword should be reproduction. Survival is almost irrelevant. Darwin awards in particular should never be given to anyone with kids (unless they kill their kids too).
"Most grandkids" is good but not catchy.
Or Idiocracy: "evolution began to favor those who reproduced the most".
I agree to some extent, but I don't think you can really separate the two. You have to survive long enough to reproduce enough. For almost all species, reproduction implies a non-trivial amount of survival.
Edit: actually, "almost all species" is not right. Maybe "almost all interesting species"... which is admittedly too subjective a take.
That's exactly what it is - formalizing and creating a standard induces efficiency. Along with things like AGENTS.md, it's all about standardization.
What bugs me: if we're optimizing for LLM efficiency, we should use structured schemas like JSON. I understand the thinking that Markdown is a happy medium between human and machine readability, but Markdown has no enforced schema, so extracting fields from it programmatically is ambiguous. Highly structured data would be more reliable for programmatic consumption while still being readable.
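To make that concrete, here's a minimal Python sketch of what "programmatic consumption" looks like with JSON. The field names (`project`, `build_command`, `test_command`) and the `load_agent_config` helper are invented for the example; they aren't part of AGENTS.md or any real spec:

```python
import json

# Hypothetical required fields for an agent-instructions file; not a real standard.
REQUIRED_FIELDS = {"project": str, "build_command": str, "test_command": str}


def load_agent_config(text: str) -> dict:
    """Parse JSON text and check that the hypothetical required fields are present."""
    data = json.loads(text)  # raises json.JSONDecodeError (a ValueError) on bad syntax
    if not isinstance(data, dict):
        raise ValueError("top-level value must be a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"field {field!r} must be {expected_type.__name__}")
    return data


example = '{"project": "demo", "build_command": "make", "test_command": "make test"}'
config = load_agent_config(example)
print(config["test_command"])
```

The point is that a malformed or incomplete file fails loudly at load time, whereas with free-form Markdown the consumer has to guess which heading or bullet holds the test command.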