jamesgresql · 2026-06-09T00:06:09 1780963569

ParadeDB built a benchmark runner to make experimentation easy. This post looks at a real example over three iterations (with wildly different results).

jamesgresql · 2026-05-12T23:07:55 1778627275

We (ParadeDB) recently started building out our benchmarking infrastructure for cross-backend comparisons.

Rather than taking the usual path of bundling a workload and execution into one neat package, we decided to build a reusable database benchmark runner based on grafana/k6 first.

jamesgresql · 2026-04-11T22:21:47 1775946107

This is great, no more lost terminal screens!

olleeolleeollee · 2026-04-12T03:17:52 1775963872

Thanks! Let me know if there’s anything I can do to make it more useful. I thought in the next update I’d work on custom key bindings but if there’s anything more glaring, I’d love to hear it.

jamesgresql · 2026-04-07T11:43:28 1775562208

"If I had looked at the lexical search and BM25 space in 2016, I would have said it was solved, and that catching up would be nearly impossible."

This interview with Tantivy creator Paul Masurel looks at how wrong I would have been; discussing challenging solved domains, open-source competition done right, and why long-fermented frustration is an underrated driver.

jamesgresql · 2026-02-26T02:47:52 1772074072

I have no affiliation with this product other than being a happy user, but man is it good for finding out exactly why and when your wifi is slow.

The best feature for me was being able to detect intermittent jitter to my gateway. I never managed to catch this with speed tests alone.

johng · 2026-02-26T03:07:08 1772075228

Ahhh no demo or trial. I want to support animals and dogs but I don't want to shell out $10 without giving it a try first.

jamesgresql · 2026-02-26T19:41:50 1772134910

Yeah I get that, I threw caution to the wind and did it (which I would normally never do).

jamesgresql · 2026-02-17T17:09:29 1771348169

This is a no-nonsense walkthrough of doing hybrid search inside Postgres without spinning up a separate search service.

A few takeaways:

- Postgres’s native `tsvector`/`ts_rank` stuff works ok for basic text matching, but it doesn’t account for global term frequency like BM25 does, so rankings can feel “flat” or noisy as soon as you go beyond simple queries (it's also slow).
- Using a BM25 index (via extensions like `pg_search`) actually gives you relevance scores similar to what you’d expect out of modern search engines, and you can use stemmers/tokenization directly in SQL. BM25 is the star of this story.
- Vector search fills in the semantic gaps (so “database optimization” isn’t limited to exact keywords), but you still don’t want to throw out lexical relevance. The trick is making it additive, not just adding scores together.
- RRF (Reciprocal Rank Fusion) is a neat practical tool here. It sidesteps trying to normalize totally different scoring systems by just focusing on rank positions.

If you’re building anything where relevance matters (docs, product search, help articles) having BM25 + vector makes a big difference over vanilla FTS + embeddings alone. It also keeps everything in Postgres, which simplifies consistency/ops compared to an external search cluster.
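
To make the RRF point concrete, here's a minimal sketch in Python. It isn't the code from the walkthrough or anything from `pg_search`. The function name `rrf`, the document IDs, and the constant `k=60` (a commonly used default) are my own assumptions for illustration. Each document scores `1 / (k + rank)` in every list it appears in, and the scores are summed, so the raw BM25 and vector scores never need to be on the same scale.

```python
from collections import defaultdict


def rrf(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    `rankings` is a list of lists, each ordered best-first.
    `k` dampens the influence of top ranks (60 is an assumed default).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists: one from a BM25 query, one from a vector query.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = rrf([bm25_hits, vector_hits])
print(fused)
```

Documents that show up in both lists (`doc_a`, `doc_c`) float above the ones that only one retriever found, which is the "additive" behavior described above.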

jamesgresql · 2026-01-11T20:54:47 1768164887

I know it sounds obvious, but some people are pretty determined to use it that way!

jamesgresql · 2025-12-14T19:39:24 1765741164

Hey HN! Author here. We added faceted search capabilities to our `pg_search` extension for Postgres, which is built on Tantivy (Rust's answer to Lucene). This brings Elasticsearch-style faceting directly into Postgres with a 14x performance improvement over a CTE-based approach, by performing facet aggregations in a single BM25 index pass and making use of our columnar store.

You get the same faceting features you'd expect from a dedicated search engine while maintaining full ACID compliance. Happy to answer technical questions about the implementation!

PSeitz · 2025-12-15T02:51:39 1765767099

Hi, tantivy dev here. There are two recent performance improvements in tantivy, which should make term aggregations considerably faster.

https://github.com/quickwit-oss/tantivy/pull/2740
https://github.com/quickwit-oss/tantivy/pull/2759

stuhood · 2025-12-15T15:34:22 1765812862

Yes, thank you for your hard work! We rebased recently, and we'll likely talk about those improvements as part of our `0.21.x` release.

jamesgresql · 2025-12-12T18:49:19 1765565359

Haha, I like “good old tokenization”

jamesgresql · 2025-12-12T18:31:43 1765564303

Amazing, will have a read!