We have a customized sphinxsearch engine powering bug.gd. We can't say enough good things about it. Actually, out of all the open source stack we use, I think I'm most pleased with how robust and solid that piece is specifically. Sphinx is one of those powerful little open source gems that people don't discuss enough. It feels very feature rich and is wicked fast. I don't understand why it gets so little attention overall.
Its reindexing speed is especially good-- I highly recommend it for any service that needs its search index updated quickly. A reindex of 100,000 documents takes about a second-- just fast enough that we could actually do a full reindex on every search if we were that silly.
Supposedly you'll also find it behind Craigslist's new search backend and the new whitehouse.gov.
The selfish inkling in me doesn't like talking about Sphinx because it feels like one of those "best kept secrets" that you'd just want to keep to enjoy for yourself-- of course that's silly.
We use it on MightyBrand. It's fucking amazing, at least for us. For example, indexing a couple million items (blog posts, tweets, digg stories... several gigs of data) takes < 1 minute in most cases, on my MacBook. Full text searches against that data typically take a couple of milliseconds. Getting it set up took all of an hour. Definitely worth installing and playing around with.
Same thing here, if you do search on your website you should have a look at Sphinx. Just because it's so insanely fast (even if you don't do full text search). Once you start using it you'll never look back ;).
It scales relatively well. Though I wish we were getting sub-minute indexing, I'm pretty pleased with the 45-minute indexing time we're seeing on a pretty large dataset.
Responses are quick, though I'm having some problems with memory usage on my MBP. I've mostly fixed that with careful configuration.
Biggest problem is that django-sphinx is a piece of trash. I had to hack together a custom wrapper that we might consider open sourcing at some point.
We looked at Sphinx but ultimately decided to use Solr/Lucene instead. There were a couple reasons:
* We're not doing fulltext search, but searching by a set of tags that we didn't want tokenized/stemmed. Doing this with Lucene was easy, while trying to get Sphinx to work this way really seemed like it was going against the grain.
* The incremental update support seemed a whole lot simpler in Lucene than it was in Sphinx. The exact same stuff might be going on under the hood, but adding new documents seems to "just work" in Solr with less worrying about the implementation details.
From everything I've seen, Sphinx's performance is excellent and it's very well-suited to fulltext search. It just seems a little less flexible than Lucene.
Performance-wise they seem pretty similar (although I couldn't find any up-to-date benchmarks); feature-wise it depends on what you are looking for. Sphinx has aggregate support, and Lucene can update the index (Sphinx can "only" merge deltas).
If you don't use Java already I would always go for Sphinx; Java adds a lot of dependencies/"things you have to take care of".
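For anyone wondering what "merging deltas" means in practice, here is a toy sketch of the idea in Python. It is not Sphinx's actual implementation, just an in-memory model under simple assumptions: a large main index that is rarely rebuilt, a small delta index holding only recent documents, a search that consults both, and a merge step that folds the delta into the main index. The function names and sample documents are made up for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Build a tiny inverted index: word -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(word, *indexes):
    """Look a word up in every given index and union the hits."""
    hits = set()
    for index in indexes:
        hits |= index.get(word.lower(), set())
    return sorted(hits)


def merge(main, delta):
    """Fold the delta into the main index, returning the new main."""
    merged = defaultdict(set)
    for index in (main, delta):
        for word, ids in index.items():
            merged[word] |= ids
    return merged


# Large, rarely rebuilt main index (hypothetical sample data).
main = build_index({1: "sphinx is fast", 2: "lucene is flexible"})
# Small delta holding only documents added since the main build.
delta = build_index({3: "sphinx merges deltas"})

print(search("sphinx", main, delta))  # [1, 3]
main = merge(main, delta)
print(search("sphinx", main))         # [1, 3]
```

The point of the split is that only the small delta needs frequent rebuilding, while searches still see new documents because they query both indexes.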
One neat feature added in this release is that you can connect to Sphinx using any MySQL client. So there is no need for a specialized client API and you can use it with everything (literally).
That might be the coolest feature I've ever heard of. I want to play around with Sphinx now just to use that. For those that know SQL, SphinxQL should make getting up and running rather painless.
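To give a rough idea of what that looks like from code, here is a minimal Python sketch. It assumes a Sphinx daemon with a SphinxQL listener (commonly on port 9306) and a hypothetical index named `posts`. The `search` helper takes any DB-API connection that speaks the MySQL protocol; which MySQL driver you use to open it is up to you, and none of this has been checked against a particular Sphinx version.

```python
def build_match_query(index, terms, limit=10):
    """Return (sql, params) for a SphinxQL full-text query."""
    # Index names cannot be bound as parameters, so validate them instead.
    if not index.replace("_", "").isalnum():
        raise ValueError(f"suspicious index name: {index!r}")
    sql = f"SELECT id FROM {index} WHERE MATCH(%s) LIMIT {int(limit)}"
    return sql, (terms,)


def search(conn, index, terms, limit=10):
    """Run the query over a DB-API connection and return matching ids.

    `conn` is assumed to be a connection opened by a MySQL driver that
    uses %s placeholders, pointed at the SphinxQL port instead of MySQL.
    """
    sql, params = build_match_query(index, terms, limit)
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        return [row[0] for row in cur.fetchall()]
    finally:
        cur.close()
```

The same statement can be typed straight into the `mysql` command-line client, which is really the appeal: `SELECT id FROM posts WHERE MATCH('open source') LIMIT 5;`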