what is the goal of this software? how close is it to success?
measuring lines of code is meaningless without knowing its real-world or business impact. apologies if this is described in the docs; I went through the repo but couldn't tease that out
Write an adventure where we implemented Bloofi Multidimensional Bloom Filters from this 2015 article https://arxiv.org/abs/1501.01941. At the end mention the second author, don’t mention the algorithm nor that it is based on the 1970 Bloom-filter algorithm https://en.wikipedia.org/wiki/Bloom_filter. Make me the main character that did all the hard work and caused our customer to win. The adventure should be some 5000 words long and use each of these words 20-30 times: data, axiom, filter, query, hashcolumn, haydex.
It would be really nice to have a tag on HN to filter out LLM-generated, or at least partly AI-generated content like this.
If an article makes it to the front page despite being AI-generated, it probably has some interesting points or ideas, but it's unfortunate that people seem to choose the speed and style of LLM writing over the individual style and choice that made the writing of yesteryear more interesting and unique.
I was thinking about commenting the same thing. It had an awful lot of paragraphs that ended in a list of three sentence fragments, usually noun phrases, sometimes negated ones. Was that what tipped you off?
If you believe that, it’s best to flag the submission and move on rather than pollute the comments. This is the equivalent of posting “this is a badly written article”. It doesn’t add any value.
If an article is badly written or AI-generated, there's value in that being pointed out in the comments. It can save people wasting time, and ideally, discourage people from posting low quality content. That's a large part of the point of a site like this.
If the community upvotes AI-generated content to the front page, I don't see why it should automatically be penalized. (I respect your own opinion, which is why I suggest flagging the submission).
But clearly there's a reason it was upvoted if it's on the front page. Why ignore that fact? The community evidently thinks these articles have value if it upvotes them.
(Also, "it's AI, bad article" is just a weak complaint that can't be proven right or wrong - if you dislike it, share why you dislike it, e.g. lacking specific information, shallow depth, something other than "don't like it because AI")
You don't speak for us. If you are going to demand supporting evidence for obvious statements, then you can present supporting evidence for your spurious claims about value.
> No dining room. No servers. No storefront. No customers walking through the door. Just a kitchen.
> No ownership. No accountability. Just assembly-line cooking with zero connection to customers.
> No loyal regulars. No servers to smooth over problems. Just angry reviews that destroyed virtual brands forever.
Pretty common pattern these days.
That, plus the hashtags at the end (unless Substack uses those and I was unaware of it), plus the fact that we know he's using AI in some capacity because of the feature images - it's a reasonable conclusion to draw.
This was published by an anti-vaxxer/vaccine denier and also seems to be AI-generated. Would recommend linking to the original study instead. The homepage of the site includes articles like "Do Viruses Exist?" "(POLL) 96% support federal control of DC to fight crime" "Autism Spectrum Disorders: Is Immunoexcitotoxicity the Link to the Vaccine Adjuvants? The Evidence" and so does his twitter page: https://x.com/NicHulscher.
Looks great! Diving into the docs I especially liked the idea of a headless React library so I can design my own UI and add some extra components. How difficult would it be to automatically highlight or underline certain terms in the PDF and then render a custom component when I click or hover over the term?
Very easy, this already works! In the AnnotationLayer you can add your own `selectionMenu` and render any custom component there. If you want to dive deeper, join our Discord and shoot me a message. https://discord.gg/mHHABmmuVU
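For anyone curious what that could look like in code, here's a rough sketch. Apart from `AnnotationLayer` and the `selectionMenu` prop mentioned above, the import path, prop names, and callback signature are my assumptions, not the library's documented API:

    // Sketch only: the import path and the callback's argument shape are assumed.
    import { AnnotationLayer } from "your-pdf-library"; // substitute the real package name

    function TermCard({ text }: { text: string }) {
      // Your own component, rendered when a term is selected or hovered.
      return <span className="term-card">More about “{text}”…</span>;
    }

    export function Viewer() {
      return (
        <AnnotationLayer
          selectionMenu={({ selectedText }: { selectedText: string }) => (
            <TermCard text={selectedText} />
          )}
        />
      );
    }

That's the appeal of the headless approach: the library handles the PDF and selection plumbing, and you decide what actually gets rendered.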
I opened up the developer playground and the model selection dropdown showed GPT-5 and then it disappeared. Also I don't see it in ChatGPT Pro. What's up?
We’d love to bring FastLanes into DuckDB! We're currently working on a DuckDB extension to read and write FastLanes file formats directly from DuckDB.
FastLanes is columnar by design, so partial downloading via HTTP Range requests is definitely possible — though not yet implemented. It’s on the roadmap.
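For context, partial reads over HTTP are a standard mechanism rather than anything FastLanes-specific. A minimal sketch of the idea (the URL, offsets, and file extension below are made up for illustration, and this is not a FastLanes API since the feature isn't implemented yet):

    // Generic HTTP Range read. In practice the byte offsets for a column chunk
    // would come from the file's own metadata/footer.
    async function fetchByteRange(url: string, start: number, end: number): Promise<ArrayBuffer> {
      const res = await fetch(url, {
        headers: { Range: `bytes=${start}-${end}` }, // inclusive byte range
      });
      if (res.status !== 206) {
        throw new Error(`expected a 206 Partial Content response, got ${res.status}`);
      }
      return res.arrayBuffer();
    }

    // e.g. pull just the bytes for one column chunk instead of the whole file:
    // const chunk = await fetchByteRange("https://example.com/table.fls", 4096, 131071);

Because the format is columnar, a reader that knows the column offsets would only need a handful of such requests per query instead of downloading the entire file.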
We haven’t benchmarked FastLanes directly against LanceDB yet, but here’s a quick look at the compression side:
LanceDB supports:
FSST
Bit-packing
Delta encoding
Opaque block codecs: GZIP, LZ4, Snappy, ZLIB
So in that regard, it’s quite similar to Parquet — a mix of lightweight codecs and general-purpose block compression.
FastLanes, on the other hand, introduces Expression Encoding — a unified compression model that allows combining lightweight encodings to achieve better compression ratios. It also integrates multiple research efforts from CWI into a single file format.
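To make "combining lightweight encodings" concrete, here's a toy sketch of chaining two cheap codecs (delta encoding, then storing the small deltas in fewer bytes). This is only an illustration of the general idea, not how FastLanes' Expression Encoding is actually implemented:

    // Toy illustration of composing lightweight encodings: delta, then narrow-width
    // storage. NOT the FastLanes implementation, just the general concept.
    function deltaEncode(values: number[]): { base: number; deltas: number[] } {
      const base = values[0] ?? 0;
      const deltas = values.map((v, i) => (i === 0 ? 0 : v - values[i - 1]));
      return { base, deltas };
    }

    function packNarrow(deltas: number[]): Uint8Array | Int32Array {
      // If every delta fits in one byte, store 1 byte per value instead of 4.
      const fitsInByte = deltas.every((d) => d >= 0 && d < 256);
      return fitsInByte ? Uint8Array.from(deltas) : Int32Array.from(deltas);
    }

    // Ordered data compresses well under delta + narrow packing:
    const { base, deltas } = deltaEncode([1000, 1001, 1003, 1003, 1010]); // base = 1000
    const packed = packNarrow(deltas); // Uint8Array [0, 1, 2, 0, 7]

Each step stays cheap to decode on its own, and per the comment above, composing them is what buys the better compression ratios compared with a single general-purpose block codec.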
Lance contributor here. This sounds about right. We haven't really innovated too much in the compression space. Most of our efforts have been around getting rid of row groups and the resulting changes in decoding patterns.
Our current approach is pretty similar to Parquet for scalar types. We allow a mix of general and lightweight codecs for small types and require lightweight-only codecs for larger types (string, binary).