The SQL standard defines more of an aesthetic than an actual language. Every database just extends it arbitrarily and anything beyond rudimentary queries is borderline guaranteed to be incompatible with other databases.
When it comes to procedural logic in particular… you have almost zero chance of dropping that into another database and having it work — even for rudimentary usage.
SQL-land is utterly revolting if you have any belief in standards being important. Voting for Oracle (which itself began as a shallow copy of IBM's SQL dialect and has deviated arbitrarily since) as the thing to call “standard” is just offensive.
From-before-select has nothing to do with composition as far as I can think of? That’s to solve the auto-complete issue — put the table first so the column list can be filtered.
Things like allowing repeated clauses, computing SELECT before WHERE, etc. are what solve the composition issues.
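To make the FROM-first point above concrete, here is a minimal sketch using DuckDB (just one engine that happens to support the syntax; the table and data are made up for illustration):

```python
import duckdb

con = duckdb.connect()
con.execute("CREATE TABLE orders (id INTEGER, amount INTEGER)")
con.execute("INSERT INTO orders VALUES (1, 50), (2, 200)")

# Conventional order: the column list is written before the engine (or the
# editor's autocomplete) knows which table it applies to.
print(con.sql("SELECT amount FROM orders").fetchall())

# FROM-first: the table is named before the columns, so autocomplete can
# filter the column list down to that table.
print(con.sql("FROM orders SELECT amount").fetchall())
```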
That page has a reasonable re-creation, with trivial usage at call-sites, of each missing feature though? The only one that looks a bit revolting is the large pipe example.
I’m not clear on how you’re deviating from a normal columnar/OLAP database?
> I found that these columnar stores could also be used to create regular relational database tables.
Doesn’t every columnar store do this? Redshift, IQ, Snowflake, ClickHouse, DuckDB, etc.
> but it proves that it is possible to structure relational data such that query speeds can be optimal without needing separate indexing structures that have to be maintained.
Doesn’t every columnar database already prove this?
I am not an expert on all the other columnar stores out there, but it is my understanding that they are used almost exclusively for OLAP workloads. By 'regular database tables', I meant those that handle transaction processing (inserts, updates, deletes) along with queries.
My system does analytics well, but it is also very fast with changing data.
I also think that some of those systems (e.g. DuckDB) use indexes.
They’re used for OLAP workloads because the columnar properties fit better — namely, storing data column-wise obviously makes row-wise operations more expensive and column-wise operations cheaper, which usually corresponds to point look-ups vs aggregations. That cascades into things like constraint maintenance being more expensive, row-level triggers becoming a psychotic pattern, etc. Column-wise (de-)compression also doubles down on this.
They still do all the regular CRUD operations and maintain transactional semantics; they just naturally prefer bulk operations.
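A toy sketch of that layout difference (the table shape and data here are made up):

```python
# Source data, as a row-wise store would hold it: one record per entry.
rows = [{"id": i, "price": i % 100, "qty": i % 7} for i in range(100_000)]

# Column-wise: one contiguous array per column.
cols = {
    "id":    [r["id"] for r in rows],
    "price": [r["price"] for r in rows],
    "qty":   [r["qty"] for r in rows],
}

# An aggregation touches exactly one array, which is why OLAP loves this layout.
total_price = sum(cols["price"])

# A point lookup has to visit every column array to reassemble one row,
# which is why row-wise stores win on that access pattern.
row_42 = {name: values[42] for name, values in cols.items()}

print(total_price, row_42)
```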
Redshift is the purest take on this I’ve seen, to the point that they simply don’t support most constraints or triggers, and data is allocated in 1MB immutable chunks such that non-bulk operations undergo ridiculous amounts of write amplification and slow to a crawl. Afaik other OLAP databases are not this extreme, and support reasonable throughput on point operations (and triggers, constraints, etc) — in the sense that it’s definitely slower, but not comically slower. (Aside: Aurora is also a pure take on transactional workloads, such that bulk aggregations are comically slow.)
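Back-of-envelope on the immutable-chunk point (the row size is an assumption I picked for illustration):

```python
# Rewriting an immutable 1MB block to change a single row.
BLOCK_BYTES = 1 * 1024 * 1024   # immutable block size
ROW_BYTES = 100                 # assumed row size, for illustration only

# A point update can't modify the block in place, so the whole block is
# rewritten (per column touched), even though only ~100 bytes changed.
amplification = BLOCK_BYTES / ROW_BYTES
print(f"~{amplification:,.0f}x write amplification per single-row update")
```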
> I also think that some of those systems (e.g. DuckDB) use indexes.
I’m pretty sure they all use indexes, in the same fashion I expect you to (I’m guessing your system doesn’t do table scans for every single query). Columnar databases just get indexes like zone maps for “free”, in the sense that they can simply be applied on top of the actual dataset without having to maintain a separate copy of the data, à la what row-wise databases do. So it’s an implicit index automatically generated on every column — not user-maintained or specified. I expect your system does exactly the same (because it would be unreasonable not to).
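Roughly what an implicit zone map looks like, as a toy sketch (all names are made up; a real engine keeps the min/max as block metadata rather than alongside the data like this):

```python
BLOCK_SIZE = 4

def make_zone_map(values, block_size=BLOCK_SIZE):
    # Split a column into blocks and record each block's min/max.
    blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
    return [(min(b), max(b), b) for b in blocks]

def scan(zone_map, lo, hi):
    # Only blocks whose [min, max] range overlaps the predicate are read.
    for block_min, block_max, block in zone_map:
        if block_max < lo or block_min > hi:
            continue  # pruned without touching the data
        yield from (v for v in block if lo <= v <= hi)

zm = make_zone_map([1, 2, 3, 4, 50, 51, 52, 53, 9, 9, 9, 9])
print(list(scan(zm, 50, 52)))  # reads one of the three blocks
```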
> My system does analytics well, but it is also very fast with changing data.
Talk more, please & thank you. I expect everything above to be inherent properties/outcomes of the data layout, so I’m quite curious what you’ve done.
My project Didgets (short for Data Widgets) started out as a file system replacement. I wanted to create an object store that would store traditional file data, but also make file searches much faster and more powerful than other file systems allow, especially on systems with hundreds of millions of files. To enhance this, I wanted to be able to attach contextual tags to each Didget that would make searches much more meaningful without needing to analyze file content during the search.
To facilitate the file operations, I needed data structures to support them. I decided that these data structures (used/free bitmaps, file records, tags, etc.) should be stored and managed within other Didgets that had special handling. Each tag was basically a key-value pair that mapped the Didget ID (key) to a string, number, or other data type (value).
Rather than rely on some external process like Redis to handle tags, I decided to build my own. Each defined tag has a data type, and all values for that tag are stored together (like column values in a columnar store). I split the tag handling into two distinct pieces. All the values are deduplicated, reference counted, and stored within a 'Values Didget'. The keys (along with pointers to the values) are stored within a 'Links Didget'.
This makes analytic functions fast (each unique value is stored only once) and allows for various mapping strategies (one-to-one, one-to-many, many-to-one, or many-to-many). The values and the links are stored within individual blocks that are arranged using hashes and other meta-data constraints. For any given query, usually only a small number of blocks need to be inspected.
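If I'm reading the description right, the split looks something like this toy sketch (this is purely my guess at the shape from the comment above, not the actual Didgets implementation; all the names are mine):

```python
class ValuesStore:
    """Deduplicated, reference-counted values (the 'Values Didget' role)."""
    def __init__(self):
        self.by_value = {}    # value -> value_id
        self.values = []      # value_id -> value (each unique value stored once)
        self.refcounts = []   # value_id -> reference count

    def intern(self, value):
        vid = self.by_value.get(value)
        if vid is None:
            vid = len(self.values)
            self.by_value[value] = vid
            self.values.append(value)
            self.refcounts.append(0)
        self.refcounts[vid] += 1
        return vid

class LinksStore:
    """Key -> value-id pointers (the 'Links Didget' role)."""
    def __init__(self, values_store):
        self.values_store = values_store
        self.links = {}       # didget_id -> value_id

    def tag(self, didget_id, value):
        self.links[didget_id] = self.values_store.intern(value)

    def lookup(self, didget_id):
        return self.values_store.values[self.links[didget_id]]

vals = ValuesStore()
links = LinksStore(vals)
links.tag(101, "invoice")
links.tag(102, "invoice")                 # value stored once, refcount bumped
print(links.lookup(102), vals.refcounts)  # invoice [2]
```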
I expected analytic operations to be very fast, like with other OLAP systems; but I was pleasantly surprised at how fast I could make traditional OLTP operations run on it.
I have some short demo videos that show not only what it can do, but also benchmark many operations against other databases. Links to the videos are in my user profile.
And if you want to go even cheaper, check out Hetzner's EX63 (via the custom configurator): 4x 7.68TB drives for like 140 Euro.
That's not even counting the fact that the Netcup storage is already RAIDed (also, Netcup is limited to 8TB on a VPS).
That works out to about 4.6 Euro/TB raw, roughly 5 USD/TB, or about 6.1 Euro/TB in a RAID 5 setup.
I do not understand why they are not using this new pricing model on their older servers. There, the best you can get is around 10 Euro/TB (for the single 15TB U.2).
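Quick check on that math (the EUR→USD rate is an assumption):

```python
# Per-TB math for 4x 7.68TB at 140 Euro; the 1.08 EUR->USD rate is assumed.
price_eur = 140
raw_tb = 4 * 7.68        # 30.72 TB raw
raid5_tb = 3 * 7.68      # 23.04 TB usable with one drive's worth of parity

print(round(price_eur / raw_tb, 2))          # ~4.56 EUR/TB raw
print(round(price_eur / raw_tb * 1.08, 2))   # ~4.92 USD/TB
print(round(price_eur / raid5_tb, 2))        # ~6.08 EUR/TB in RAID 5
```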
I always just used it to confirm your last action in a POST → GET sequence. E.g. confirming that your save went through or was rejected (the error itself embedded & persisted in the actual page). Or especially if saving doesn’t trigger a refresh, so success would otherwise be silent (and thus indistinguishable from failing to click).
You could have the button do some fancy transformation into a save button but I prefer the core page being relatively static (and I really don’t like buttons having state).
It’s the only reasonable scenario for toasts that I can think of though.
It’s a tax. You could describe any beneficiaries of a tax in the same manner; we’re paying taxes to at least partially cover group X - homeless, scientists, military, retirees, veterans, etc.
There’s no debt being paid; money is simply taken from Peter, and money is simply given to Paul.
It’s not a retirement program, it’s retirement subsidization.
I don't think I'm willing to grant you Social Security as a proper "tax" or "subsidy" unless you're going to pitch me that Social Security is really, in essence, an incentive program for unrestrained natalism to keep population above replacement with all the Manifest Destiny/imperialistic implications and aspirations that come with it, and further, a commitment by the people who started it to never under any circumstances inform descendants of its true nature.
If you are willing to concede the above, I'll reclassify it as a proper "subsidy" insomuch as it was a law that was passed, and it is a clear act by the government to incentivize activity "X". At which point my discussion will quickly turn to "Holy shit, why are we still trying to empire build in the year of our Lord 2025? Shouldn't we have changed this by now?"
If not... Still seeing it as a Ponzi. A fundamentally degenerate and unstable financial model, intended only to benefit the people who have been in it the longest solely for the purpose of self-enrichment. Well branded, mind; who doesn't want Social Security? But a Ponzi in essence nevertheless.
- almost every search field (when an end user modifies the search and searches a second time instead of clicking one of the results, that should be a clear signal that something is off.)
- almost every chat bot (Don't even get me started about the Fisher Price toy level of interactions provided by most of them. And worse: I know they can be great, one company I interact with now has a great one and a previous company I worked for had another great one. It just seems people throw chatbots at the page like it is a checkbox that needs to be checked.)
- almost all server logs (what pages are people linking to that now return 404?)
- referer headers (your product is being discussed in an open forum and no one cares to even read it?)
We collect so much data and then we don't use it for anything that could actually delight our users. Either it is thrown away or, worse, it is fed back into "targeted" advertising that, besides being an ugly idea, also seems to be a stupid idea in many cases: years go by between each time I see a "targeted" ad that actually makes me want to buy something, much less actually buy something.