This is purely about database language semantics, not other aspects of database systems. In other words, it's a manifesto about languages for databases, and a criticism of SQL in particular.
The challenges of a database language are different from those of a general-purpose language. Queries live a long time and are spread over all kinds of applications (in different versions), reports, and scripts. And the data is always growing and changing. Constraints and declarative queries help bridge the concerns of the readers and writers of the data.
I read quite a few books by Date & Darwen, in particular, Temporal Databases and the Relational Model. Out of that came my work on Exclusion Constraints and Range Types for PostgreSQL, which allow you to deal with ranges of time (with beginning and end) much more effectively. So I believe the authors have good insights.
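For a concrete picture of what an exclusion constraint over time ranges checks, here is a minimal Python sketch. It is not how PostgreSQL implements the feature; the names `TimeRange`, `overlaps`, and `insert_excluding` are made up for illustration, and ranges are assumed to be half-open and non-empty.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TimeRange:
    """Half-open range [start, end): start is included, end is not."""
    start: date
    end: date

    def __post_init__(self):
        # Assumption for this sketch: empty ranges are simply rejected.
        if self.start >= self.end:
            raise ValueError("range must be non-empty (start before end)")

    def overlaps(self, other: "TimeRange") -> bool:
        # Two half-open ranges overlap when each starts before the other ends.
        return self.start < other.end and other.start < self.end


def insert_excluding(bookings: list, room: str, period: TimeRange) -> None:
    """Append (room, period) unless it overlaps an existing booking of the same room."""
    for existing_room, existing_period in bookings:
        if existing_room == room and existing_period.overlaps(period):
            raise ValueError(f"{room} is already booked during {existing_period}")
    bookings.append((room, period))


bookings = []
insert_excluding(bookings, "A", TimeRange(date(2024, 1, 1), date(2024, 1, 5)))
# Adjacent ranges do not overlap because the end bound is exclusive.
insert_excluding(bookings, "A", TimeRange(date(2024, 1, 5), date(2024, 1, 8)))
# A different room may reuse the same dates.
insert_excluding(bookings, "B", TimeRange(date(2024, 1, 2), date(2024, 1, 4)))
```

The point of having this as a declarative constraint in the database, rather than as application code like the loop above, is that every writer is held to the same rule.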
But they are definitely in an ivory tower. They make strong claims about superiority of one approach over another without connecting with practice. You can feel their frustration that their ideas aren't put into practice, but they underestimate the difficulty of doing so. And I suspect that to bridge that gap enough to see a net benefit over SQL would require some revisions and amendments to their theories.
I believe Haskell has some really nice answers to problems in database languages, and that database queries would be a great application area for a Haskell-like language. It's too bad traditional PL theory people and DB people don't mix a little more.
Hello, author of a Haskell library for Postgres[1] here!
I agree with you. Darwen and Date have some very weird requirements, notably:
1. Requirement that "relations" should be sets not bags. Having one means you automatically have the other[2]. There's no point going out of one's way to forbid bags.
2. Requirement that the contents of "relations" should be "tuples". The contents of a "relation" should be allowed to be any data type, whether compound or not.
3. Completely ignoring sum types.
4. Forbidding NULLs even though there's an easy way to permit them sensibly: have nullability be part of the type.
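To illustrate points 3 and 4, here is a rough Python sketch of a sum type and of nullability carried in the type. It is not taken from Opaleye or from Date & Darwen; `Card`, `BankTransfer`, `Customer`, and the helper functions are hypothetical names. Python does not enforce annotations at runtime, so the guarantees described in the comments assume a static checker such as mypy.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Card:
    last_four: str


@dataclass(frozen=True)
class BankTransfer:
    iban: str


# A sum type: a payment method is exactly one of these alternatives.
PaymentMethod = Union[Card, BankTransfer]


@dataclass(frozen=True)
class Customer:
    name: str               # declared as never missing
    phone: Optional[str]    # may be missing, and the type says so
    payment: PaymentMethod


def describe_payment(method: PaymentMethod) -> str:
    # Each alternative of the sum type is handled explicitly.
    if isinstance(method, Card):
        return f"card ending in {method.last_four}"
    if isinstance(method, BankTransfer):
        return f"bank transfer from {method.iban}"
    raise TypeError(f"unknown payment method: {method!r}")


def contact_line(customer: Customer) -> str:
    # Because phone is Optional, the missing case is dealt with here
    # instead of propagating silently into later expressions.
    if customer.phone is None:
        return f"{customer.name} (no phone on file)"
    return f"{customer.name}: {customer.phone}"
```

Only `phone` can be absent; `name` and `payment` are declared non-nullable, which is the distinction SQL's NULL does not make at the type level.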
Overall, Darwen and Date's language design is poor from the standpoint of both compatibility with real-world implementations and current practice in programming language theory.
I think my package Opaleye is a good candidate for what Darwen and Date's proposed "D" language should have been.
Date & Darwen didn't actually design a language, just a set of prescriptions and proscriptions.
I agree with you on points #3 and #4. Of course they'd argue that NULLs and sum types are possibly conforming, but to not discuss them at all is a major omission.
They do barely hint at the idea of sum types, in An Introduction to Database Systems, 18.6, "Special Values". That's 3 paragraphs, with no examples, and a reference to another work. I already have enough of their books and didn't bother to get that one, in part because the three paragraphs describe the approach as "not very elegant". It was also published in 1998, well after ML-like type systems were in practice, so it's very strange that Date & Darwen tried to go about it on their own.
Regarding your points #1 & #2, I'm not sure I agree. When it comes to computing, any type can be used to represent any other type as long as the domains are equally large. But that's not a very useful discussion when we're talking about languages. Changing from bags to sets or vice versa changes the kinds of operations available and their meaning. If we're talking about the relational model, then let's use relations and tuples. Maybe you don't want to always use relations, and if the system has good support for bags, then great, but it's not quite the relational model and may lose some of the benefits.
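A small Python sketch of both halves of that argument, with made-up data and helper names (`project_bag`, `project_set`): a bag can be encoded as a set of (value, count) pairs, yet the same projection followed by a sum gives different answers under bag and set semantics.

```python
from collections import Counter

# Hypothetical rows of (name, salary).
rows = [("alice", 50), ("bob", 50), ("carol", 70)]


def project_bag(rows, index):
    """Projection with bag semantics: duplicates are kept and counted."""
    return Counter(row[index] for row in rows)


def project_set(rows, index):
    """Projection with set semantics: duplicates collapse."""
    return {row[index] for row in rows}


salaries_bag = project_bag(rows, 1)
salaries_set = project_set(rows, 1)

# The same "sum of salaries" query means different things in each model.
bag_total = sum(salaries_bag.elements())  # counts 50 twice
set_total = sum(salaries_set)             # counts 50 once

# Representability: a bag encoded as a set of (value, count) pairs,
# and decoded again without loss.
encoded = set(salaries_bag.items())
rebuilt = Counter(dict(encoded))
```

Here `bag_total` is 170 and `set_total` is 120, so the encoding exists but the choice of model still changes what a query means.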
Regarding point #2 specifically, you may find Appendix B in The Third Manifesto interesting (page 401 in the linked PDF).
Overall, I think they had some good insights, and they influenced the way I look at databases in a positive way. I also had a very positive experience the one time I interacted with Hugh Darwen, where he showed some genuine curiosity about my question and took it seriously. I'm glad I followed their work, but I'm also glad I didn't join them in their ivory tower.
Regarding #1, sets and bags are both widely used in practice so restricting to just one or the other seems unnecessarily limiting. If there are benefits to restricting oneself to sets then one can restrict oneself to sets and get those benefits. The only possible drawback I can see is making the API surface area too large and unwieldy. But I don't think that applies here.
Regarding #2, that's interesting, they've done the same sort of "if you have one you have the other" analysis that I did for sets vs. bags, and concluded that you only need the more general one. I have to say I don't follow their rationale. It still seems to me that relations should have a single field. If you want to simulate "multiple fields" then make the "single field" a tuple. That's what Opaleye does and it works very naturally.
As an aside, thanks for your work on Postgres. It's a great system!
> It's too bad traditional PL theory people and DB people don't mix a little more.
The only ones who do mix seem to be the ones who work with type theory or category theory! I predict that we’ll see an absolute revolution in about 20 years, after successive cohorts of undergrads are exposed to this stuff and slowly carry it out into academia and industry.