Agreed, but of course the massive write parallelism and fault tolerance of DBMS ...

dspillett · on Dec 11, 2011

> comes at the cost of dropping ACID

Do remember though that, as discussed here recently, most RDBMSs do not act in a fully ACID compliant way by default. IIRC, providing a complete isolation guarantee often comes with a hefty performance hit, so compromises are made in this area unless you explicitly tell the database to be as careful as it can. I imagine this can cause quite a nightmare for master<->master replication.
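To make the isolation point concrete, here is a toy sketch of the kind of anomaly that weaker-than-serializable isolation can let through (usually called write skew). It is a pure-Python simulation under my own assumptions, not the behaviour of any particular DBMS; the names `go_off_call`, `run_snapshot` and `run_serial` are made up for illustration.

```python
def go_off_call(snapshot, db, doctor):
    """Take `doctor` off call if the snapshot shows at least two people on call.

    The decision is made from `snapshot` (what the transaction saw when it
    started), but the write goes to the shared `db`.
    """
    on_call = sum(1 for is_on in snapshot.values() if is_on)
    if on_call >= 2:
        db[doctor] = False
        return True
    return False


def run_snapshot():
    # Both "transactions" take their snapshot before either one writes,
    # so each believes the other doctor is still on call.
    db = {"alice": True, "bob": True}
    snap_a = dict(db)
    snap_b = dict(db)
    go_off_call(snap_a, db, "alice")
    go_off_call(snap_b, db, "bob")
    return db


def run_serial():
    # One transaction runs to completion before the next one starts,
    # so the second sees the first one's write and backs off.
    db = {"alice": True, "bob": True}
    go_off_call(dict(db), db, "alice")
    go_off_call(dict(db), db, "bob")
    return db


if __name__ == "__main__":
    print("snapshot-style:", run_snapshot())
    print("serial:        ", run_serial())
```

In the snapshot-style run nobody ends up on call even though each transaction checked the rule; in the serial run the rule holds. Real engines differ in which isolation levels they offer and which is the default, so check the documentation for yours.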

There are a lot of people using "noSQL" options for the wrong reasons (such as to be buzzword compliant, or because they don't understand SQL), but there are issues with traditional RDBMSs that stick-in-the-muds like me (who cringe at the phrase "eventual consistency") should be more aware of than we generally are.

Making the right choices about your data storage can be hard.

einhverfr · on Dec 12, 2011

That said, there are very good reasons not to use an RDBMS in cases where the data model or specific access patterns just don't fit. But in most of those cases, I find that using in-memory data structures combined with file system storage or something like BerkeleyDB is a much better fit than any server-based DBMS.

I want to go into this just a little. The issue here is just that there are tradeoffs. Understanding these tradeoffs is what good design is about.

RDBMSs excel at one thing: presenting stored data for multiple purposes. This is what relational algebra gets you, and while SQL is not relational algebra, it is an attempt to bridge relational algebra with a programming language.

An RDBMS will never be as fast as a basic object storage system. However, what it buys you is flexibility for that part of the data that needs to be presented for multiple uses.
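A minimal sketch of what "presenting stored data for multiple purposes" looks like in practice, using Python's built-in `sqlite3` module. The `customer` and `invoice` tables and their contents are invented for illustration; the point is only that one stored representation answers two unrelated questions without being redesigned for either.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE invoice (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        amount INTEGER NOT NULL,
        year INTEGER NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO invoice VALUES (1, 1, 100, 2010), (2, 1, 250, 2011), (3, 2, 75, 2011);
    """
)

# Use 1: an account manager wants totals per customer.
by_customer = conn.execute(
    """
    SELECT c.name, SUM(i.amount)
    FROM customer AS c JOIN invoice AS i ON i.customer_id = c.id
    GROUP BY c.name ORDER BY c.name
    """
).fetchall()

# Use 2: finance wants totals per year, ignoring customers entirely.
by_year = conn.execute(
    "SELECT year, SUM(amount) FROM invoice GROUP BY year ORDER BY year"
).fetchall()

if __name__ == "__main__":
    print(by_customer)
    print(by_year)
```

An object store keyed by customer would serve the first question quickly, but the second would need either a full scan in application code or a second, separately maintained copy of the data.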

So the sort of access pattern that doesn't fit is something like an LDAP server. Here you have a well-defined method for integrating the software, and the use cases for ad hoc reporting usually aren't there.

On the other hand, you have something like an ERP that allows file attachments to ERP objects. Even though the data model doesn't fit exactly, you can't do this gracefully with a NoSQL solution.

So I suggest people think about the need for flexibility in reporting. The more flexibility required, the more important an RDBMS becomes.

Additionally, if you have multiple applications hitting the same database, an RDBMS is pretty hard to replace with any other possible solution.

6ren · on Dec 11, 2011

Codd emphasised the relational model as being able to change the underlying storage representation without breaking apps.

Will this eventually be a problem for NoSQL? Or, is the scalability worth the sacrifice?

Or, does NoSQL typically have only one (main) app, so making it work with a specific storage representation is not a big deal? The relational use-case was many different apps, with different versions, needing different access paths to the data. But if you just have one known set of access paths (like a REST URI), you can just design the DB for that. Hierarchical databases worked well when just one DB, one app; they just weren't very flexible.

Hierarchical databases are fast and simple but inflexible as the relationship is restricted to one-to-many, only allowing for one parent segment per child. http://it.toolbox.com/wiki/index.php/Hierarchical_Database
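A rough sketch of that one-parent-per-child restriction, using plain Python data structures rather than any real hierarchical or relational engine. The course and student names, and the helpers `courses_of_hier`, `courses_of` and `students_in`, are hypothetical.

```python
# Hierarchical layout: each child segment lives under exactly one parent,
# so a student taking two courses has to be stored twice.
tree = {
    "db101": ["ann", "raj"],
    "os201": ["ann"],
}


def courses_of_hier(student):
    # The only built-in access path is parent -> children, so going the
    # other way means walking every parent.
    return sorted(course for course, kids in tree.items() if student in kids)


# Relational layout: the many-to-many relationship is its own set of
# (course, student) pairs, and neither side is privileged.
enrollment = {("db101", "ann"), ("db101", "raj"), ("os201", "ann")}


def courses_of(student):
    return sorted(c for c, s in enrollment if s == student)


def students_in(course):
    return sorted(s for c, s in enrollment if c == course)


if __name__ == "__main__":
    print(courses_of_hier("ann"))
    print(courses_of("ann"), students_in("db101"))
```

If the only question ever asked is "who is in this course?", the tree is simpler and probably faster. The duplication and the awkward reverse lookup only start to hurt once a second access path shows up.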

buu700 · on Dec 12, 2011

>Cassandra comes at the cost of dropping ACID

Suddenly I want to spend my next Saturday night setting up Cassandra at a party...

Edit: Come on, "dropping acid"! I thought it was funny...

einhverfr · on Dec 12, 2011

There's a project I have recently come across called Postgres-XC which would solve the problem you describe quite nicely. Basically it's Teradata-style clustering based on PostgreSQL. It isn't full-featured yet (version 0.9.6 is the current version, and it doesn't support things like windowing functions), but it looks extremely interesting in this space.