Author here! I love basically everything Rich Hickey does (which is probably not a sign that he is always right, but that he's so much smarter than me that he can convince me of anything). Simple Made Easy was so influential on my early career (which I am still in) that I probably am far too suspicious of all "easy" pathways.
However, Derek was no Rich Hickey. Such deviations from the norm should be left to the type of person that can create Clojure, and very far away from the people that upload PII to GitHub. If we could create a culture where we slapped people's hands on instinct when they reached for MongoDB, I think we'd be better off.
I'm not arguing against that, and I hoped to make that clear with the first sentence of my comment.
The post felt like a rant against triple normalization in general, and not that particular bastardization of it, so I tried to give some counter examples.
The PII thing is on an entirely different page, and indeed sweat-inducing :D
Oh for sure, I did read your first paragraph that way. I think what I was trying to communicate (badly) is the message that I want to ingrain in engineering culture writ large - don't let anyone do this unless they're Rich Hickey.
That's calibrated too far the other way but there'd be less damage.
There is a difference between abusing a relational database and using something that is key-value from the ground up.
Notice how in the example you have:
1 Name Ludic
1 Age 29
1 Profession Tortured Soul
The key is not unique. There is no primary key. So to hunt down all the properties of User 1, you have to do a query for all the records whose Key is 1.
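To make that lookup concrete, here is a minimal sketch using Python's built-in sqlite3. The table name `user_props` and its column names are invented for illustration; the post doesn't give the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical layout: one row per property, no primary key,
# so the user id repeats on every row.
conn.execute("CREATE TABLE user_props (user_id INTEGER, key TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO user_props VALUES (?, ?, ?)",
    [(1, "Name", "Ludic"), (1, "Age", "29"), (1, "Profession", "Tortured Soul")],
)


def load_user(conn, user_id):
    """Fetch every property row for one user and fold them into a dict."""
    rows = conn.execute(
        "SELECT key, value FROM user_props WHERE user_id = ?", (user_id,)
    )
    return dict(rows.fetchall())


user = load_user(conn, 1)
```

Every read of a "user" is a multi-row query plus reassembly in application code, and everything comes back as text.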
I think that doesn't happen in keyword-value stores. Your key 1 has to be unique: it retrieves one blob, and that's it. You have to stuff all the properties into that blob somehow: for instance, by treating it as an array of the keys of other blobs.
Derek could have used multiple tables. That still gives you all the flexibility; the only cost is that if someone wants to invent a new property, they have to add a table.
Then we have
Name table:    Age table:    Profession table:
1 Ludic        1 29          1 Tortured Soul
The keys are unique: we fetch key 1 from each table, and we have the three properties. Not great compared to fetching one row with three fields, but better than stuffing everything as rows into one table.
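A rough sketch of the table-per-property layout, again with Python's sqlite3. The table and column names are my own guesses at what this would look like, and the join assumes every user has a row in the name table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table per property; user_id is a real primary key in each.
conn.executescript("""
    CREATE TABLE name       (user_id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE age        (user_id INTEGER PRIMARY KEY, value INTEGER);
    CREATE TABLE profession (user_id INTEGER PRIMARY KEY, value TEXT);
    INSERT INTO name       VALUES (1, 'Ludic');
    INSERT INTO age        VALUES (1, 29);
    INSERT INTO profession VALUES (1, 'Tortured Soul');
""")

# Fetch key 1 from each table; LEFT JOIN tolerates a missing age or profession.
row = conn.execute("""
    SELECT n.value, a.value, p.value
    FROM name n
    LEFT JOIN age a        ON a.user_id = n.user_id
    LEFT JOIN profession p ON p.user_id = n.user_id
    WHERE n.user_id = ?
""", (1,)).fetchone()
```

A side benefit over the single key-value table is that each property can have its own column type: age is an integer here, not a string.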
> I think that doesn't happen in keyword-value stores.
It absolutely does happen in triple stores, though, where data is commonly stored in subject-predicate-object form, and the subject's identifier is certainly meant to be unique to that subject, but not per triple.
> The key is not unique. There is no primary key. So to hunt down all the properties of User 1, you have to do a query for all the records whose Key is 1.
In this example, the primary key would be ([User ID],[Key]). The primary key does not need to be a single field.
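For illustration, a composite primary key on the same kind of table might look like this in SQLite via Python. The names `user_props`, `user_id`, and `key` are assumptions, not the schema from the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_props (
        user_id INTEGER NOT NULL,
        key     TEXT    NOT NULL,
        value   TEXT,
        PRIMARY KEY (user_id, key)  -- unique per (user, property) pair
    )
""")
conn.execute("INSERT INTO user_props VALUES (1, 'Name', 'Ludic')")
conn.execute("INSERT INTO user_props VALUES (1, 'Age', '29')")  # same user, new key: fine

# A second 'Name' row for user 1 violates the composite key.
try:
    conn.execute("INSERT INTO user_props VALUES (1, 'Name', 'Derek')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM user_props").fetchone()[0]
```

The user id still repeats across rows, but no user can end up with two values for the same property.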
On SSD storage, that may not be a bad thing. On spinning-platter hard drives, though, we may have to pay a head seek for each field of the object, because each one lives in a separate record in a separate table, whereas the fields of a single record can be stored close together.
“Future users of large data banks must be protected from having to know how the data is organized in the machine (the internal representation)… Activities of users at terminals and most application programs should remain unaffected when the internal representation of data is changed and even when some aspects of the external representation are changed…”
Roll the related data up into materialized views for read performance.
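A sketch of that roll-up, with a big caveat: SQLite has no materialized views, so this emulates one with a plain table that gets rebuilt on demand. In Postgres you would reach for CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW instead. The names `user_props`, `users_flat`, and `refresh_users_flat` are made up for the example.

```python
import sqlite3

# Pivot the per-property rows into one wide row per user.
ROLLUP_SQL = """
    SELECT user_id,
           MAX(CASE WHEN key = 'Name'       THEN value END) AS name,
           MAX(CASE WHEN key = 'Age'        THEN value END) AS age,
           MAX(CASE WHEN key = 'Profession' THEN value END) AS profession
    FROM user_props
    GROUP BY user_id
"""


def refresh_users_flat(conn):
    """Rebuild the precomputed table; stands in for a materialized view refresh."""
    conn.execute("DROP TABLE IF EXISTS users_flat")
    conn.execute("CREATE TABLE users_flat AS " + ROLLUP_SQL)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_props (user_id INTEGER, key TEXT, value TEXT, "
    "PRIMARY KEY (user_id, key))"
)
conn.executemany(
    "INSERT INTO user_props VALUES (?, ?, ?)",
    [(1, "Name", "Ludic"), (1, "Age", "29"), (1, "Profession", "Tortured Soul")],
)

refresh_users_flat(conn)
flat = conn.execute(
    "SELECT name, age, profession FROM users_flat WHERE user_id = 1"
).fetchone()
```

Readers hit `users_flat` and get one row with three fields, while writers keep whatever internal layout they like, which is the point of the quote above.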
Datomic is a tragically unoptimizable black box, one that spills candied joys when you start using it, gradually transitioning to the unspeakable horrors of Pynchon's "The Disgusting English Candy Drill".
IMO the real tragedy is that Datomic predated the open-source-but-you-can't-host-it-or-otherwise-operate-it-for-profit licenses that have proliferated recently.
If I ever get the time, I'll write a compiler. And if I have time left over on sabbatical, I'll reimplement Datomic as a Postgres extension (goddamnit).
Datomic is in the same group, and I'd consider Rich Hickey to be one of the best programmers there are.