Ugh. I can't believe I'm engaging in this, but there really isn't any "debate" on this issue. At least not between people who know what the hell they are talking about.
Most of the solutions people are talking about when they mention NoSQL are not replacements for SQL databases. They are completely different beasts. I suppose if we're going to use strained metaphors, I'd say SQL vs. NoSQL is akin to diesel vs. a V12. That is, they are both fine technologies and, except by the very loosest of definitions, they do completely different things.
If you think NoSQL and SQL are actually battling, you need to do more reading...
You are completely and totally right in terms of technology. They target related, but separate and distinct, use cases. One company can quite reasonably and consistently use both for different purposes, even within the same project.
However, many people, even ones who should know better, do not see it that way. A lot of relational DBAs, especially the ones who just see it as a way to make a living and have no real passion for it, see NoSQL as a threat to their jobs and want it to die. On the flip side, many NoSQL advocates really are championing it as a replacement for a relational database for almost all use cases.
There should not be a debate, but there really is one going on.
Personally, I'm drawn to Mongo, etc. because I can develop in my language of choice, and not have to write/debug SQL of any dialect to store and retrieve my data.
An ORM with a relational database would let you do the same thing.
SQLAlchemy in Python is reasonably good for a project nearing a 1.0 release and Linq works nicely with C#. Virtually every other major language has at least one ORM available to it.
That is not to say that a relational database with an ORM is the way for you to go at all, just that avoiding writing SQL code is probably not the best reason by itself to choose your database platform.
An ORM is an abstraction layer. Abstraction layers leak. Fancy abstraction layers like Hibernate add at least as much complexity themselves as learning raw SQL afresh.
I've learned my lesson - if I'm working over SQL, I'll use SQL. Either via the direct record oriented API of the language, or using a thin abstraction like iBATIS that wraps the inputs and outputs into convenient data-carriers.
MongoDB is different. The "document" is real to the database. When you're saying something like "add a foo to the list of foos of record X", that is pretty close to the low level action that actually gets performed on disk. There is no abstraction-translation step to get out of whack.
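To make the "add a foo to the list of foos of record X" point concrete, here is a minimal sketch using only the Python standard library. A plain dict stands in for a document (this is not MongoDB's API), and SQLite stands in for a relational store. The table names `record` and `foo`, the `position` column, and the helpers `add_foo` and `load_foos` are all made up for illustration. It says nothing about what either database does on disk; it only shows how many steps sit between the sentence and the stored data in each model.

```python
import json
import sqlite3

# Document model: the record carries its own list, so the append is one step.
record_x = {"_id": "X", "foos": ["a", "b"]}
record_x["foos"].append("c")
document_json = json.dumps(record_x, sort_keys=True)

# Relational model: the list lives in a child table keyed back to the parent.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE record (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE foo (record_id TEXT REFERENCES record(id), "
    "position INTEGER, value TEXT)"
)
conn.execute("INSERT INTO record (id) VALUES ('X')")
conn.executemany(
    "INSERT INTO foo (record_id, position, value) VALUES ('X', ?, ?)",
    [(0, "a"), (1, "b")],
)


def add_foo(conn, record_id, value):
    """Append a value to a record's list by inserting a child row."""
    (next_pos,) = conn.execute(
        "SELECT COALESCE(MAX(position) + 1, 0) FROM foo WHERE record_id = ?",
        (record_id,),
    ).fetchone()
    conn.execute(
        "INSERT INTO foo (record_id, position, value) VALUES (?, ?, ?)",
        (record_id, next_pos, value),
    )


def load_foos(conn, record_id):
    """Rebuild the list for a record from its child rows, in order."""
    rows = conn.execute(
        "SELECT value FROM foo WHERE record_id = ? ORDER BY position",
        (record_id,),
    )
    return [value for (value,) in rows]


add_foo(conn, "X", "c")
relational_foos = load_foos(conn, "X")
```

Both sides end up with the same three foos. The relational side just needs a schema, an ordering column, and a mapping step in each direction, which is the layer an ORM would otherwise be generating for you.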
I've used SQLAlchemy, I've used Django, and here's the thing: _every_ time I've used them, I've needed to know and understand SQL to learn, write, and debug them.
Eventually, some bug comes up, and you need to peek at what queries the ORM is generating, and then peek at what is actually in the database, and etc.
But at this point, you not only have to know what's going on with your particular SQL dialect, you also have to know the syntax hoops of whatever particular ORM you're using. Instead of just language + SQL, you've got language + ORM + SQL, which is even worse for focusing on whatever you were trying to build in the first place!
Whereas, with a NoSQL solution, you're (probably) dealing with JSON, which works nicely and reliably and transparently in pretty much every language ever.
I'm drawn to mongo because it's easier to make changes to my data structures. I do wish, though, that it had a better query language. Something a little more structured. Some kind of structured query language. Queries-as-nested-json structures are kind of annoying.
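For anyone who hasn't seen the queries-as-nested-JSON style, here is a toy sketch of the idea in plain Python. The operator names `$gt`, `$lt`, and `$in` follow MongoDB's spelling, but the `matches` function and the sample data are hypothetical, it handles only those three operators plus equality, and it is in no way how MongoDB evaluates queries. The equivalent SQL is in a comment for comparison.

```python
import operator

# Operators this toy matcher understands; the names follow MongoDB's spelling.
OPERATORS = {
    "$gt": operator.gt,
    "$lt": operator.lt,
    "$in": lambda value, options: value in options,
}


def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        if field not in document:
            return False
        value = document[field]
        if isinstance(condition, dict):
            # Nested form: {"age": {"$gt": 30, "$lt": 40}}
            for op_name, operand in condition.items():
                if not OPERATORS[op_name](value, operand):
                    return False
        elif value != condition:
            # Plain form: {"city": "Oslo"} means equality
            return False
    return True


people = [
    {"name": "Ann", "age": 34, "city": "Oslo"},
    {"name": "Bob", "age": 28, "city": "Oslo"},
    {"name": "Cy", "age": 41, "city": "Lima"},
]

query = {"city": "Oslo", "age": {"$gt": 30}}
# Roughly the same question in SQL, for comparison:
#   SELECT name FROM people WHERE city = 'Oslo' AND age > 30
names = [person["name"] for person in people if matches(person, query)]
```

The upside is that the query is just data in your own language, so you can build it up programmatically. The downside is the one described above: the nesting gets noisy fast once conditions pile up.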
I'd say the problem is less that it's inflammatory and more that it's totally useless. NoSQL, as the term is used, encompasses a whole array of totally different technologies, everything from a Bigtable-style column store to Redis to MemcacheDB.
Doesn't it worry you, then, that your reasons for selecting the technology are unconnected with the big differences noted above? Or do you think that "different beasts" boils down to what language you use?
No, in my experience building real-world stuff, they are both data stores, and pretty similar beasts; most of the docs I see arguing one way or another focus on, imho, silly nitpicky differences.
It doesn't mean to never use SQL, nor that SQL is bad — it means to widen your horizons; to not immediately assume that whatever you want to store, the best way to do that must without a doubt be in an RDBMS.
For years now, people have been using SQL as the default way to store things — and to very rarely use anything else. NoSQL simply means that there are other options. Widen your horizons.
I suppose it isn't that inflammatory if you have an uncommonly liberal interpretation of the meaning of the word "no". Most of the time it's used with a meaning similar to "not" or "none", rather than "yes, if it's the better way".
But to play devil's advocate, doesn't advocating using a NoSQL solution as the default storage option suffer from the same fallacy?
For many of the people on both sides of this argument, I think they are both right in that the solution they're promoting is optimal in certain contexts.
For many of the people on both sides of this argument, I think they are both wrong in that they're advising using a tool across the board, independent of the context of its use.
Also, "good programmers know how to index properly" and "he's a DBA that's scared of his career disappearing" are both ad hominem arguments. Let's stick to facts, people.
NoSQL isn't the default storage option. It is a storage option.
NoSQL means this: "Hold your horses, are you sure SQL is the right option for storing that? There are other options."
In any debate, extremist views will tend to get heard as much as, or probably even more than, the rational ones. But the concept didn't start out irrational; it started out the way I just described it.
Then came the nutbags who claim that either SQL never ever scales, or that SQL is in fact the only way to store anything at all in the real world and that it fits every use case in the known universe, perfectly.
That doesn't sound right. Every significant NoSQL deployment I've heard of was either a replacement for a relational DB or is a task that can be solved by a relational DB with some work. This includes Google's usage of BigTable, Facebook's usage of Cassandra, Digg's usage of Cassandra as well as reddit's.
Which deployments are you thinking of when you say NoSQL DBs are not competitive with relational ones?
I believe the OP was referring to the fact that if you are replacing an RDBMS with a NoSQL solution, then chances are the RDBMS was not really a good choice in the first place. If you need the CA side of the CAP triangle and can't pass a possible inconsistency up to the application level, then an RDBMS on a big honking box with master/slave replication is probably your best bet. A lot of the NoSQL traction has come from providing solutions to people who don't specifically need that part of the equation but fell into an RDBMS anyway, either because it was the default/traditional answer to the "I have some data, now how do I access and store it?" question or because they did not know that options which relax consistency assurances are not quite the heresy their DBAs claimed.
That's strange that you suggest master/slave replication as a way to achieve consistency and availability. If you lose your master, you can't take writes and are thus no longer available. And if your slaves are behind the master, then reads from the slave are stale, definitely not consistent with the master.
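A toy simulation of both failure modes, in case it helps. The `Master` and `Replica` classes here are hypothetical in-memory stand-ins, not any real database's replication protocol. They model just two things: a replica only sees writes once it has applied the master's log, and writes have nowhere to go when the master is down.

```python
class Replica:
    """A read-only copy that only changes when it applies log entries."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries applied so far

    def catch_up(self, log):
        # Apply every log entry this replica has not seen yet.
        for key, value in log[self.applied:]:
            self.data[key] = value
        self.applied = len(log)

    def read(self, key):
        return self.data.get(key)


class Master:
    """Accepts writes and records them in an append-only log."""

    def __init__(self):
        self.data = {}
        self.log = []
        self.alive = True

    def write(self, key, value):
        if not self.alive:
            raise RuntimeError("master is down: writes are unavailable")
        self.data[key] = value
        self.log.append((key, value))


master, replica = Master(), Replica()
master.write("balance", 100)
replica.catch_up(master.log)

master.write("balance", 50)            # replica has not seen this yet
stale_read = replica.read("balance")   # still the old value
replica.catch_up(master.log)
fresh_read = replica.read("balance")   # now matches the master

master.alive = False                   # simulate losing the master
try:
    master.write("balance", 0)
    write_failed = False
except RuntimeError:
    write_failed = True
```

Between the second write and the second `catch_up`, the replica serves the old balance, which is the consistency gap. Once `alive` is false, every write fails until something promotes a new master, which is the availability gap.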
Many NoSQL proponents are endlessly gunning for the RDBMS, always counterpointing NoSQL solutions with the RDBMS, but then when confronted with specifics they throw their hands up and say "Different tools, man! Horses for courses!"
I'm not sure you're really paying attention to the real debate here.
Most of the people using non-relational data stores (for actual work) are employing them in tandem with traditional RDBMS solutions. And those that are feeding on a strictly NoSQL diet are usually just toying around or using Fisher-Price-wrapped versions of NoSQL like App Engine's Datastore or Amazon's SimpleDB.
Maybe people who are using non-relational data stores for actual work are employing them in tandem with an RDBMS, but most talk is about completely replacing the RDBMS with NoSQL. This latest round of fun started with Digg replacing MySQL with Cassandra. Even other HN comments on this very article don't talk about using it in tandem with SQL; most would probably consider that a waste of the advantages of NoSQL.
I use non-relational solutions combined with an RDBMS solution, but that seems to put me on the other side of the debate.
Eh, I am old enough to remember when SQL was the new kid on the block and received nothing but disdain from "database experts", as it was considered toy-like and not suitable for efficiency and large data sets. I distinctly recall meetings with DBAs where SQL (and the RDBMS platforms) were ridiculed as PC-weenie toys for kiddie boxes that could in no way accommodate mission-critical enterprise production systems.
Now, ~20 years later, it's the SQL advocates playing the same role as those grizzled old champions of mainframe fare like hierarchical and VSAM-like structures.
A lot of people drive cars with manual transmissions because they enjoy driving them, not necessarily because they can gain any tangible benefit beyond enjoyment.
I think the same may be true in many cases with NoSQL. I want to try something different, and it's going to perform better for what I'm doing anyway, so why not.
In some experiments at work we got SQL server to do a bulk import of around 1 million rows (with quite a few interspersed reads) in around 5m 30s. We got the same inserts to happen in Mongo in less than a minute. So, we went with Mongo. Do we need that speed? Maybe, maybe not. It sure has been fun to try something new, though.
On the other hand, many people also drive vehicles with manual transmission because doing so yields measurable improvements in both power output and fuel economy. Vehicles with manual transmissions also tend to be more predictably controlled in adverse weather conditions. Interestingly, but not necessarily surprisingly, these points translate quite directly into the NOSQL argument.
The downside to the manual transmission is simply that when you don't care to exert more driving effort in a given situation that doesn't benefit enough from the increase in enjoyment, power, economy, and control, you're better off with an automatic in that situation. As before, this aspect also translates rather nicely. :)
I disagree with the analogy. An ORM is like automatic transmission -- it hides the details of switching gears but the gears are still there. SQL is manual transmission: You have maximum flexibility to access your data in a variety of ways with reasonable performance. NoSQL then is like having no transmission at all! It's simpler, as there are no gears to shift and no clutch, and it's faster under very specific conditions. Just as NoSQL has no complex query API and is optimized to fetch your data in a very specific way.
One thing I've noticed is that the "NoSQL" side of the debate (such as it is) always beats on MySQL as an example of a relational database.
MySQL, for all its virtues, is like the little kid brother of real RDBMSs. Showing slightly better performance from a NoSQL system than from MySQL is NOT going to convince someone with a behemoth Oracle rack to switch.
I'd like to see some systems and benchmarks showing how NoSQL systems can compete in the ridiculously high-volume, high-speed space. And yes, I know Google does with BigTable, and that it's theoretically possible, but I haven't seen any implementations prove they can, yet.
Personally, I found Forbes' post re: SSD to be very interesting. Rather than just claim the debate has degenerated past the point of usefulness, just ignore the rant from the other guy and respond to the substantive comments made by Forbes.
As a person who is investigating NoSQL as viable for a particular task, the substantive posts on either side have been valuable to me.