Lesson Learned from Queries over 1.3T Rows of Data (pingcap.com)
111 points by jinqueeny on Aug 21, 2019 | 15 comments


Any database experts here? My question is: should we use a relational database for such a use case? I mean a Quora-like app.


Always use a relational database. Make sure to use one that supports recursive queries, and that doesn't suck.

If you need to shard your database, do that.

For example, say you're building a ridesharing app. You'll have databases of users and drivers, but you can totally shard the driver DB by city, and even the user DB can be sharded this way (copying into a shard an entry from a main truth db). The DB of available cars and rides can be in-memory only -- if you lose it you can let the client-side apps drive recovery. You can use things like FDW to get a view that looks like a single DB, and then you can write queries purely in SQL. Look ma', no hand-coded joins and such.
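To make the shard-by-city idea concrete, here is a minimal sketch, not a production design. It assumes one in-memory SQLite database per city standing in for real shard servers; the city list, table layout, and function names are all made up for illustration. The `all_drivers` fan-out is the hand-written part that something like an FDW-backed view would let you express in plain SQL instead.

```python
import sqlite3

# Hypothetical city list; each city gets its own shard.
# An in-memory SQLite database stands in for a real shard server.
CITIES = ["austin", "boston", "chicago"]


def make_shards(cities):
    """Create one independent database per city, each with a drivers table."""
    shards = {}
    for city in cities:
        conn = sqlite3.connect(":memory:")
        conn.execute(
            "CREATE TABLE drivers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )
        shards[city] = conn
    return shards


def add_driver(shards, city, name):
    """Route the write to the shard that owns this city."""
    conn = shards[city]
    cur = conn.execute("INSERT INTO drivers (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid


def drivers_in_city(shards, city):
    """Single-shard read: only one database is touched."""
    rows = shards[city].execute(
        "SELECT name FROM drivers ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def all_drivers(shards):
    """Cross-shard read done by hand, visiting every shard in turn."""
    result = []
    for city in sorted(shards):
        for name in drivers_in_city(shards, city):
            result.append((city, name))
    return result
```

Most queries in this kind of app only care about one city, so they stay on one shard; only the rare global report has to pay for the fan-out.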


Hey, thanks. Do you know of any resources or a git repo demonstrating these kinds of implementations? (Even what to search for would be appreciated.) I'm a backend guy but only average when it comes to databases.

Thanks mate!


I mean, Stack Overflow is bigger than Quora and uses a relational database for storing almost everything, if I recall. They also use either memcached or Redis (too lazy to check) for pretty heavy caching, though.


Their tech stack is incredible. They run an extremely lean operation. I'm an AWS and Linux fanboy and in fact have essentially staked 5-10 years of my career on AWS being the "go-to" technology. But if you have talent like Nick Craver and crew onboard, you have every reason to build and maintain your own hardware.

Not only that, it's all .NET/MSSQL. It's incredible technology and they are awesome at it.

https://nickcraver.com/blog/2016/03/29/stack-overflow-the-ha...


It's actually a typical stack.


What is so incredible about it?


The fact that they can serve their entire user base with so little hardware.


But what are they doing that enables that other than putting standard components together and not introducing crazy bottlenecks? What part is incredible technology? Maybe modern computers and software just enable far more throughput than most people realize.


I think you would be surprised by the scalability of relational databases. Most of the large-scale applications you use rely on relational databases, even at the largest scale. Modern distributed SQL systems such as TiDB or the open source database Vitess are capable of serving any amount of traffic, and they are really easy to use and very well understood.

In fact, Quora uses a relational database.


...Vitess is sharded on top of MySQL. It literally only exists because MySQL doesn't scale.

I will agree, though, that for the vast majority of use cases, worrying about using a so-called "scalable" DB is a YAGNI-type concern, and the normal options should be more than sufficient.


Why not? I can think of one reason: SQL is generally poor at modeling tree structures. This may show up in the application as comments without hierarchies (which is the case for Stack Overflow and Quora), but that is a fairly minor part of the application. Similarly, if you wanted to have a graph of relations between pieces of knowledge, graph databases might be better. But Q&A sites tend to just add tags to the questions, which is easy enough to model.
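For what it's worth, a comment tree can still be handled in SQL with an adjacency list plus a recursive query, which is what the "supports recursive queries" advice upthread points at. A rough sketch using SQLite from Python; the `comments` table, its columns, and the sample rows are invented for the example, and the zero-padded `path` column is just one way to get a depth-first ordering.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, made-up comment tree."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE comments (
            id INTEGER PRIMARY KEY,
            parent_id INTEGER REFERENCES comments(id),
            body TEXT NOT NULL
        )
        """
    )
    conn.executemany(
        "INSERT INTO comments (id, parent_id, body) VALUES (?, ?, ?)",
        [
            (1, None, "root"),
            (2, 1, "reply A"),
            (3, 1, "reply B"),
            (4, 2, "reply to A"),
            (5, None, "other root"),
        ],
    )
    return conn


def thread(conn, root_id):
    """Return (id, depth, body) for a comment and all its descendants,
    in depth-first order, using a recursive common table expression."""
    return conn.execute(
        """
        WITH RECURSIVE tree(id, body, depth, path) AS (
            -- Anchor: the root comment of the thread.
            SELECT id, body, 0, printf('%05d', id)
            FROM comments
            WHERE id = ?
            UNION ALL
            -- Recursive step: children of rows already in the tree.
            SELECT c.id, c.body, t.depth + 1,
                   t.path || '/' || printf('%05d', c.id)
            FROM comments AS c
            JOIN tree AS t ON c.parent_id = t.id
        )
        SELECT id, depth, body FROM tree ORDER BY path
        """,
        (root_id,),
    ).fetchall()
```

Calling `thread(build_demo_db(), 1)` returns the root, then "reply A" with its child, then "reply B". It works, though it is arguably clunkier than a native tree or graph traversal, which is the point being made above.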


The tradeoff is that sharding and caching graph DBs carries a real performance hit, because it's harder to isolate queries to a single partition/subgraph. In some ways intersectionality is the whole point of graph vs. relational.

The overhead of a large scale distributed graph DB with redundancy and caching is significantly higher than a relational DB acting as a "poor man's graph DB"; and materializing your more expensive views and queries is, in the base case, a sufficient performance equalizer.


Blank page with JS disabled. Reader mode disabled.


"Reader mode disabled."

That's a bug. Shouldn't be possible. Let the site require executing JS to get the text, but the web browser should have no code that attempts to limit the user... and yet they are _full_ of it.



