For JOINs? Absolutely! Who wants to hand-code queries at the executor level?! It's expensive!
You need a query language.
You don't necessarily need ACID, and you don't necessarily need a bunch of things that SQL RDBMSes give you, but you definitely need a QL, and it has to support a lot of what SQL supports, especially JOINs and GROUP BY w/ aggregations.
NoSQLs tend to evolve into having a QL layered on top. Just start with that if you really want to build a NoSQL.
To be clear here, I'm not arguing that OpenSearch/ElasticSearch is an adequate substitute for Postgres. They're different databases, each with different strengths and weaknesses. If you need JOINs and ACID compliance, you should use Postgres. And if you need distributed search, you should use OpenSearch/ElasticSearch.
Unless they're building for single-host scale, you're not going to get JOINs for free. Lucene (the engine upon which ES/OS is based) already has JOIN capability. But it's not used in ES/OS because the performance of JOINs is absolutely abysmal in distributed databases.
I'm arguing that sometimes you don't need ACID, or rather, that sometimes ACID is too painful and you accept going without it. But no one ever really doesn't want a QL; they only think they don't want one until they learn better.
I.e., NoACID does not imply NoQueryLanguage, and you can always have a QL, so you should always get a QL, and you should always use a QL.
> Unless they're building for single-host scale, you're not going to get JOINs for free.
If by 'free' you mean not having to code them, then that's wrong. You can always have or implement a QL.
If by 'free' you mean 'performant', then yes, you might have to denormalize your data so that JOINs vanish, though at the cost of write amplification. But so what, that's true whether you use a QL or not -- it's true in SQL RDBMSes too.
Our customers typically deploy ParadeDB in a primary-replicas topology, with one primary Postgres node and 2 or more read replicas, depending on read volume. Queries are executed on a single node today, yes.
We have plans to eventually support distributed queries.
Obligatory whine that the term NoSQL got co-opted to mean "non-relational". There's tons of space for a better query language for querying relational databases.
It's funny; as someone who is exactly pg_search's market, I actually often want the opposite: ACID, MVCC transactions, automatic table and index management... but no query language.
At the data scale + level of complexity our OLAP queries operate at, we very often run into situations where Postgres's very best plan [with a well-considered schema, with great indexes and statistics, and after tons of tuning and coaxing], still does something literally interminable — not for any semantic reason to do with the query plan, but rather due to how Postgres's architecture executes the query plan[1].
I thought the last such job would be simple enough to run in a few hours... I let it run for six days[2], then gave up and killed it. Whereas, when we encoded the same "query plan" as a series of bulk-primitive ETL steps by:
1. dumping the raw source data from PG to CSV with a `COPY`,
2. whipping out simple POSIX CLI tools like sort/uniq/grep/awk (plus a few hand-rolled streaming aggregation scripts) to transform/reduce/normalize the source data into the shape we want it in,
3. and then loading the resulting CSVs back into PG with another `COPY`,
...then the runtime of the whole operation was reduced to just a few hours, with the individual steps completing in ~30 minutes each. (And that's despite the overhead of parsing and/or emitting non-string fields from/to CSV with almost every intermediate step!)
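For flavor, one of the "hand-rolled streaming aggregation scripts" from step 2 might look something like this minimal sketch. It assumes a three-column CSV of `a,b,value` (matching the query in footnote [2]) that has already been sorted on `(a,b)` by sort(1); the function name and CSV layout are illustrative, not anything from the actual pipeline:

```python
import csv

def streaming_group_sum(lines):
    """Sum `value` per (a, b) group, assuming input is pre-sorted on (a, b).

    Because the input is sorted, each group is contiguous, so only one
    group's running total is ever held in memory -- the streaming
    equivalent of a GroupAggregate over sorted input.
    """
    current_key, total = None, 0
    for a, b, value in csv.reader(lines):
        key = (a, b)
        if key != current_key:
            if current_key is not None:
                yield (*current_key, total)
            current_key, total = key, 0
        total += int(value)
    if current_key is not None:
        yield (*current_key, total)
```

Fed by something like `sort -m -t, -k1,1n -k2,2n source1.csv source2.csv`, the merge pass of sort(1) plays the role of the Merge Append node and this script plays the GroupAggregate.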
Honestly, if Postgres would just let us program it the way one programs e.g. Redis through Lua, or ETS tables in Erlang — where the tables and indices are ADTs with low-level public APIs, and you set up your own "query plan" as a set of streaming-channel actors making calls to these APIs — then we would be a lot happier. But even in PL/pgSQL (which we do use, here and there), the only APIs are high-level ones.
• Sure, you can get a cursor on a query; but you can't e.g. get an LMDB-like B-tree cursor on a target B-tree index, and ask it to jump [i.e. re-nav down from root] or walk [i.e. nav up from current pos to nearest common ancestor then back down] to "the first row-tuple greater-than-or-equal to [key]".
• You can't write your own efficient implementation of TABLESAMPLE semantics to set up your own Bigtable-esque balanced cluster-order-partitioned parallel seq scan.
• You can't collect pointers to row-tuples, partially materialize them, filter them by some criterion on the read (but perhaps not parsed!) columns, and then more-fully materialize those same row-tuples "directly" from the references to them you still hold.
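To make the first of those asks concrete, here is the semantics of the "jump" cursor, sketched over a plain sorted Python list. The class and method names are invented for illustration; nothing here is a real Postgres or LMDB API:

```python
import bisect

class OrderedIndexCursor:
    """Toy stand-in for an LMDB-style B-tree cursor over a sorted key list.

    `seek_ge` models the "jump": position at the first key
    greater-than-or-equal to the probe key, in O(log n).
    """
    def __init__(self, sorted_keys):
        self.keys = sorted_keys
        self.pos = 0

    def seek_ge(self, key):
        # binary search for the leftmost key >= the probe key
        self.pos = bisect.bisect_left(self.keys, key)
        return self.keys[self.pos] if self.pos < len(self.keys) else None

    def next(self):
        # walk forward one entry in key order
        self.pos += 1
        return self.keys[self.pos] if self.pos < len(self.keys) else None
```

On a real B-tree, `seek_ge` is the re-nav down from the root, and `next` is the leaf-level walk.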
---
[1] One example of what I mean by "execution": did you know that Postgres doesn't use any form of concurrency for query plans — not even the most basic libuv-like "This Merge Append node's child-node A is in a blocking-wait on IO; that blocking-wait should yield, so that the Merge Append node's child-node B can instead send row-tuple batches for a while" kind of concurrency?
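What that missing concurrency could look like, sketched with Python's asyncio (a toy model, not Postgres code; each `child` stands in for a Merge Append child node, and per-child prefetch tasks let one child's IO wait overlap another child's work):

```python
import asyncio

async def child(rows, io_delay):
    # Stands in for a Merge Append child node: every row costs an IO wait.
    # The await models a blocking wait that *yields* instead of stalling
    # the whole executor.
    for row in rows:
        await asyncio.sleep(io_delay)
        yield row

async def merge_append(*children):
    """K-way merge over sorted child streams, with per-child prefetch tasks.

    Each child is pumped into its own bounded queue by a separate task, so
    while child A sits in an IO wait, child B's task keeps producing rows.
    """
    queues = [asyncio.Queue(maxsize=4) for _ in children]

    async def pump(agen, q):
        async for row in agen:
            await q.put(row)
        await q.put(None)  # sentinel: this child is exhausted

    tasks = [asyncio.create_task(pump(c, q)) for c, q in zip(children, queues)]
    heads = [await q.get() for q in queues]
    merged = []
    while any(h is not None for h in heads):
        # emit the smallest current head, then refill from that child's queue
        i = min((j for j, h in enumerate(heads) if h is not None),
                key=lambda j: heads[j])
        merged.append(heads[i])
        heads[i] = await queues[i].get()
    await asyncio.gather(*tasks)
    return merged
```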
---
[2] If you're wondering, the query that ran for six days was literally just this (anonymized):
SELECT a, b, SUM(value) AS total_value
FROM (
  SELECT a, b, value FROM source1
  UNION ALL
  SELECT a, b, value FROM source2
) AS u
GROUP BY a, b;
`source1` and `source2` are ~150GB tables. (Or at least, they're 150GB when dumped to CSV.) Two integer keys (a,b), and a bigint value. With a b-tree index on `(a,b) INCLUDE (value)`, with correct statistics.
And its EXPLAIN query plan looked like this (with `SET enable_hashagg = OFF;`) — nominally pretty good:
GroupAggregate (cost=1.17..709462419.92 rows=40000 width=40)
  Group Key: a, b
  -> Merge Append (cost=1.17..659276497.84 rows=6691282944 width=16)
       Sort Key: a, b
       -> Index Only Scan using source1_a_b_idx on source1 (cost=0.58..162356175.31 rows=3345641472 width=16)
       -> Index Only Scan using source2_a_b_idx on source2 (cost=0.58..162356175.31 rows=3345641472 width=16)
Each one of the operations here is "obvious." It's what you'd think you'd want! You'd think this would finish quickly. And yet.
(And no, the machine it ran on was not resource-bottlenecked. It had 1TB of RAM with no contention from other jobs, and this PG session was allowed to use much of it as work memory. But even if it was spilling to disk at every step... that should have been fine. The CSV equivalent of this inherently "spills to disk", for everything except the nursery levels of sort(1)'s merge-sort. And it does fine.)
> At the data scale + level of complexity our OLAP queries operate at, we very often run into situations where Postgres's very best plan [with a well-considered schema, with great indexes and statistics, and after tons of tuning and coaxing], still does something literally interminable — not for any semantic reason to do with the query plan, but rather due to how Postgres's architecture executes the query plan[1].
Well, ok, this is a problem, and I have run into it myself. But that's not a reason for not wanting a QL; it's a reason for wanting a way to improve the query planning. Query hints in the QL are a bad idea for several reasons. What I would like instead is out-of-band query hints that I can provide along with my query (though obviously only when using APIs rather than `psql`; for `psql` one would have to provide the hints via some \hints command). Each table source would be addressed using names/aliases for the table source or join, and names for subqueries, so really by a path through the query and its subqueries, like `.<sub_query_alias0>.<sub_query_alias1>.<..>.<sub_query_aliasN>.<table_source_alias>`, and the hint would indicate things like which sub-query plan type to use and which index to use.
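A hypothetical shape for such out-of-band hints, using the thread's six-day query as the example (everything below is invented for illustration; no such mechanism exists in Postgres): the hints travel alongside the SQL as a separate payload, keyed by alias path:

```python
# Hypothetical out-of-band hints payload. Keys are alias paths
# (".<subquery_alias>.<table_alias>"); values are per-node planner
# directives. Entirely illustrative, not a real API.
hints = {
    ".u.source1": {"scan": "index_only", "index": "source1_a_b_idx"},
    ".u.source2": {"scan": "index_only", "index": "source2_a_b_idx"},
    ".u": {"append": "merge"},        # prefer Merge Append over plain Append
    ".": {"agg": "group_aggregate"},  # prefer GroupAggregate over HashAgg
}

def hint_for(alias_path):
    """Directive the planner would consult for a given plan node, if any."""
    return hints.get(alias_path, {})
```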
I mean, in my case, I don't think what I want could be implemented via query hints. The types of things I would want to communicate to the server are pragmas entirely divorced from the semantics of SQL: pragmas that only make sense if you can force the query plan to take a specific shape to begin with, because you're trying to tune specific knobs on specific plan nodes. If those plan nodes aren't part of the final plan, then your tuning is meaningless.
And if you're pinning the query plan to a specific shape, then there's really no point in sending SQL + hints; you may as well just expose a lower-level "query-execution-engine abstract-machine bytecode" that the user can submit, to be translated in a very low-level — but contractual! — way into a query plan. Or, one step further, into the thing a query plan does, skipping the plan-node-graph abstraction entirely in favor of arbitrarily calling the same primitives the plan nodes call [in a sandboxed way, because such bytecode should be low-level enough that it can encode invalid operation sequences that will crash the PG connection backend — and this is fine, the user signed up for that; they just want to be assured that such a crash won't affect data integrity outside the current transaction.]
Such a bytecode wouldn't have to be used as the literal compiled internal representation of SQL within the server, mind you. (It'd be ideal if it was, but it doesn't need to be.) Just like e.g. the published and versioned JVM bytecode spec isn't 1:1 with the bytecode ISA the JVM actually uses as its in-memory representation for interpretation — there's module-load-time translation/compilation from the stable public format, to the current internal format.
But your mental model of your query is still in a language, even if it's only natural language. Why wouldn't you write a QL and compiler for it that outputs a query plan AST/bytecode/whatever to your liking? The PG SQL compiler and query planner just aren't to your liking, but do you really want to be writing queries by hand? I guess what you're saying is that you want something like LINQ, which lets you build complex plans/ASTs without the complexity of NoSQL hand-coded queries.
Oh, and BTW, PG is getting async I/O in the next release. It's not the same as concurrency, but if your workloads are I/O-bound (and likely they are) then it's as good as concurrency.