While SQL is great, complex analytical queries are much easier to write, read and modify when they are written in pipe-based syntax because of the reasons outlined in the announcement:
- Pipe-based query is executed in clear sequential stages according to the pipes written in the query. SQL execution stages aren't clear enough.
- SQL requires mentioning GROUP BY fields twice - at SELECT and GROUP BY statements. This complicates modifying the query, since you need to modify the set of GROUP BY fields at SELECT too.
- There are no limits on the combination and the number of pipes, which can be used in a single query. There are various limits in SQL. For example, you cannot use WHERE after GROUP BY (there is HAVING workaround, but it looks unnatural). Another example - you cannot easily use GROUP BY multiple times with arbitrary additional calculations and transformations in the middle (of course, you can use subqueries as a workaround for these limitations, but this will complicate the resulting query too much).
It's insufficient to say that something is objectively better. Often times user preferences are multi-modal, and people cluster around groups. Pipe-based database query syntax may be objectively "easier" by some mathematical deduction, but by the laws of statistics > 50% of the userbase may prefer a different syntax.
- Pipe-based query is executed in clear sequential stages according to the pipes written in the query. SQL execution stages aren't clear enough.
- SQL requires mentioning GROUP BY fields twice - at SELECT and GROUP BY statements. This complicates modifying the query, since you need to modify the set of GROUP BY fields at SELECT too.
- There are no limits on the combination and the number of pipes, which can be used in a single query. There are various limits in SQL. For example, you cannot use WHERE after GROUP BY (there is HAVING workaround, but it looks unnatural). Another example - you cannot easily use GROUP BY multiple times with arbitrary additional calculations and transformations in the middle (of course, you can use subqueries as a workaround for these limitations, but this will complicate the resulting query too much).
I recommend reading the following short migration guide from SQL to pipe-based LogsQL (query language optimized for typical queries over logs), in order to feel the difference - https://docs.victoriametrics.com/victorialogs/sql-to-logsql/