Waiting for SQL:202y: Group by All

elygre · 2025-11-16T20:08:21 1763323701

Let me reference fields as I create them:

  select xxxxx as a
       , a * 2 as b

zX41ZdbW · 2025-11-16T21:49:58 1763329798

This will be great! One of the things ClickHouse has had since 2016.

cyberax · 2025-11-16T23:29:49 1763335789

SQL needs to have `select` as the _last_ part, not the first. LINQ has had this for 2 decades by now: "from table_a as a, table_b as b where ... select a.blah, b.duh".

cryptonector · 2025-11-17T00:19:54 1763338794

This is not relevant to GP's point. This is a separate topic, which... I don't really care, but I know a lot of people want to be able to write SQL as you suggest, and it's not hard to implement, so, sure.

Though, I think it might have to be table sources, then `SELECT`, then `WHERE`, then ... because you might want to refer to output columns in the `WHERE` clause.

snuxoll · 2025-11-17T02:21:01 1763346061

WHERE clauses are pushed down into the query planner before the SELECT list is processed; that’s why HAVING exists.

The logical order, in full, is:

FROM

WHERE/JOIN (you can join using WHERE clauses and do FROM a,b still)

SELECT

HAVING
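A minimal sketch of the WHERE versus HAVING split mentioned above, run through Python's built-in sqlite3 module so it is self-contained. The table `test` and its rows are made up for illustration, and SQLite is only a stand-in here; other engines may differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table test (a_key integer primary key, a_group integer, a_val integer)"
)
conn.executemany(
    "insert into test (a_key, a_group, a_val) values (?, ?, ?)",
    [(1, 1, 5), (2, 1, 3), (3, 2, 1), (4, 2, 6)],
)

# WHERE filters individual rows before any grouping happens.
rows_where = conn.execute(
    "select a_group, sum(a_val) as total from test "
    "where a_val > 2 group by a_group order by a_group"
).fetchall()

# HAVING filters whole groups after the aggregate has been computed.
rows_having = conn.execute(
    "select a_group, sum(a_val) as total from test "
    "group by a_group having sum(a_val) > 7 order by a_group"
).fetchall()

print(rows_where)   # row (3, 2, 1) is dropped before summing
print(rows_having)  # only groups whose sum exceeds 7 survive
```

The first query drops a row before aggregation, so group 2's total changes; the second leaves the totals alone and discards group 2 as a whole.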

1718627440 · 2025-11-17T10:48:55 1763376535

That's the order in which the processing happens, but this doesn't need to be reflected in the language. The language has this ordering so that it reads like natural language, which is what SQL was designed for.

cryptonector · 2025-11-17T21:12:09 1763413929

See u/cyberax's comment below. It would be nice to be able to create scalar (as opposed to table-valued) bindings that can be referred to in a WHERE (or JOIN) clause. Currently it's SELECT that establishes such bindings, and... well, it's not terribly clear where they can be used (certainly in HAVING, but first you have to GROUP BY, no?). u/cyberax's idea is to have a LET for this that can come before WHERE and before SELECT.

snuxoll · 2025-11-20T01:58:16 1763603896

I mean, I get it, but the big problem is, again, the different phases of execution. The projections you perform with a select can be absolutely arbitrary and do crazy ass things (like do more subqueries that return scalar values, and query planners are notoriously bad at pushing these down), which is why I was trying to say SELECT before WHERE (project before filtering) may be linguistically intuitive, but full of foot guns.

Something like a ‘let’ binding after the FROM/JOIN list would make sense, though - from the query planner's perspective it’s nothing more than a token substitution and everything would compile the same.

cyberax · 2025-11-17T01:33:42 1763343222

Ideally, it needs to be "from", then an arbitrary number of something like `let` statements that can introduce new variables, maybe interspersed with where-s, and then finally "select".

"select" can also be replaced with annotations, something like: `from table_1 t1 let t1.column_1 as @output_1 where ...` and then just collect all the @-annotated variables.

I need to write a lot of SQL, and it's so clumsy. Every time I need a CTE, I have to look into the documentation for the exact syntax.

1718627440 · 2025-11-17T10:49:42 1763376582

> Ideally, it needs to be "from", then arbitrary number of something like `let` statements

Isn't that what a CTE is?

cryptonector · 2025-11-17T21:09:55 1763413795

Not quite. u/cyberax wants scalar bindings, not table-valued bindings.

Something like

  FROM foo
  LET a = (x + y) * z
  SELECT a;

whereas CTEs are... Common Table Expressions.
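For comparison, a sketch of how the hypothetical `LET` above tends to be emulated today: the scalar expression is named inside a CTE, and the outer query refers to it by name. The table `foo`, its rows, and the CTE name `bound` are assumptions for illustration; this runs on SQLite via Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table foo (x integer, y integer, z integer)")
conn.executemany("insert into foo values (?, ?, ?)", [(1, 2, 3), (4, 5, 6)])

# There is no scalar LET in standard SQL. Naming the expression in a CTE
# lets both the outer select list and the outer WHERE refer to it as "a".
rows = conn.execute(
    "with bound as (select (x + y) * z as a from foo) "
    "select a from bound where a > 10 order by a"
).fetchall()

print(rows)
```

The cost is the extra table-valued wrapper around what is really a single scalar expression, which is the clumsiness a `LET` would remove.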

tracker1 · 2025-11-17T16:26:54 1763396814

That was kind of my first thought...

viraptor · 2025-11-17T18:15:35 1763403335

https://prql-lang.org/ and compile to SQL.

cyberax · 2025-11-18T07:21:32 1763450492

Thank you! This is indeed close to what I want from SQL!

agnosticmantis · 2025-11-17T04:30:11 1763353811

The Pipe Query Syntax in GoogleSQL implements this elegantly as well:

https://docs.cloud.google.com/bigquery/docs/reference/standa...

jiggawatts · 2025-11-17T09:56:12 1763373372

Also in the Kusto Query Language (KQL) as used by Azure Log Analytics.

Exuma · 2025-11-16T19:27:06 1763321226

Also just let me reference the damn alias in a group by, FUCK

sbuttgereit · 2025-11-16T20:05:54 1763323554

At least in PostgreSQL, both by alias and ordinal are possible:

  localhost(from SCB-MUSE-BOXX).postgres.scb.5432 [Sun Nov 16 12:02:15 PST 2025]
  > create table test (a_key integer primary key, a_group integer, a_val numeric);
  CREATE TABLE
  Time: 3.102 ms

  localhost(from SCB-MUSE-BOXX).postgres.scb.5432 [Sun Nov 16 12:02:25 PST 2025]
  > insert into test (a_key, a_group, a_val) values (1, 1, 5.5), (2, 1, 2.6), (3, 2, 1.1), (4, 2, 6.5);
  INSERT 0 4
  Time: 2.302 ms

  localhost(from SCB-MUSE-BOXX).postgres.scb.5432 [Sun Nov 16 12:02:58 PST 2025]
  > select a_group AS my_group, sum(a_val) from test group by my_group;
   my_group | sum
  ----------+-----
          2 | 7.6
          1 | 8.1
  (2 rows)
  
  Time: 4.124 ms
  localhost(from SCB-MUSE-BOXX).postgres.scb.5432 [Sun Nov 16 12:03:15 PST 2025]
  > select a_group AS my_group, sum(a_val) from test group by 1;
   my_group | sum
  ----------+-----
          2 | 7.6
          1 | 8.1
  (2 rows)
  
  Time: 0.360 ms

mberning · 2025-11-16T19:30:57 1763321457

Some do. It would also be nice to reference by ordinal number similar to order by. Very handy for quick and dirty queries. I can see the issue though that people start to lean on it too much.

petereisentraut · 2025-11-17T11:32:19 1763379139

The problem with this and similar requests is that it would change the identifier scoping in incompatible ways and therefore potentially break a lot of existing SQL code.

zX41ZdbW · 2025-11-16T21:50:55 1763329855

I think it should be not only in GROUP BY, but in every context, e.g., inside expressions in SELECT, WHERE, etc.

kermatt · 2025-11-16T21:31:35 1763328695

PostgreSQL and DuckDB support this, which makes MSSQL feel like a dinosaur in context.

theodpHN · 2025-11-17T00:14:46 1763338486

So, why not a SORT BY ALL or a GROUPSORT BY ALL, too? Not always what you want (e.g., when you're ranking on a summarized column), but often alphabetic order on the GROUP BY columns is just what the doctor ordered! :-)

petereisentraut · 2025-11-17T11:29:10 1763378950

The working group also discussed ORDER BY ALL, but for some reason most participants really did not like it.

oulipo2 · 2025-11-16T22:42:37 1763332957

Not directly related, but I recently saw this project, a data language by Google, which is quite cool: https://www.malloydata.dev/

cm2187 · 2025-11-16T21:42:44 1763329364

Snowflake has that. Once you start using it, it's painful to go back.

sixtram · 2025-11-17T05:45:26 1763358326

What about reusing a CTE? Let me import a CTE definition so that it can be used throughout my app, not just in the current context.

jklowden · 2025-11-17T14:23:10 1763389390

I believe that’s what we call a "view".
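A small sketch of that idea, assuming a made-up table `test` and a hypothetical view name `group_totals`: the query that would otherwise be repeated as a CTE is stored once as a view and then used like a table. It runs on SQLite via Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table test (a_key integer primary key, a_group integer, a_val integer)"
)
conn.executemany(
    "insert into test values (?, ?, ?)",
    [(1, 1, 5), (2, 1, 3), (3, 2, 1), (4, 2, 6)],
)

# Defined once and stored in the schema, instead of being repeated
# as a WITH clause at the top of every query.
conn.execute(
    "create view group_totals as "
    "select a_group, sum(a_val) as total from test group by a_group"
)

# Any later query can treat the view like a table.
totals = conn.execute(
    "select a_group, total from group_totals order by a_group"
).fetchall()
big = conn.execute(
    "select a_group from group_totals where total > 7"
).fetchall()

print(totals)
print(big)
```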

chewxy · 2025-11-16T22:57:41 1763333861

BigQuery has that and I've been loving using it since they introduced it

elchief · 2025-11-16T22:13:47 1763331227

DuckDB has it

https://duckdb.org/docs/stable/sql/query_syntax/groupby

parpfish · 2025-11-17T01:54:44 1763344484

this seems to ignore the fact that you can group by a column that isn't in the select statement.

it's not something that i've found a particular use for, but it IS a thing you can do.
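A sketch of the case described here, with a made-up table `test`: the grouping column `a_group` never appears in the select list. A grouping list derived from the select list, which is how I understand GROUP BY ALL to work in the engines that have it, could not express this query. Run on SQLite via Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table test (a_key integer primary key, a_group integer, a_val integer)"
)
conn.executemany(
    "insert into test values (?, ?, ?)",
    [(1, 1, 5), (2, 1, 3), (3, 2, 1), (4, 2, 6)],
)

# One output row per a_group, but a_group itself is not selected.
rows = conn.execute(
    "select sum(a_val) as total from test group by a_group order by total"
).fetchall()

print(rows)
```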

Inviz · 2025-11-16T23:57:44 1763337464

What's wrong with GROUP BY 1,2,3?

SigmundA · 2025-11-16T23:34:40 1763336080

SELECT * EXCEPT(col_name) next please.

petereisentraut · 2025-11-17T11:26:58 1763378818

This was also discussed at the last SQL WG meeting but was postponed for further refinement. But it’s likely to be added soon.

azurezyq · 2025-11-17T01:51:51 1763344311

BigQuery has it! https://docs.cloud.google.com/bigquery/docs/reference/standa...

SigmundA · 2025-11-17T03:10:33 1763349033

Yes, it needs to be in the standard, though.

1718627440 · 2025-11-17T10:46:56 1763376416

That might be nice for manual experimentation, but for application use, this seems brittle compared to specifying the columns you really want to have and process.

dorianmariecom · 2025-11-16T20:21:45 1763324505

would be nice

wvbdmp · 2025-11-17T01:16:17 1763342177

What? No! I want GROUP BY * and more importantly GROUP BY mytable.*