Hacker News
TC39: Add Object.groupBy and Map.groupBy (github.com/tc39)
80 points by moritzwarhier on Dec 19, 2023 | 61 comments


In case you need an example, there is a good one here: https://github.com/tc39/proposal-array-grouping/


What's the thinking behind attaching Object.groupBy to the Object constructor? With the other Object.X() functions I can easily see why it makes sense they exist on the Object constructor, because they generally follow the pattern of taking an object as an argument and, say, freezing that object, returning that object's [[Prototype]], returning that object's keys, checking whether that object is sealed, etc. By contrast, Object.groupBy is a utility function that takes an iterable and a callback, and doesn't seem to have anything distinctively to do with "Object" per se.

(I guess one exception is Object.is, which takes two values as arguments. But even Object.is makes a lot of sense existing on the Object constructor because it exposes an important abstract operation (SameValue). Object.groupBy is only a utility function)


We originally tried to put it on Array.prototype, under multiple different names, and that broke various pages in various ways. So we gave up and had to put it somewhere else. And it's conceptually similar to `fromEntries` - they're both ways of making an object. So the Object constructor was the obvious choice.


This is why we need a global Iterator / Iterable type. These are iterators at the end of the day after all. That would also signal the fact they could be used for more than one data structure (Array, Map, POJOs)


Global iterator type is coming: https://github.com/tc39/proposal-iterator-helpers

But a method named `groupBy` on iterators traditionally means a different thing: https://github.com/tc39/proposal-array-grouping/issues/51#is...

It's too late for a global iterable type, since there are many extant iterables in the language and on the web which don't have it in their prototype chain and can't reasonably be changed.


Stop it please! Just add the pipeline proposal and we can use a library or define our own function.


it's a noble approach, but feels like it's at its limits if no one could find a reasonable name that works with Array. there's gotta be a point at which it makes sense to EOL third-party stuff like that: set a date far in the future, and any pages where document.lastModified is greater get Array.groupBy as well


Not gonna happen: https://github.com/tc39/faq?tab=readme-ov-file#why-dont-we-j...

Browsers aren't going to break old pages.


Well there is a standard way to support multiple versions of a JS library: imports.

I wonder if there is any proposal for a standard library with a well-known URI.

A browser with ESM support should be able to provide its internal version or fall back to a polyfill using import maps. Something like: `import { groupBy } from "std:arrays"`

That could also be useful for standardizing libs between Node, Deno, and Bun without polluting the global namespace… and it's backwards compatible.


Browsers have broken old pages on a massive scale on several occasions. Numerous useful pages were effectively lost when they decided to shut down widely used plugins like Flash and Java applets that were mainstays of the interactive web for years. Google/Chrome in particular are notorious for pushing features before they're properly standardised, allowing developers to start relying on them, and then pulling them because they are no longer convenient.

A better answer here might be to give the language a version declaration feature so browsers could reliably provide a backward-compatible interpretation of old scripts if necessary. If breaking changes are only made when there is great value in the change and no reasonable alternative then they shouldn't happen often enough for this kind of versioning to become an unmanageable burden.


There's a big difference between breaking things because they're relying on impossible-to-secure plugins and breaking things because JavaScript developers want to invoke an API in a slightly different way.

Regarding versioning, see the immediately preceding question in the FAQ.


I read both FAQ answers. I just - with greatest respect - don't find them convincing.

The arguments about completely removing the plugins were always questionable. Yes they had security problems. Now the attack surface they represented has largely been moved to the browsers themselves, which also have security problems, often of a similar nature and for similar reasons. Plugins had stopped being a vehicle for drive-by downloads and the like thanks to better browser safeguards like click-to-play long before they were decisively removed.

Some of the most popular programming languages in the world behave significantly differently between major versions and have developed effective solutions to manage those differences. Sometimes those are as simple as a command line compiler flag or config file setting to specify a target language version. And of course the libraries used with many languages also frequently have to deal with complex dependency resolutions these days. I do not understand why JavaScript is special and could not adopt a similar model if the will was there.


Browsers are much, much better at addressing security issues than Flash or Java applets ever were. There's no comparison. And even if you think that they ought not to have been removed, surely it's clear that the concerns about security for users are a fundamentally different thing than JS developers wanting a different API shape.

As to versioning the language: languages like Rust and Go use editions, which allow them to make breaking changes to syntax but not to the standard library, which is what's being discussed here. Indeed Rust has several deprecated-but-unremovable things in their standard library. Python makes breaking changes to their standard library, and then people's code breaks. C++ requires you to specify the version for the entire program, which isn't viable for JS because pages mix scripts from dozens of authors, which all need to interact and to have a coherent view of the world. Not sure what other language you're thinking of.

The relatively unique object model in JS, and the fact that the standard library consists of ambient properties, also makes it special; it's harder in JS than it would be in a language like Rust to detect use of a particular feature of the standard library.


yes, what my comment proposes is a solution that avoids breaking the web, by referring to document.lastModified. in this scenario there's a transition period where only Object.groupBy is available, then after a few years Array.groupBy is also available for any pages modified after the cutoff date. static pages are spared, and servers + library devs have plenty of time to react

there is definitely an approach out there that breaks fewer pages than things like the flash EOL broke


A lot of shipped code is effectively unmaintained even on pages which are otherwise being modified, in many cases by relatively non-technical people (a small business owner contracts a webdev to build a page and show them how to update the content, for example).

And in a lot of cases it's not obvious that your code is going to be incompatible. For example the most recent case there was code which was like `function foo(x) { if (x.group) x = x.group; /* ... */ }` where that function was being called with `foo({ group: [...] })` or `foo([...])`. That code breaks if `Array.prototype.group` is added. No one is going to figure that out themselves unless they test on a version of a browser which ships `Array.prototype.group`, and empirically no one does that.

Yes, we could find a way to break fewer pages than flash EOL broke. But even just a handful of pages people actually use is too much. No browser wants to ship an update which will break (to pick a relevant example) the website of the government of Brazil, even if they somehow "ought to have" updated their code before that update. Developers are empirically not going to do that update and browsers are not going to punish the users of the pages of those developers by breaking them.
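The `x.group` breakage described above can be reproduced in a few lines. This is only a sketch: `foo` follows the shape quoted in the comment, but its body and the sample data are hypothetical, and the `Array.prototype.group` defined here is a stand-in stub, not the proposal's actual method.

```javascript
// Legacy-style code, following the shape quoted above: accepts either
// `{ group: [...] }` or a bare array.
function foo(x) {
  if (x.group) x = x.group; // "unwrap" the wrapper object if present
  return x.length;          // hypothetical body: just count the items
}

const before = [foo({ group: [1, 2, 3] }), foo([1, 2, 3])];

// Simulate an engine shipping a method called `group` on every array.
// This stub is a stand-in for illustration only.
Object.defineProperty(Array.prototype, "group", {
  value: function group(callback) { return {}; },
  configurable: true,
  writable: true,
});

// The bare-array call now "unwraps" the method instead of keeping the array,
// so `x.length` is the stub's parameter count (1), not the array length.
const after = [foo({ group: [1, 2, 3] }), foo([1, 2, 3])];

// Clean up so the rest of the program is unaffected.
delete Array.prototype.group;

console.log(before, after); // [ 3, 3 ] [ 3, 1 ]
```

Nothing throws here; the old code just silently computes something else, which is part of why such breakage is hard to notice in advance.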


Map.groupBy() returns a Map with the keys mapped to array values. In parallel, Object.groupBy() returns an Object with the keys cast into property names. Thus, I'd say it fits the pattern of Object-related static methods, since it's using the Object mechanism as a map.
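A small sketch of the difference in return types. The sample data is made up, and because runtime support varies, the snippet falls back to simplified hand-written stand-ins (`objectGroupBy` and `mapGroupBy` are hypothetical names) when the built-ins are missing.

```javascript
// Use the built-ins where available; otherwise fall back to simplified
// stand-ins that mimic the behaviour closely enough for this example.
const objectGroupBy = Object.groupBy ?? ((items, fn) => {
  const out = Object.create(null);
  let i = 0;
  for (const item of items) {
    const key = fn(item, i++);
    if (!(key in out)) out[key] = [];
    out[key].push(item);
  }
  return out;
});

const mapGroupBy = Map.groupBy ?? ((items, fn) => {
  const out = new Map();
  let i = 0;
  for (const item of items) {
    const key = fn(item, i++);
    if (!out.has(key)) out.set(key, []);
    out.get(key).push(item);
  }
  return out;
});

const people = [
  { name: "Ann", age: 31 },
  { name: "Bob", age: 17 },
  { name: "Cid", age: 45 },
];

// Object.groupBy: keys are coerced to property keys (strings here).
const byAdult = objectGroupBy(people, (p) => p.age >= 18);
console.log(Object.keys(byAdult)); // [ 'true', 'false' ]

// Map.groupBy: keys keep their original type (booleans here).
const byAdultMap = mapGroupBy(people, (p) => p.age >= 18);
console.log([...byAdultMap.keys()]); // [ true, false ]
```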


The part I’m confused on is the documentation saying the returned object from Object.groupBy doesn’t extend from Object.prototype. Is it still an object?

    The return value of groupBy is an object that does not inherit from %Object.prototype%.


It's called a null-prototype object: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

They're "usually used as a cheap substitute for maps": the lack of prototype prevents the object from being polluted by keys that morally shouldn't be there, like toString, etc.


It's a value that belongs as a member of the Object data type (Object(it)===it), but it's not (in addition) an instance of the Object constructor (!(it instanceof Object)). I guess the former, wider sense of 'an object' is more often intended.
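The two replies above can be checked directly. A minimal sketch, using `Object.create(null)` to build a null-prototype object by hand rather than calling `Object.groupBy` itself (the variable names are made up for illustration):

```javascript
const plain = {};
const bare = Object.create(null); // null-prototype object

// Inherited keys show up on a plain object but not on a null-prototype one.
console.log("toString" in plain); // true
console.log("toString" in bare);  // false

// Both are still objects in the wider sense...
console.log(typeof bare === "object", Object(bare) === bare); // true true

// ...but only the plain one is an instance of the Object constructor.
console.log(plain instanceof Object, bare instanceof Object); // true false
console.log(Object.getPrototypeOf(bare)); // null
```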


What about Object.fromEntries?

See https://github.com/tc39/proposal-array-grouping for why this isn’t a method on Array.prototype.


Constructs an object from kv pairs, definitely associated with the "Object" prototype. groupBy is more dubious.


When it comes to helper functions on existing types, I feel like you really can’t have too many… within reason.

If we added like four new kinds of object/map, that complicates things… but just formalizing more common access and iteration patterns for existing structures seems low risk.

What I can’t wait for are all the set operations that would make ‘Set’ truly powerful.


I hope they stick to these useful helper functions instead of adding more complexity like Type Annotations that provide no type soundness or runtime checks of any kind[0]

[0] https://github.com/tc39/proposal-type-annotations


That’s an interesting take when the stated purpose of that proposal is “to enable developers to run programs written in TypeScript, Flow, and other static typing supersets of JavaScript without any need for transpilation”.

That seems like a fine goal. Allow runtimes to execute JS with type annotations as-is.


The issue for me is four-fold:

1. Adds complexity to the language

2. Adds complexity to engines

3. Adds complexity for developers, especially new developers ("wait is it typed or not")

4. Most importantly, all but guarantees we will never have true types in JavaScript for things that could benefit from it like node or electron where instant compile time isn't necessary.

All this for a feature only helping some developers some of the time so they can run code that will be stripped out in production to reduce size anyway on their local browser a bit more easily.


Anyone know if `keyBy` will also be supported? I suppose it'd be pretty trivial to take a `groupBy` and make it `keyBy`, but then again it's pretty trivial to implement `groupBy` from scratch.

I no longer install lodash that often anymore


I'm a big fan of the design of Elixir's Enum.group_by, which has one-parameter and two-parameter forms. The one-parameter form is like any other language's groupBy — but the two-parameter form takes two mapping closures; passes each element into both of them; and uses the output of the first closure as the grouping key, and the output of the second closure as the value to be registered under that grouping key.

This flexible primitive enables you to do basically any grouping transform you want (incl. the Lodash-style keyBy) in a single short line of code.

It's a lot like a sortBy operation, in that to emulate it, you would have to do a map to extract the key, producing pairs; sort (or in this case group) the pairs; and then deep-transform the pairs inside the data structure by unwrapping the keys off them. In other words, it's something that's a bit too high-friction to reach for if the language doesn't just give it to you (you'd probably do what you're planning to do some other way); but if the language does give it to you, you'll use it quite often.
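A rough JavaScript sketch of the two-closure form described above. `groupByWith` is a hypothetical helper, not part of the proposal or of Elixir, and the sample data is invented; it only illustrates using one function for the key and another for the stored value.

```javascript
// Hypothetical helper: group `items` by `keyFn`, storing `valueFn(item)`
// (rather than the item itself) under each key.
function groupByWith(items, keyFn, valueFn = (x) => x) {
  const groups = new Map();
  for (const item of items) {
    const key = keyFn(item);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(valueFn(item));
  }
  return groups;
}

const orders = [
  { customer: "ann", total: 10 },
  { customer: "bob", total: 25 },
  { customer: "ann", total: 5 },
];

const totalsByCustomer = groupByWith(orders, (o) => o.customer, (o) => o.total);
console.log(totalsByCustomer.get("ann")); // [ 10, 5 ]
console.log(totalsByCustomer.get("bob")); // [ 25 ]
```

With only the key function supplied, the helper behaves like an ordinary groupBy, since the value function defaults to the identity.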


No current plans for `keyBy`, and I don't know that it's really that well-motivated. (I am on TC39.)


You can do that easily by nesting `Array.prototype.map` in the argument to `Object.fromEntries` (probably not as optimized as a built-in, but that might be irrelevant for most cases, since it's just a duplicated iteration, not quadratic).


For reference, keyBy returns an object with the same keys as groupBy, but the value of a key is the last element to produce that key, instead of an array of all of the elements that did.
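For illustration, a keyBy along those lines can be sketched with `Array.prototype.map` and `Object.fromEntries`. The `keyBy` name matches the Lodash-style helper being discussed, but this implementation and the sample data are assumptions, not Lodash's code.

```javascript
// Hypothetical keyBy built from Array.prototype.map + Object.fromEntries.
// Later entries overwrite earlier ones, so the last element per key wins.
function keyBy(items, keyFn) {
  return Object.fromEntries(items.map((item) => [keyFn(item), item]));
}

const users = [
  { id: 1, role: "admin" },
  { id: 2, role: "user" },
  { id: 3, role: "user" },
];

const byRole = keyBy(users, (u) => u.role);
console.log(byRole.admin.id); // 1
console.log(byRole.user.id);  // 3
```

Note that `Object.fromEntries` returns an ordinary object that inherits from `Object.prototype`, unlike the null-prototype result of `Object.groupBy`.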


This seems like a nice addition. Is it something that a lot of tools roll their own, or is something that can’t be done in user space? I’m relatively ignorant of the limits of js


It's fairly easy to roll your own. An example implementation of groupBy in lodash, for example: https://github.com/lodash/lodash/blob/4.17.15/lodash.js#L939....

More broadly, this is an example of a utility function for organizing data in a data structure, which JavaScript and all other major programming languages are well-equipped to implement in arbitrary ways, so this is just a convenience function. It's not like a hook into underlying environment APIs that can't be independently shimmed.


Trying to read lodash can (to put it mildly) be difficult. Lots of code reuse.

This is probably missing checking for a bunch of corner cases, but this is the general idea:

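  // quick & dirty Object.groupBy
  // (Object.entries only covers arrays and plain objects, not arbitrary iterables)
  function groupBy(iterable, callbackFn) {
    const out = {};
    // `of`, not `in`: we want the [key, value] pairs, not the indices
    for (const [key, value] of Object.entries(iterable)) {
      const name = callbackFn(value, key);
      // the semicolon above matters: without it, the next line would be
      // parsed as a call on callbackFn's result
      (out[name] || (out[name] = [])).push(value);
    }
    return out;
  }

I literally wrote something very like this yesterday (after checking & finding it only got added to Node 21, and deciding not to pull in a dependency).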

Edit: had keyBy impl in here, fixed! Thanks commenters!


I think `out` should be typed as a map to an array in this case; each iteration should be appending to the corresponding array instead of overwriting the entire entry.


Oh yes, I agree.

I remember running into the problem of internal dependencies when extracting parts of it into a non-ESM/non-build project some years ago...

Like, the internal consistency of the library probably benefits, and with tree-shaking it works reasonably well but it is still a poster example of DRY vs KISS...

It has its benefits, but its API surface can be annoying, too. When I encounter it, I often have to think for a second and look at the docs, e.g. some cryptic xor or map function with three parameters, one of which being an optional config object.

OTOH, it can be great to have a well-tested implementation at hand for all kinds of higher-order functions, like throttle, debounce etc


Aaannd… now you have a prototype pollution vulnerability :)

Object.create(null), always.
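The problem with a plain `{}` accumulator can be shown with a key that collides with an inherited property. This is a sketch under assumptions: `groupInto` is a hypothetical helper that reuses the grouping loop from the snippet above, and the input is contrived to trigger the collision.

```javascript
// Same grouping loop as above, parameterised by the output object.
function groupInto(out, items, keyFn) {
  for (const item of items) {
    const name = keyFn(item);
    (out[name] || (out[name] = [])).push(item);
  }
  return out;
}

const labels = ["constructor", "apple"];
const identity = (w) => w;

let error = null;
try {
  // With `{}`, out["constructor"] already resolves to the inherited
  // Object function, so `.push` is not a function.
  groupInto({}, labels, identity);
} catch (e) {
  error = e;
}
console.log(error instanceof TypeError); // true

// With a null-prototype object there are no inherited keys to collide with.
const safe = groupInto(Object.create(null), labels, identity);
console.log(Object.keys(safe)); // [ 'constructor', 'apple' ]
```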


That‘s closer to keyBy than groupBy.


It's very funny to point to lodash as an example of simplicity. Here is what happens when you start unraveling the code you linked to:

https://gist.github.com/ricardobeat/040af80971273f5abd71c2bf...

I got tired, but wouldn't be surprised if the whole thing goes beyond 1000 lines. It's basically a black hole. Better hope you really, really trust their test suite, it's impossible to debug and there must only be a handful of people familiar with the whole codebase.


It is indeed something that you often roll your own, write a utility function for, or include libraries like lodash for.

JS standard library is notoriously deficient.

This is a good addition in my view too, especially including Map.

When dealing with DB/API results, one can of course always say that such grouping in the client is an anti-pattern.

But in reality it can be reasonable and useful.

If you write one-off algorithms in JS to transform small amounts of data on the client, you probably have done this.

And even for data-heavy applications, it could prove to be a performance benefit for functions that use Map instances during computation.

Though that's just wishful thinking until the runtime developers choose to optimize these functions, if possible.


> When dealing with DB/API results, one can of course always say that such grouping in the client is an anti-pattern.

It depends on how much data you're working with. This groupBy is useful for filters in data tables.

If you have to download a zillion records and filter them, client-side filtering is a bad idea. If you have a few thousand of them, it could possibly be the faster solution overall.


Exactly.


You can roll your own or use a utility library. A simple zero-dependency library would be something like just-* [1]. Although I now prefer remeda [2] as it seems to have the best TypeScript support, especially the strict variants such as `groupBy.strict`.

[1] https://github.com/angus-c/just#just-group-by

[2] https://remedajs.com/docs#groupBy


It can absolutely be written in JavaScript (it is a Turing complete language after all), and it is a common utility function found in most general utility libraries like Lodash and Ramda.


It can be trivially rolled by hand, but it’s also a pretty common function in other languages. I’d guess most node programmers reach for lodash for this.


> a pretty common function in other languages

Except oddities like C, C++, Perl, PHP, and Go :)


Oh PHP programmers would just rely on MySQL's groupBy, we both know it! :P


This is one of those functions I end up implementing in pretty much every single non-trivial JS project I work on. Glad to see some movement here.


I would like to have .partition as well but I guess I can get by with groupBy.

Array.product would be sweet as well (i.e. like product in python's itertools).
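Neither helper exists in the language today, but both are short to write by hand. A sketch; `partition` and `product` are hypothetical stand-alone functions (not proposed built-ins), and `product` is only loosely modelled on the Cartesian product from Python's itertools.

```javascript
// Hypothetical partition: split items into [matches, rest] in one pass.
function partition(items, predicate) {
  const pass = [];
  const fail = [];
  for (const item of items) {
    (predicate(item) ? pass : fail).push(item);
  }
  return [pass, fail];
}

// Hypothetical product: Cartesian product of any number of arrays,
// built up one input array at a time.
function product(...arrays) {
  return arrays.reduce(
    (acc, arr) => acc.flatMap((prefix) => arr.map((x) => [...prefix, x])),
    [[]]
  );
}

const [evens, odds] = partition([1, 2, 3, 4], (n) => n % 2 === 0);
console.log(evens, odds); // [ 2, 4 ] [ 1, 3 ]

const pairs = product([1, 2], ["a", "b"]);
console.log(pairs); // [ [ 1, 'a' ], [ 1, 'b' ], [ 2, 'a' ], [ 2, 'b' ] ]
```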


I don't think they should do this especially since they don't even have an Array.prototype.peek()


On a side note, I really wish I could learn to tell exactly if a new feature like this is already supported in a project I'm working on.

babel.config.js is just a mystery to me. @babel/preset-env, targets: { node: 'current' }, forceAllTransforms, useBuiltIns, 'corejs', modules, @babel/plugin-proposal-object-rest-spread.

I mean, I generally dump it into ChatGPT and ask for an explainer, but... how can I know for sure I can use `array.reverse()` and it will be correctly handled in older browsers?

And how does the babel version in yarn.lock relate to all this?

What about the .browserslistrc which contains 'defaults' ?

My god.


https://compat-table.github.io/compat-table/es2016plus/

Enable the "show obsolete platforms" checkmark to see the data for older versions of Babel/core-js. But probably you should just upgrade that instead.
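Alongside the compatibility table, a runtime check can confirm what the current environment actually provides. This is a minimal feature-detection sketch (the variable names are made up); it says nothing about what a given Babel or core-js configuration will polyfill at build time.

```javascript
// Runtime feature detection: check that the function exists before relying on it.
const hasObjectGroupBy = typeof Object.groupBy === "function";
const hasMapGroupBy = typeof Map.groupBy === "function";

console.log({ hasObjectGroupBy, hasMapGroupBy });
```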


Holy moly, the static method is pretty horrible. Quite sad.


Nice, the JS standard library is gradually getting functions I take for granted in ClojureScript. :)

I noticed this part in the proposal

  groupBy calls callbackfn once for each element in items, in ascending order, and constructs a new Object of arrays. Each value returned by callbackfn is coerced to a property key

Does JS allow arbitrary objects as keys? I am asking because `group-by` in ClojureScript is quite flexible. e.g. you can find anagrams like so

  (def words ["meat" "mat" "team" "mate" "eat" "tea"])
  
  (group-by set words) ; set is a function that creates sets from collections
  
  ;; => {#{\a \e \m \t} ["meat" "team" "mate"]
  ;;     #{\a \m \t}    ["mat"]
  ;;     #{\a \e \t}    ["eat" "tea"]}

I am wondering how one could translate this using Object.groupBy as specified in the proposal.


JS allows arbitrary values (including objects/references) as keys in Maps, but not in objects, where the key is cast to a string by default (e.g. "[object Object]", "null" or "undefined"; this is not the intended usage, of course).

The symbol primitive is also allowed as an object key, using square bracket access.

And arrays are "exotic objects" that have some special behaviors around their keys (auto-updating length property)
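One way to translate the anagram example under these constraints is to derive a canonical string key per word, since a Set returned from the callback would be stringified to "[object Set]" by Object.groupBy. A sketch: `letterKey` and `groupByKey` are hypothetical names, and the fallback loop only stands in for `Object.groupBy` on runtimes that lack it.

```javascript
const anagramWords = ["meat", "mat", "team", "mate", "eat", "tea"];

// Canonical key: the word's distinct letters, sorted and joined,
// mirroring the set-of-characters key in the ClojureScript version.
const letterKey = (word) => [...new Set(word)].sort().join("");

// Simplified stand-in for Object.groupBy, used only if it is missing.
const groupByKey = (items, keyFn) => {
  const out = Object.create(null);
  for (const item of items) {
    const key = keyFn(item);
    if (!(key in out)) out[key] = [];
    out[key].push(item);
  }
  return out;
};

const anagrams = typeof Object.groupBy === "function"
  ? Object.groupBy(anagramWords, letterKey)
  : groupByKey(anagramWords, letterKey);

console.log(anagrams.aemt); // [ 'meat', 'team', 'mate' ]
console.log(anagrams.amt);  // [ 'mat' ]
console.log(anagrams.aet);  // [ 'eat', 'tea' ]
```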


JS does indeed allow mapping arbitrary objects to keys. However, it is the memory address of the object that is hashed, not the semantic value. So, for example, if you do:

    objects = {}
    items1 = ["a", "b"]
    items2 = ["a", "b"]
    objects[items1] = 1
    objects[items2] = 2

then objects[items1] will return 1, and objects[items2] will return 2, but objects[items1] !== objects[items2].

EDIT: Sorry, this only works for dictionaries that are Maps, not Objects! See the responses. My fuzzing about in the console led me astray; you should start with `objects = new Map()` instead of `objects = {}`.


Actually that only works with Map. For plain objects the key is always the stringified cast of the key.

    > o = { [{}]: 1 }
    { '[object Object]': 1 }
    > k = { toString() { return 'a' } }
    { toString: [Function: toString] }
    > o = { [k]: 1 }
    { a: 1 }


Right ... and it seems it's not even possible to simulate object keys using Proxy traps:

  new Proxy(
    {},
    {
      get(target, property) {
        return typeof property
      },
    },
  )[{}] === 'string'


This is not true. I think you're confusing JS objects with Maps. This will just coerce the key to a string and overwrite one element of the object with the other.

With a map it works like you described:

    objects = new Map()
    items1 = ["a", "b"]
    items2 = ["a", "b"]
    objects.set(items1, 1)
    objects.set(items2, 2)
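Putting the two versions side by side makes the difference visible. A sketch using the same `items1`/`items2` arrays as above; `asObject` and `asMap` are names chosen here for illustration.

```javascript
const items1 = ["a", "b"];
const items2 = ["a", "b"];

// Plain object: both arrays stringify to "a,b", so the second write wins.
const asObject = {};
asObject[items1] = 1;
asObject[items2] = 2;
console.log(Object.keys(asObject), asObject[items1]); // [ 'a,b' ] 2

// Map: keys are compared by identity, so the two arrays stay distinct.
const asMap = new Map();
asMap.set(items1, 1);
asMap.set(items2, 2);
console.log(asMap.size, asMap.get(items1), asMap.get(items2)); // 2 1 2

// A fresh array with the same contents is a different key.
console.log(asMap.get(["a", "b"])); // undefined
```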


Ah, I didn't know Map in JavaScript allowed arbitrary keys whereas Object always serializes to strings. I guess that is the reason for having a Map.groupBy in the proposal.

Thanks for taking the time to explain, everybody


You ClojureScript people seem to be stalking me :-D

I've wanted to learn Clojure for years but haven't found the right reason, until discovering Logseq. Such a cool language!


> Does JS allow arbitrary objects as keys?

Object doesn't, but Map does. It's generally a good idea to use Map anyway for dynamic keys.



