
At that point, it seems easier to run a slightly worse model locally (or on a rented server).


Which is Apple's own approach, until the compute requirements force them to run some of the work in the cloud.


Just a shame they spent so long skimping on iPhone memory. The tail end of support for 4GB and 6GB handsets is going to push that compute barrier pretty low.
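
A rough back-of-envelope (the usable-RAM budgets and model sizes here are my assumptions, not Apple figures):

    # Does a quantized model fit on a 4GB/6GB phone?
    def model_footprint_gb(params_billions, bits_per_weight, overhead_gb=0.5):
        # weights, plus a rough allowance for KV cache and activations
        return params_billions * bits_per_weight / 8 + overhead_gb

    for ram_gb, usable_gb in [(4, 2.0), (6, 3.5)]:  # OS and apps claim the rest
        for params in (1.0, 3.0, 7.0):
            fits = model_footprint_gb(params, bits_per_weight=4) <= usable_gb
            print(f"{ram_gb}GB phone, {params:.0f}B params @ 4-bit:",
                  "fits" if fits else "too big")

Under those assumptions a 4-bit 3B model barely squeaks onto a 4GB handset, and a 7B model doesn't fit even on 6GB.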


Eh, maybe a bit, but devices of that era also have much lower memory bandwidth. I suspect the practical demands of client-side models will rule those devices out for reasons other than memory capacity.


> much lower memory bandwidth

Not really? The A11 Bionic chip that shipped with the iPhone X has 3GB of ~30GB/s memory. That's plenty fast for small LLMs if they'll fit in memory; it's a bit under half the M1's memory bandwidth, and it only gets faster on the LPDDR5 handsets.
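
Decode speed is memory-bound, so you can sketch it as bandwidth divided by bytes streamed per token. The bandwidth figures below are the ones discussed above; the model size and quantization are illustrative assumptions:

    # tokens/sec ~= memory bandwidth / bytes read per generated token
    # (each decoded token streams the full weight set once)
    def tokens_per_sec(bandwidth_gb_s, params_billions, bits_per_weight):
        bytes_per_token_gb = params_billions * bits_per_weight / 8
        return bandwidth_gb_s / bytes_per_token_gb

    for chip, bw in [("A11 (~30GB/s)", 30), ("M1 (~68GB/s)", 68)]:
        print(chip, f"- 3B @ 4-bit: ~{tokens_per_sec(bw, 3.0, 4):.0f} tok/s")

That's roughly 20 tok/s on A11-class bandwidth for a 4-bit 3B model, so the 3GB of RAM, not the bus, is the bottleneck.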

A big part of Apple's chip design philosophy was investing in memory-controller hardware to better serve the iOS runtime. They just didn't foresee any technology besides GC that could meaningfully inflate memory consumption.



