My guess is that autoregressive models can use key-value (KV) caching to eliminate most of the FLOPs inside the self-attention block. You can't use KV caching inside a diffusion model (because it isn't causal), but they sell this as a win anyway because they believe it leads to better reasoning.
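For intuition, here's a minimal sketch of what the cache buys (my own toy illustration with hand-rolled float vectors, single-head, no real library's API): each decode step appends one K/V row and attends only the new query against the cache, so per-step attention cost is linear in context length rather than quadratic.

```cpp
// Single-head attention with a KV cache (illustrative sketch only).
// Each decode step appends one key/value row and attends the new query
// against the cache: O(n*d) work per step, instead of the O(n^2*d) it
// would take to recompute full self-attention from scratch.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

using Vec = std::vector<float>;

struct KVCache {
    std::vector<Vec> keys;
    std::vector<Vec> values;
};

// One decode step: cache k/v for the new token, attend q against the cache.
Vec decode_step(KVCache& cache, const Vec& q, const Vec& k, const Vec& v) {
    cache.keys.push_back(k);    // append instead of recomputing all K
    cache.values.push_back(v);  // ditto for V: this is the FLOP saving

    const std::size_t n = cache.keys.size(), d = q.size();
    std::vector<float> scores(n);
    float max_s = -1e30f;
    for (std::size_t i = 0; i < n; ++i) {       // scores_i = q . k_i / sqrt(d)
        float s = 0.0f;
        for (std::size_t j = 0; j < d; ++j) s += q[j] * cache.keys[i][j];
        scores[i] = s / std::sqrt(static_cast<float>(d));
        max_s = std::max(max_s, scores[i]);
    }
    float z = 0.0f;                             // numerically stable softmax
    for (float& s : scores) { s = std::exp(s - max_s); z += s; }

    Vec out(d, 0.0f);                           // out = sum_i softmax_i * v_i
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < d; ++j)
            out[j] += (scores[i] / z) * cache.values[i][j];
    return out;
}

int main() {
    KVCache cache;
    // Three toy decode steps with d = 2; real q/k/v come from projections.
    for (int t = 0; t < 3; ++t) {
        Vec q{0.1f * t, 1.0f}, k{1.0f, 0.1f * t}, v{static_cast<float>(t), 1.0f};
        Vec out = decode_step(cache, q, k, v);
        std::printf("step %d: out = (%.3f, %.3f)\n", t, out[0], out[1]);
    }
}
```

A model that denoises all positions in parallel has to recompute attention over every position at every step, so there is no stable prefix to cache, which is the trade-off being described above.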
I've thought for a long time that offering 200+ miles of range as the baseline for electric vehicles is overkill. I have a short commute to work, maybe 2 miles to the grocery store, and no other regular driving; for anything longer I take a plane or train.
The common arguments I hear are
1. What if I need to take a roadtrip?
2. What if I don't have access to a charger at home or work and need longer range to account for that?
Only (2) seems reasonable to me, but many people do have access where they live. Since batteries are the huge expense in an EV, I'd love the option of something with a much, much smaller battery (and the further reduced feature set the article mentions).
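To put rough numbers on the battery point (all figures are my assumptions for illustration: ~$130/kWh at the pack level, ~4 mi/kWh efficiency; real costs vary by chemistry and year), shrinking the pack from roadtrip-sized to commuter-sized is a four-figure saving:

\[
\underbrace{60\,\text{kWh} \times \$130/\text{kWh}}_{\approx\,\$7{,}800\ (\sim 240\,\text{mi})}
\;-\;
\underbrace{25\,\text{kWh} \times \$130/\text{kWh}}_{\approx\,\$3{,}250\ (\sim 100\,\text{mi})}
\;\approx\; \$4{,}550\ \text{saved}
\]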
The reasonable way is to buy a PHEV and use the fuel tank as a range extender for occasional long roadtrips or when chargers are unavailable. However, that may or may not be desirable when it comes to regulatory concerns (taxes, etc.).
But even so, those extra tanks and ICE engines add to the cost and weight of the vehicle, when my objective is to minimize cost given that I don't drive more than 100 mi/wk. I could do with 100 mi of total range or less, without needing an ICE generator.
Save those for the people who want roadtrips and very long range. Sadly, it seems like many of the cars in the article are plopping ICE generators on top of EVs that already have 300 mi of range.
So your answer to the question is "because they can"? I don't think we needed the LLM age to be able to say that. That is to say, I think the article is granting the birds a bit more agency than it grants next-token prediction engines.
No, reinterpret_cast doesn't change the type of the underlying object. The rule in [basic.lval]/11 (which makes accessing an object through a glvalue of an incompatible type undefined) applies to all glvalues, whether they come from pointers or references.
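A minimal illustration of the distinction (my own example, not from the standard text): the cast itself compiles and is fine; it's the *access* through the wrong-typed glvalue that [basic.lval]/11 makes undefined. Copying the object representation (or std::bit_cast in C++20) is the well-defined route.

```cpp
// reinterpret_cast yields a pointer/reference of a new type, but the
// object it designates is still a float. [basic.lval]/11 makes the
// access through an incompatible glvalue undefined, not the cast.
#include <cstdint>
#include <cstring>

int main() {
    float f = 1.0f;

    int* p = reinterpret_cast<int*>(&f);   // the cast alone is OK
    (void)p;                                // silence unused-variable warning
    // int bad = *p;                        // UB: reads a float through an int glvalue
    // int& r = reinterpret_cast<int&>(f);
    // int also_bad = r;                    // same UB, via a reference

    // Well-defined alternative: copy the object representation
    // (or use std::bit_cast<std::int32_t>(f) in C++20).
    std::int32_t bits;
    static_assert(sizeof bits == sizeof f, "size mismatch");
    std::memcpy(&bits, &f, sizeof bits);

    return bits == 0 ? 1 : 0;  // 1.0f has a nonzero bit pattern
}
```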
How does this compare to commercial tools? I read through the original Titan-2 paper, and they were ~5x faster in routing but achieved worse quality of results (QoR) than Quartus - does this close the gap?
I'm also wondering how congested these input designs are - I don't know much about the benchmarks used. Extremely congested designs can take much, much longer to route, so an improvement there, in either runtime or QoR, could be useful.