So is this a minimal upgrade before the M6 MacBook Pros w/ OLED & a redesign later this year?
It doesn't even look like they added cellular as an option via their own C1X chip (which would get around the licensing / cost issues, now that the modem is their own chip).
I use a massive OLED monitor as my workhorse, and I'd say money and expectations are better spent on an established OLED manufacturer and a large screen than on a laptop screen. Given the job roles common among HN users, a large OLED main monitor will probably offer more value than a laptop screen that spends most of its time as a side monitor, or just turned off while connected to large monitors. The HDMI 2.1 and other display-output gains matter more, since they raise the pixel counts and framerates you can drive. Just my two cents.
I'm a bit confused by this branding (I never even noticed that there was a 5.2-Instant). It's not a super fast 1000 tok/s Cerebras-based model like the one they have for codex-spark; it's just 5.2 w/out the router / "non-thinking" mode?
I feel like OpenAI is going to get right back to where they were pre-GPT-5: a ton of different options and no one knowing which model to use for what.
Yeah, for a while ChatGPT Plus has been powered by two series of models under the hood.
One series is the Instant series, which is faster and more tuned to ChatGPT, but less accurate.
The second series is the Thinking series, which is more accurate and more tuned to professional knowledge work, but slower (because it uses more reasoning tokens).
We'd also prefer to have a simple experience with just one option, but picking just one would pull back the Pareto frontier for some group of people/preferences. So for now we continue to serve two models, with manual control for people who want to choose and an imperfect auto switcher for people who don't want to be bothered. Could change down the road - we'll see.
By the way, I imagine you know this, but the product split is not obvious, even to my 20-something kids who are Plus subscribers. I saw one of them chatting with the instant model recently and I was like "No!! Never do that!!", and they did not understand they were getting the (I'm sorry to say) much less capable model.
I think it's confusing enough that it's a brand harm. I offer no solutions, unfortunately. I guess you could do a little post-hoc analysis for Plus subscribers and up to determine whether they'd benefit from defaulting to Thinking mode; that could be done relatively cheaply at low-utilization times. But maybe you need this to keep utilization where it is. Either way, I think it ends up meaning my kids prefer Claude. Which is fine; they wouldn't prefer Haiku if it were the default, but they don't get Haiku, they get Sonnet or Opus.
I agree -- we're on the ChatGPT Enterprise plan at work, and every time someone complains about it screwing up a task, it turns out they were using the instant model. There needs to be a way to disable it, at a bare minimum.
You could perhaps show the "instant" reply right away and provide a button labeled "Think longer and give me a better answer" that starts the thinking model and eventually replaces the answer.
For this to work well, the instant reply must be truly instant, and the button must always be visible at the same position on the screen (i.e. either at the top or the bottom of the answer, scrolled so that it's also at the top or bottom of the screen). Once the thinking answer is displayed, there should be a small icon button to bring back the previous instant answer - roughly as in the sketch below.
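Something like this rough sketch; `fetchInstant` and `fetchThinking` are stand-ins for whatever the backend exposes, not real ChatGPT APIs:

```typescript
// Hypothetical two-phase answer flow: instant reply first, with a
// persistent "Think longer" upgrade path.
type Render = (text: string, source: "instant" | "thinking") => void;

function answerFlow(
  prompt: string,
  render: Render,
  fetchInstant: (p: string) => Promise<string>,
  fetchThinking: (p: string) => Promise<string>,
) {
  let instantText = "";
  // 1. Show the instant reply right away; the button stays visible.
  fetchInstant(prompt).then((text) => {
    instantText = text;
    render(text, "instant");
  });

  return {
    // 2. Wire to the always-visible "Think longer" button. It re-runs
    //    the *original* prompt rather than feeding the instant answer
    //    back in, so a misleading instant reply can't pollute the
    //    thinking context.
    onThinkLonger: async () => render(await fetchThinking(prompt), "thinking"),
    // 3. For the small icon that flips back to the instant answer.
    getInstantAnswer: () => instantText,
  };
}
```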
That's assuming that the instant answer is even directionally correct. A misleading instant answer could pollute the context and lead the thinking model astray.
Can the context of the pre-revision Instant response simply be discarded -- or forked, or branched, or [insert appropriate nomenclature here] -- instead of being included as potential poison?
(It seems absurd to consider that there may be no undo button the machine can push.)
For those who are unaware, this is exactly what Grok does. The default is an auto mode; when you ask a question it starts researching (visibly to the user), and if it's using the Expert mode but you don't really need all that jazz, there's a "Quick Answer" button right above the prompt entry field. If it's using the "Quick Answer" mode, there's an "Expert" button in the same place, and you can toggle between them mid-answer and it will adjust the model (or the model parameters; I'm not sure how it works under the hood).
It's pretty good with the auto chooser, but I appreciate that the manual choice is so in-your-face, and especially that it doesn't restart the query completely but rather converts the output to either Quick or Expert.
This is on the web UI; I can't speak for other harnesses. I do find that it's quite good with citations and has a fairly generous free tier, even in Expert mode. (As for who sits at the top: I am indeed put off by Musk's clear interference in several cases involving Grok, and my personal values don't align with the majority of his, but today's Grok is definitely less MechaHitler and more reliable than it was before.)
Thanks for clarifying! I guess most users are going to stick with the router / auto switcher, which is fine, since most people won't change the default.
Just noting that I'm not against differentiation in products, but it gets very confusing for users when there are too many options (in the case of consumer ChatGPT, at least, this is still more limited than in pre-GPT-5 days). The issue is that there's differentiation at what I pay monthly (free vs. Plus vs. Pro) and also at the model layer, which essentially becomes a matrix of different options / limits per model (and we're not even getting into capabilities).
For someone who uses Codex as well, there are 5 models there when I use /model (on the Plus plan; Spark is only available to Pro plan users), with limits also tied to my same consumer ChatGPT plan.
I imagine the model differentiation is only going to get worse, since with more fine-tuned use cases there will be many different models (i.e. healthcare answers, etc.). Is it really on the user to figure out what to use? The only saving grace is that it's not as bad as Intel or AMD CPU naming schemes / cloud provider instance naming, but that's a very low bar.
Auto will never work, because for the exact same prompt you sometimes want a quick answer (it's not something very important to you), and sometimes you want the answer to be as accurate as possible, even if you have to wait 10 minutes.
In my case it would be more useful to have a slider for how long I'm willing to wait: for example instant, or think up to 1 minute, or think up to 15 minutes.
That's pretty close to what they have. They just named them Instant, Thinking (Standard), and Thinking (Extended), and they're discrete presets instead of a slider.
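On the API side there is already something close to that, exposed as discrete values rather than a slider. A rough sketch with the OpenAI Node SDK, assuming the Responses API's reasoning-effort parameter; the model name and the exact set of accepted effort values here are assumptions:

```typescript
import OpenAI from "openai";

const client = new OpenAI();

// Discrete "how long am I willing to wait" presets, picked per request.
const response = await client.responses.create({
  model: "gpt-5.2",             // assumption: whichever reasoning model you're on
  reasoning: { effort: "low" }, // e.g. "low" | "medium" | "high"
  input: "Summarize this thread in two sentences.",
});

console.log(response.output_text);
```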
Yeah, I use that, but it's not really a solution that allows having only Auto. It doesn't help when it chooses Instant instead of Thinking, and it's also much slower than using Instant outright, because the Skip button doesn't show immediately and it's generally slow to restart.
I've long suspected as much, but I always found the correspondence between API model names, the ChatGPT UI selector, and the actual model used very confusing; I was never sure whether I was actually switching models or just some parameters of the harness/model invocation.
> One series is the Instant series, which is faster and more tuned to ChatGPT, but less accurate.
That's putting it mildly. In my experience, the "instant/chat" model is absolute slop tier, while the "thinking" one is genuinely useful and also has a much more palatable tone (even for things not really requiring a lot of thought).
Fortunately, the former clearly identifies itself with an absurd amount of emoji reminiscent of other early chatbots that shall not be named, so I know how to detect and avoid it.
Before GPT-5 launched, and after sama had said they would unify the ordinary and reasoning models, I think we all expected more than an (auto-)switcher. We expected some small innovation (smaller than the ordinary-to-reasoning one, but still a significant one) that would make both kinds of replies be, in a way, generated by a single model. I don't know exactly how; I expected OpenAI to surprise us with something that would feel obvious in retrospect.
The model doesn't even need to be exposed in the UI. Let the user specify "use model foobar-4" or "use a coding model" or "use a middle-tier attorney model".
Vim does this well: no UI, magic incantations to use features.
Forgive the tangent, but while you're here, can you look into why the Notion connector in chat can't write pages while the MCP (which I use via Codex) can? It looks entirely possible; it's mostly just a missing action in the connector.
It's because people like choice and control, and "5.2" vs "5.2 thinking" is confusing. Making them "5.2 instant" and "5.2 thinking" is less confusing to more people. Their competitors already do this (Gemini 3 Fast & Gemini 3 Thinking).
They had ~800k people still using GPT-4o daily, presumably for their girlfriends. They need to address them somehow. Plus, serving "thinking" models is much more expensive than serving "instant" models. So they want to keep the horny people hornying on their platform, but at a lower cost.
Will need to wait for real benchmarks, but based on OpenAI marketing, Instant is their latency-optimized offering. For a voice interface you don't actually need high tok/s, because speech is slow; time to first token matters much more.
Reminder that OpenAI serves a lot of customers for free; most of the people I know use the free tier. There's a strict limit on thinking queries on the free tier, so a decent non-thinking model is probably positive ROI for them.
This seems like it’s in response to the congressional testimony last week to clarify some things about their remote assistance systems.
It's interesting that they only have 70 people for this. I can understand the outside-the-US ones for nighttime assistance, and they'll need to be able to scale for other countries in the future too.
What I'm still wondering is what's limiting Waymo's scaling - just cars, or also the sensor systems? They've had their new test vehicles in SF for a while, but I think most customers still only get the Jaguars right now (and highway driving is still limited to specific customers in the Bay Area).
> What I’m still wondering is what is limiting the scaling for Waymo
I'm also very curious about this. Probably a mix of many things: training the driver to handle tricky conditions better (e.g. flooded roads), getting more Ohai vehicles imported and configured, working through the backlog of Jaguar I-Pace configuration and trucking them out to new markets, mapping roads and doing non-customer testing in new markets, getting regulatory approval/cooperation in other markets (e.g. DC), finding depot space, hiring maintenance teams, etc.
This GitHub readme was helpful in understanding their motivation, cheers for sharing it.
> Integrating agents into it prevents fragmentation of their service and allows them to keep ownership of their interface, branding and connection with their users
Looking at the contrived examples given, I just don't see how they're achieving this. In fact, it looks like creating MCP-specific tools will achieve exactly the opposite. There will immediately be two ways to accomplish a thing, and this will result in drift over time as developers need to account for two ways of interacting with a component on screen. There should be no difference, but there will be.
Having the LLM interpret and understand a page context would be much more in line with assistive technologies. It would require site owners to provide a more useful interface for people in need of assistance.
> Having the LLM interpret and understand a page context
The problem is fundamentally that it's difficult to create structured data that's easily presentable to both humans and machines. Consider: ARIA doesn't really help LLMs. What you're suggesting is much more in line with microformats and schema.org, both of which were essentially complete failures.
LLMs can already read web pages, just not efficiently. It's not an understanding problem, it's a usability problem. You can give a computer a schema and ask it to make valid API calls and it'll do a pretty decent job. You can't tell a blind person or their screen reader to do that. It's a different problem space entirely.
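To make the "two ways to accomplish a thing" concern upthread concrete, here's roughly what a site-specific tool looks like with the official MCP TypeScript SDK. The tool name, schema, and addToCart helper are made up; the point is that the handler becomes a second code path next to whatever the on-screen button already does:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

// Hypothetical site-internal function, ideally shared with the UI button.
async function addToCart(productId: string, quantity: number): Promise<void> {
  // ...whatever the site does today when the button is clicked
}

const server = new McpServer({ name: "shop", version: "1.0.0" });

// The schema gives the model something it can reliably fill in, unlike
// scraping the page - but it's now a parallel interface that can drift.
server.tool(
  "add-to-cart",
  { productId: z.string(), quantity: z.number().int().min(1) },
  async ({ productId, quantity }) => {
    await addToCart(productId, quantity);
    return { content: [{ type: "text", text: `Added ${quantity} x ${productId}` }] };
  },
);
// (Transport/connection setup omitted; this only registers the tool.)
```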
I'm currently a solo bootstrapped founder and have done short stints in the past: 1 year in 2022, then a year as cofounder of a funded startup. Now I'm doing it again.
The question is how you stay motivated to keep at it. It looks like it took about 4 years before you made something similar to your Google salary; did family pressure or other external pressure ever impact you? Or is it mainly just keeping your eyes on the longer-term goal?
I'm also quite lucky that I was aiming for lean-FIRE before I left Facebook, so I have the luxury of being able to keep at it, but sometimes it is demotivating seeing peers / others.
> The question is how you stay motivated to keep at it. It looks like it took about 4 years before you made something similar to your Google salary; did family pressure or other external pressure ever impact you? Or is it mainly just keeping your eyes on the longer-term goal?
I found it helpful to go in with low expectations.
I was listening to a lot of podcasts about bootstrapping while I was still at Google in 2017-2018, and even the big success stories usually had 5+ years of failing or succeeding only marginally. So, I went in with the expectation that I'd probably fail for the first 5 years, and so there wasn't that feeling of disappointment from not earning much the first few years.
I also had a lot of lucky conditions that made it easy to take the risk at the time, including having no family to support, lots of savings, and low expenses.
> I'm also quite lucky that I was aiming for lean-FIRE before I left Facebook, so I have the luxury of being able to keep at it, but sometimes it is demotivating seeing peers / others.
Yeah, honestly, I do sometimes think, "Wow, if I'd stayed at Google and kept getting that comp (which was about 50% equity, IIRC), that would be a lot of money." But I'm also very pleased with my life now, and I know I wouldn't have enjoyed my job nearly as much for the last 8 years had I stayed an employee. And that's a huge amount of my life to spend not doing what I'd like to do.
Already have my own JS engine and the basics of three.js and pixi.js 8 working, with a roadmap to v1.0.0 posted in GitHub issues. Aiming to show it to folks at GDC in March.
So in theory it should be possible, but it might require customizing the Dawn or wgpu-native builds if they don't support it (MystralNative provides the JS bindings / wrapper around those two implementations of wgpu.h). But I've already added a special C++ method to handle Draco compression natively, so adding some Mystral-native-only methods is not out of the question (however, I would want to ensure that usage of those via JS is always feature-flagged so that it doesn't break when run on the web).
Did you write your WebGPU chessboard using the raw JS APIs? Ideally it should work, but I just fixed up some missing APIs to get Three.js working in v0.1.0, so if there are issues, please open an issue on GitHub - I'll try to get it working so we close any gaps.
Here's a Dawn implementation with support for ray tracing that was implemented a number of years ago but never integrated into browsers. Perhaps it will help?
Yes, chessboard3d.app is written with raw JS APIs and raw WebGPU. It does use the Rapier physics library, which uses WASM - might that be an issue? It implements its own ray tracing, but it would probably run 10x faster with hardware ray tracing support.
I think you'd get a lot of attention if you had hardware ray tracing, since that's currently only available in DirectX 12 and Vulkan, requiring implementation on native desktop platforms. FWIW, if the path looks feasible, I would be interested in contributing.
WASM shouldn't be an issue, since the Draco decoder uses it - but it may only work with V8 (it wouldn't work for QuickJS builds, but the default builds use V8 + Dawn). Obviously, with an alpha runtime there may be bugs.
I think it would be super cool to have some sort of extension before WebGPU (on the web) has it. I was taking a look at the prior example, and it seems like there's a good ongoing discussion about it linked here: https://github.com/gpuweb/gpuweb/issues/535. Also, I believe Metal has hardware ray tracing support now too?
Re: implementation, a few options exist. A separate Dawn fork with RT is one path (though Dawn builds are slow: 1-2 hours on CI). Another approach would be exposing custom native bindings directly from MystralNative alongside the WebGPU APIs; that might make iteration much faster for testing feasibility. The JS API would need to be feature-flagged so the same code gracefully falls back when running on the web (I did this for a native Draco impl too, which avoids having to load WASM: https://mystralengine.github.io/mystralnative/docs/api/nativ...).
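For what it's worth, the graceful fallback could lean on WebGPU's standard feature-detection mechanism, so the same JS runs in browsers untouched. A sketch where "ray-tracing" is a hypothetical feature name (nothing like it is in the WebGPU spec today) and the renderer factories are placeholders:

```typescript
declare function createHardwareRTRenderer(d: GPUDevice): unknown; // native-only path
declare function createSoftwareRTRenderer(d: GPUDevice): unknown; // existing shader path

const adapter = await navigator.gpu.requestAdapter();
if (!adapter) throw new Error("WebGPU not available");

// Only request the extension where the runtime actually exposes it.
const hasRT = adapter.features.has("ray-tracing");
const device = await adapter.requestDevice({
  requiredFeatures: hasRT ? ["ray-tracing" as GPUFeatureName] : [],
});

const renderer = hasRT
  ? createHardwareRTRenderer(device)
  : createSoftwareRTRenderer(device);
```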
Follow-up comment about Apple disallowing JIT: I'll need to confirm whether JSC is allowed to JIT at all, or only inside a webview. I was able to get JSC + wgpu-native rendering in an iOS build, but would need to confirm that it can pass app review.
There are 2 other performance things you can do by controlling the runtime, though. One is adding special perf methods (which I did for Draco decoding; there is currently one non-standard __mystralNativeDecodeDracoAsync API), but the docs clearly lay out that you should feature-gate it if you're going to use it so you don't break web builds: https://mystralengine.github.io/mystralnative/docs/api/nativ...
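The gate itself is just runtime detection of the global before calling it. A sketch; only the name comes from the docs, while the signature and mesh shape here are assumptions:

```typescript
interface DecodedMesh { positions: Float32Array; indices: Uint32Array } // assumed shape

type NativeDracoDecode = (buf: ArrayBuffer) => Promise<DecodedMesh>;

async function decodeDraco(buf: ArrayBuffer): Promise<DecodedMesh> {
  // Present only in MystralNative builds; undefined on the web.
  const native = (globalThis as Record<string, unknown>)
    .__mystralNativeDecodeDracoAsync as NativeDracoDecode | undefined;
  if (typeof native === "function") {
    return native(buf); // native fast path, no WASM load
  }
  return decodeDracoWithWasm(buf); // the decoder web builds already use
}

// Placeholder for the existing WASM-based path (e.g. three.js's DRACOLoader).
declare function decodeDracoWithWasm(buf: ArrayBuffer): Promise<DecodedMesh>;
```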
The other thing is more experimental: writing an AOT compiler for a subset of TypeScript that converts it into C++ and then just compiles your code ("MystralScript"). This would be similar to Unity's C# AOT compiler and would kind of be its own separate project, but there is some prior work with Porffor, AssemblyScript, and Static Hermes here, so it's not completely just a research project.
Is AssemblyScript good for games, though? Last I checked it lacked too many features for game code coming directly from TS, but it might be better now? No idea how well Static Hermes behaves today (probably far better, given the RN heritage).
I've been down the TS->C++ road a few times myself, and the big issue often comes up with how "strict" you can keep your TS code for real-life games, as well as how slow/messy the official TS compiler has been (and real life taking time away from the effort).
It's better now, but I think one should probably directly target the Go port of the TS compiler (both for performance and because Go is a slightly stricter language, probably better suited for compilers).
I guess the point is that the TS->C++ compilation thing is potentially a rabbit hole. It's theoretically not too bad, but TS has moved quickly and been hard to keep up with without using the official compiler. And even then, a "game-oriented" TypeScript mode wants a slightly different semantic model from the official one, so you need either a mapping over the regular type-inference engine, a separate one, or a parallel one.
Mapping regular TS to "game variants", the biggest issue is how to handle numbers efficiently. Even if you go full-double, you need conversion-point checks everywhere doubles go into unions with any other type (meaning you need boxing or a "fatter" union struct). And that's not even accounting for any vector-type accelerations.
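A small example of where that bites. The TS is ordinary game code; the comments describe what a TS->C++ compiler would have to emit (the C++ representation details are illustrative):

```typescript
// Pure doubles: can compile straight to raw f64 arithmetic in C++.
function damage(base: number, crit: boolean): number {
  return crit ? base * 2 : base;
}

// The moment a number flows into a union, raw f64 no longer works:
type Loot = number | { name: string };

function describe(loot: Loot): string {
  // The compiler needs a conversion point at this boundary: either box
  // the number on the heap, or carry a "fat" tagged value (roughly
  // { tag, f64, ptr } in the emitted C++) everywhere a Loot travels.
  return typeof loot === "number" ? `${loot} gold` : loot.name;
}

describe(damage(10, true)); // number -> Loot: boxing/tagging happens here
```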
AssemblyScript was just mentioned as some prior work, I don't think that AssemblyScript would work as is for games.
I realize the major issues with TS->C++, though (or any language to C++; Facebook has prior work converting PHP to C++, https://en.wikipedia.org/wiki/HipHop_for_PHP, which was eventually deprecated in favor of HHVM). I think iteratively improving the JS engine (Mystral.js, the one that is not open source yet but is why MystralNative exists) to work with the compiler would be the first step, and ensuring that games and examples built on top use a subset of TS is the starting point here. I don't think the goal for MystralScript should be to support Three.js or any other engine to begin with, as that would end up going down the same compatibility pits that HipHop did.
Being able to update the entire stack here is actually very useful. In theory, parts of Mystral.js could just be embedded into MystralNative (behind separate build flags, probably not a standard build), avoiding any TS->C++ compilation for core engine work, and then games built on top would use the strict subset of TS that works well with the AOT compilation system. One option for numbers is actually using comment annotations (similar to how JSDoc types work for the TypeScript compiler), specifically annotations in comments so that web builds don't change - sketched below.
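The annotations might look roughly like this; the @mystral tag names are made up, and the point is only that plain comments are invisible to tsc and web builds while the AOT compiler can use them to narrow the C++ representation:

```typescript
/** @mystral.int32 */
let frameCount = 0;

/** @mystral.float32 */
let elapsed = 0;

function tick(/** @mystral.float64 */ dt: number): void {
  elapsed += dt;
  frameCount = (frameCount + 1) | 0; // |0 doubles as an int32 hint, asm.js-style
}
```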
Re: the TS compiler, I do have some basics started here, and I'm already seeing that tests are pretty slow. I don't think the tsgo compiler has a similar API for parsing & emitters right now, so as much as I would like to switch to it (I have for my web projects, and the speed is awesome), I don't think I can yet until the API work is clarified: https://github.com/microsoft/typescript-go/discussions/455
I remember reading about Ejecta a long time ago! I had completely forgotten about it, but it is similar! The funny thing is that to support UI elements, I had to also support Canvas2D through Skia (although not 100% yet), so maybe Impact could even work at some point (it would require extensive testing, obviously).
AWS has information about their UAE data centers, but I haven't seen any confirmation from Amazon itself that amazon.com is having issues.