Credit where it’s due: doing live demos is hard. Yesterday didn’t feel staged—it looked like the classic “last-minute tweak, unexpected break.” Most builders have been there. I certainly have (I once spent 6 hours at a hackathon and broke the Flask server keying in a last minute change on the steps of the stage before going on).
One of the demos was printing a thing out, but the processor was hopelessly too slow to perform the actual print job. So they hand unrolled all the code to get it down from something like a 30 minute print job to a 30 second print job.
I think at this point it should be expected that every publicly facing demo (and most internal ones) are staged.
The CEO of Nokia had to demo their latest handset one time on stage at whatever that big world cellphone expo is each year.
My biz partner and I wrote the demo that ran live on the handset (mostly a wrapper around a webview), but ran into issues getting it onto the servers for the final demo, so the whole thing was running off a janky old PC stuffed in a closet in my buddy's home office on his 2Mbit connection. With us sweating like pigs as we watched.
As much as I hate Meta, I have to admit that live demos are hard, and if they go wrong we should have a little more grace towards the folks that do them.
I would not want to live in a world where everything is pre-recorded/digitally altered.
The difference between this demo and the legendary demos of the past is that this time we are already being told AI is revolutionary tech. And THEN the demo fails.
It used to be the demo was the reveal of the revolutionary tech. Failure was forgivable. Meta's failure is just sad and kind of funny.
It's less about the failure, and more about the person selling the product, we don't like him, or his company, and that's why there is no sympathy for him and he knows that.
When it went bad he could instantly smell blood in the water, his inner voice said, "they know I'm a fraud, they're going to love this, and I'm fucked". That is why it went the way it did.
If it was a more humble, honest, generous person, maybe Woz, we know he would handle it with a lot more grace, we know he is the kind of person who would be 100x less likely to be in this situation (because he understands tech) and we'd be much more forgiving.
Despite the Reddit post's title, I don't think there's any reason to believe the AI was a recording or otherwise cheated. (Why would they record two slightly different voice lines for adding the pear?) It just really thought he'd combined the base ingredients.
That's even worse because it would mean that it wasn't the scripted recording that failed, it means the AI itself sucks and can't tell that the bowl is empty and nothing was combined. Either this was the failure of a recorded demo that was faked to hide how bad the AI is, or it accurately demonstrated that the AI itself is a failure. Either way it's not a good look.
My layperson interpretation of this particular error was that the AI model probably came up with the initial recipe response in full, but when the audio of that response was cut off because the user interrupted it, the model wasn't given any context of where it was interrupted so it didn't understand that the user hadn't heard the first part of the recipe.
I assume the responses from that point onwards didn't take the video input into account, and the model just assumes the user has completed the first step based on the conversation history. I don't know how these 'live' ai sessions things work but based on the existing openai/gemini live ai chat products it seems to me most of the time the model will immediately comment on the video when the 'live' chat starts but for the rest of the conversation it works using TTS+STT unless the user asks the AI to consider the visual input.
I guess if you have enough experience with these live AI sessions you can probably see why it's going wrong and steer it back in the right direction with more explicit instructions but that wouldn't look very slick in a developer keynote. I think in reality this feature could still be pretty useful as long as you aren't expecting it to be as smooth as talking to a real person
It seems extremely likely that they took the context awareness out of the actual demo and had the AI respond to pre defined states and then even that failed.
The AI analyzing the situation is wayyy out of scope here
"unpredictable" and "doesn't work" are different things. As a user, I know it's not deterministic and I can live with "unpredictable" results as long as it still makes sense, but I won't buy something that works 50% of the time.
It was reading step 2 and he was trying to get it to do step 1.
He had not yet combined the ingredients. The way he kept repeating his phrasing it seems likely that “what do we do first” was a hardcoded cheat phrase to get it to say a specific line. Which it got wrong.