Hacker News | maebert's comments

Tangentially related: one of my all-time favourite neuroscience papers by Iriki et al. [1] showed that the mouse pointer becomes part of the body schema in a real, measurable way.

Basically there are neurons whose receptive field (i.e. the subset of the outside world that causes the neuron to fire) is "everything a monkey can reach". Flash a light in that area and the neuron fires; flash it beyond that area and the neuron stays silent.

Now if you give the monkey a rake, the neuron's receptive field immediately grows to encompass the space it can now reach with the rake too: the rake becomes part of the body schema, not part of the outside world [2].

But if, instead of a rake, the monkey gets a stick that controls a mouse cursor on a screen, the _area of the screen_ the monkey can interact with using the cursor becomes part of the receptive field of that neuron too. That suggests that the cursor itself becomes part of the body schema.

TL;DR don't mess with people's mouse cursors, it's like cutting their limbs off.

[1]: https://pubmed.ncbi.nlm.nih.gov/15588812/

[2]: A wild Heidegger appears and talks about Vorhandensein and Zuhandensein.


> TL;DR don't mess with people's mouse cursors, it's like cutting their limbs off.

“Don’t mess” is a very broad range that includes things like removing the mouse cursor at nine PM. Of course, no one should do that.

But a narrower conclusion, “let the mouse warp in certain predictable cases”, doesn’t contradict your thesis.


Hyperspell | YC F25 | San Francisco, ONSITE | $150k-220k + generous equity and benefits | hyperspell.com

I'm Manu, founder of Hyperspell. 14 years ago I found my first job in tech through Who's Hiring. I take this job board seriously :)

Hyperspell is building Context & Memory for AI agents. We're at the intersection of "future of AI and agentic computing" and "this actually works". We're not just changing how our customers use AI in their products, but are constantly iterating on how we use AI to build Hyperspell, and what it means to be a software company in the decades to come.

We're hiring product, AI, and Platform Engineers in SF: https://jobs.ashbyhq.com/hyperspell

Comment or reach out to founders@hyperspell.com with any questions!


> We're not just changing how our customers use AI in their products, but are constantly iterating on how we use AI to build Hyperspell, and what it means to be a software company in the decades to come.

i feel like this needs a joking nod to the complaints of llms constantly responding with not just x, but y


The whole artificial scarcity Anthropic created around Mythos / Glasswing is quite brilliant to be honest (I’m not saying ethical, just brilliant). The commercial gains are one side of course. But consider this:

Gets labelled a supply chain risk by the Pentagon. Hypes up what they claim to be the most advanced hacking tool on the planet. This puts the US government into a loose / loose position. Either deny the NSA access to it, or be called out on their bluff.


> The whole artificial scarcity Anthropic created around Mythos / Glasswing is quite brilliant to be honest

Isn’t that just the same strategy OpenAI has used over and over? Sam Altman is always “OMG, the new version of ChatGPT is so scary and dangerous”, but then releases it anyway (tells you a lot about his values—or lack thereof) and it’s more of the same. Pretty sure Aesop had a fable about that. “The CEO who cried ‘what we’ve made is too dangerous’”, or something.

https://en.wikipedia.org/wiki/The_Boy_Who_Cried_Wolf


Right, but in Aesop’s fable, the wolf did eventually come. It’s asymmetric, because in this case the wolf is not coming for the boy, it’s coming for everybody else


The boy isn't crying wolf strictly to save himself. He does it to get the attention of the town, knowing they'll come to the aid of the livestock he's been tasked with watching. Yes, their aid is primarily to save the boy, but the danger is still to the larger community rather than isolated to the lookout.


Or, the wolf is just a squeaky toy.


The way they've published hashes of the bugs it has found so that once those bugs are fixed they can responsibly disclose them while also proving that they weren't lying... that displays a willingness to dabble in evidence which is far beyond anything OpenAI has done to support their claims.
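Publishing a hash now and the report later is a commitment scheme. Here is a minimal Python sketch of the idea, assuming SHA-256 and a random nonce; the thread does not say which hash function or salting was actually used, and `commit` and `verify` are hypothetical names chosen for illustration.

```python
import hashlib
import secrets


def commit(report: str) -> tuple[str, bytes]:
    """Return (public digest, secret nonce) for a vulnerability report.

    The digest can be published immediately; the nonce and the report
    stay private until the bug is fixed.
    """
    # Random nonce so that short or guessable reports cannot be brute-forced
    # from the published digest (an assumption, not a known detail).
    nonce = secrets.token_bytes(32)
    digest = hashlib.sha256(nonce + report.encode("utf-8")).hexdigest()
    return digest, nonce


def verify(digest: str, nonce: bytes, report: str) -> bool:
    """Check that a revealed (nonce, report) pair matches the published digest."""
    expected = hashlib.sha256(nonce + report.encode("utf-8")).hexdigest()
    return secrets.compare_digest(expected, digest)
```

Anyone holding the earlier digest can run `verify` once the report is disclosed. As a reply further down points out, this shows only that the report existed when the digest was published, not which tool found the bug.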


This. I see much cheap naysaying without reference to the vuln hashes. If it is smoke and mirrors, then the naysayers should loudly shout down the specific hashes, and when they get revealed, or don't, they will have done a great service in dissuading fake claims of world-changing tech.


It proves that at least some of the bugs exist, not that you need “new thing” to find them or that they even used “new thing” to find them.

There was a story the other day about others finding the same bugs with qwen.


Certainly. As evidence goes it's a tremendously limited strategy. But the bar for such things is pretty low right now, so it doesn't take much to outdo the others by quite a lot.


>Sam Altman is always “OMG, the new version of ChatGPT is so scary and dangerous”, but then releases it anyway

One of the many reasons nobody should give Scam Altman their money. It's continually infuriating that this serial grifter is in such a position of power.


It’s absolutely a move from Sam Altman’s playbook, but it got traction because it’s not coming from Sam Altman, and the US rejection helped lend credibility to the fear of Mythos’ findings.


It was from GPT-2, and Dario, not Sam Altman, was part of the developers of that model while he was working at OpenAI. It's his playbook.


> It was from GPT-2

Prior to the release of GPT-5, Sam said he was scared of it and compared it to the Manhattan Project.


Not just Altman. Buffett said it also, more generally.

https://youtu.be/vZlMWF6iFZg


Not just part of the developers, but rather "led the development of large language models like GPT-2 and GPT-3" as per his website.

https://darioamodei.com/


This is pretty much correct, but Mustafa Suleyman has probably been doing it longer.


Anthropic has not in fact released it, and it does in fact appear to be that dangerous, judging by the flood of vulnerability reports seen by e.g. Daniel Stenberg.

Certainly it’s a strategy OpenAI has used before, and when they did so it was a lie. Altman’s dishonesty does not mean it can never be true, however.


The flood of reports that open source projects like curl, Linux and Chromium are getting is presumably due to public models like Opus 4.6 that were released earlier this year, and not models with limited availability.


How many months till they release a better model than Mythos to a general audience?

GPT-2 wasn't released fully because OpenAI deemed it too dangerous, rings a bell? https://openai.com/index/better-language-models/#sample1


A few months of restricting access to people they think will actually fix problems is a big deal. Obviously only an idiot would think it could or should be kept under wraps forever.


> judging by the flood of vulnerability reports seen by e.g. Daniel Stenberg

Maybe I've missed something, but what Stenberg has been complaining about so far has been the wave of sloppy reports, seemingly written mainly by AIs. Has that ratio somehow changed recently to mainly be good reports with real vulnerabilities?


Some relevant links:

[1] https://www.npr.org/2026/04/11/nx-s1-5778508/anthropic-proje...

> Improvement in AI models' capabilities became noticeable early 2026, said Daniel Stenberg.

> He estimates that about 1 in 10 of the reports are security vulnerabilities, the rest are mostly real bugs. Just three months into 2026, the cURL team Stenberg leads has found and fixed more vulnerabilities than each of the previous two years.

[2] https://www.linkedin.com/posts/danielstenberg_curl-activity-...

> The new #curl, AI, security reality shown with some graphs. Part of my work-in-progress presentation at foss-north on April 28.


He has changed his opinion completely. Yes, the ratio has turned.


Yes:

> The challenge with AI in open source security has transitioned from an AI slop tsunami into more of a ... plain security report tsunami. Less slop but lots of reports. Many of them really good.

> I'm spending hours per day on this now. It's intense.

https://mastodon.social/@bagder/116336957584445742


Those vulnerabilities were found by open models as well.


Partly true. I think the consensus was it wasn't comparable because Mythos swept the entire codebase and found the vulnerabilities, whereas the open models were told where to look for said vulnerabilities.

https://news.ycombinator.com/item?id=47732337


Not really. The models were pointed specifically at the location of the vulnerability and given some extra guidance. That's an easier problem than simply being pointed at the entire code base.


Surely the Anthropic model also only looked at one chunk of code at a time. Cannot fit the entire code base into context. So supplying an identical chunk size (per file, function, whatever) and seeing if the open source model can find anything seems fair. Deliberately prompting with the problem is not.


> This puts the US government into a loose / loose position.

You might even call it... a tight spot


Side note, how did the word "lose" become "loose"? I've seen this so many times on HN.


It didn't, but the advent of spellcheck and autocorrect has made everyone completely give up on proper grammar or word selection as long as no squiggly line appears.


Maybe that’s part of it, but I’ve also noticed autocorrect on my devices often correcting incorrectly. As in, I type the word correctly and it decides “oh, surely you meant this other similarly spelled word” and changes it. Sometimes I don’t notice until after sending the message.


I use MS SwiftKey on my android phone and it will often autocorrect my correctly spelled, correctly used, words, to words that probably don't exist in any language (recently it corrected "blow" to "blpw").

I have French installed on my keyboard as well so sometimes it will randomly correct English words to French words (inconsistently, but at least they're words), but blpw is not a word in either of those languages.

Unfortunately, I think me typing blpw three times has officially added it to my dictionary :)


Don't worry, it's no better on iOS, where I too have an English+French QWERTY setup, and where it too frequently decides to "helpfully" correct using an English dictionary several words into an unambiguously French sentence; or the other way around depending on wind direction and age of the captain.

Even more damning is that there seem to be three independent layers to the feature ("three suggestions" area above keyboard, autocorrect-as-you-type, correction popup as you touch a word) and none of them agree with each other about which language it should be using.


Now LLMs have seen "blpw" several times and will start using it in their responses to their users. Next: Oxford dictionary word of the year 2026: "blpw".


That defiantly has something to do with it


You need to spend more time in the libarry.



Could also be non-native speakers .. Even as a former grammar nazi, now that English isn't my daily driver language I find myself making basic mistakes .. (two, too, to / its, it's / etc.)


Having grown up around immigrants and other folks who learned English as a second language, I always took "loose" as a signal that perhaps English isn't the writer's first language.

I think what you say is partly true too, but it's not a new phenomenon. Some examples

- awful used to mean "awe-inspiring" https://en.wiktionary.org/wiki/awful

- you used to be the plural/formal second person pronoun with thou being the informal form https://en.wikipedia.org/wiki/You

- prior to the printing press English didn't have any standardized spelling at all https://www.dictionary.com/articles/printing-press-frozen-sp...

Language evolves. The English we learned in grammar school is likely not going to be the same English our kids or grandkids learn. At the end of the day, written communication has a single purpose — to communicate. If I can understand what the author is trying to say, then the author achieved their goal. That being said, I wish my mom did use spell check or autocorrect because her messages often require a degree in linguistics to decipher, but because of typos, not spelling. Maybe she'll influence the next evolution in typed communication :)

Edit - formatting


Because you pronounce them backwards.

"Loose" is a short word that ends sharply, but "lose" is a long word that slowly peters out.

They should be the other way around imo.


If we're allowed to make modifications here then it should really be lose => looze and loose => luce


I think that would make "loosely" not work out. Lucely/lucly catch the hard C there. I'm good with loozing/loozer, looks kind of funny though.


I would not pronounce lucely with a hard C


Lucely absolutely does not catch the hard c. Surely there is no word in the English language where "ce" has a hard c, only loanwords like celt.


Lucezly


One more step, and we're in Poland.


Fun fact — English did not have formalized spelling prior to the printing press

https://www.dictionary.com/articles/printing-press-frozen-sp...

So, technically we are allowed to make modifications! We just can't expect others to adhere to our modifications :)


Luce is already a word in English, if a little obscure


This was also the way I felt before I was introduced to "the magic e" (spoiler: it still doesn't make any sense)

https://www.academysimple.com/magic-e-words/


Wow, "magic e" just transported me back to primary school. And I had a little heart flutter fearing that I wouldn't be able to remember/explain it today.


Now that you frame it that way, I'm surprised "lose" didn't evolve to be pronounced like "Lowe's"


I hate discussions like these because then I start reading words in weird ways and then I look at words as a random jumble of letters that don't even seem like words anymore. Is that just me? :)


Some people pay a lot of money to achieve that state of mind.


Loose rhymes with moose, noose, caboose...


Exactly, and we all know those are pronounced mooze, nooze, and cabooze.


I think you mean mose, nose, cabose.


Since English has a glut of loaner words, I'd assume the two words just originate from different languages.


I’m guessing most cases of loose/lose switch happen when English isn’t someone’s first language.


In my experience, this mistake happens all the time for native English speakers born in the US.


Indeed, but other languages have been around forever whereas I've seen this particular misspelling a ton in the last year and rarely before that.


I've noticed it for much longer than a year ago, it's been a thing for awhile now. Especially online, which may lend credence to the idea of it occurring most with those who didn't grow up writing English, but even with native writers it seems to be occurring more and more.


I haven’t noticed the same trend.


Search the word "loose" in recent HN comments, it's become quite common.

> all he'll breaks loose (a doubly amusing one): https://news.ycombinator.com/item?id=47835177

> So Ukraine should not necessary win, it should mainly bleed Russia and not loose. https://news.ycombinator.com/item?id=47827489

> They are de-risking by spending more, which is a loose-loose for the customers. https://news.ycombinator.com/item?id=47826823

Plus this thread, and that's just in the last 24 hours!


You may be completely right, but these examples are pretty meaningless without context, like what is the rate of lose/loose confusion per x words over time.
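One rough way to put a number on that is misuses per 1,000 words, computed separately for each time window. The Python sketch below is illustrative only: the `SUSPECT` pattern is a hypothetical heuristic that flags "loose" after a modal, "to", or "not", plus "loose-loose" and "loose / loose", and it will miss some misuses and may flag a few legitimate ones.

```python
import re

# Hypothetical heuristic: contexts where "loose" almost certainly means "lose".
SUSPECT = re.compile(
    r"\b(?:to|not|will|would|may|might|can|could)\s+loose\b"
    r"|\bloose\s*[-/]\s*loose\b",
    re.IGNORECASE,
)


def misuse_rate(comments: list[str], per: int = 1000) -> float:
    """Return suspected lose/loose mix-ups per `per` words of comment text."""
    words = sum(len(c.split()) for c in comments)
    if words == 0:
        return 0.0
    hits = sum(len(SUSPECT.findall(c)) for c in comments)
    return hits * per / words
```

Calling `misuse_rate` on the comments from each month or year would give the trend line; correct uses such as "breaks loose" are not counted by this pattern.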


Exactly the same way that the `cancelled` of my youth became `canceled`. By being misspelled so often that the misspelling won.

In this case, it's not clear who wins yet — "lose" may loose, or mount a comeback, resulting in "loose" being the one to lose.


I've said it a couple times in the past: That's so cringe!


It doesn't make sense to have "lose" pronounced as it is. We have rose, pose, dose, nose all pronounced with ō. And then you have lose pronounced as loo͞z. It feels natural to put two O's in there when you write it.


English is not a rules-based language, esp wrt pronunciation. Words can be pronounced as anything.


When I discovered the pronunciation of Houston, TX and Houston, NY... my mind was blown


This is true, but if the goal is to be understood, it's in the speaker's best interest to pronounce words in a way they'll best be understood. So I think even if the language itself lacks formal rules, we as a society of communicators should align on some loose set of rules.


I am at a loss; should we change the way “lose” is pronounced or the way it is written? I feel like if we just add an “o”, connections with other derivative words may be lost or those need to change too.

Also, the “s” in “loose” (the actual word) should be pronounced as “z” sound, as it lies between 2 vowels. Should we also change that? Should we change the way it is pronounced or the way it is written? Maybe if we change this to “loosse” we can free space for “lose” to add an “o”?


As much as I'd like it to be the case, there's too much to unlearn, and I think this would be Pandora's box. Too many weird words and spellings to change.

This language comedian does a bunch of humorous sketches about how many languages make no sense! But in particular, this video tackles false "rhymes" like allow and shallow.

https://youtube.com/shorts/6ZE5zMnBwVc?si=gBiwe9QjT-Co-MVu


... at the mental cost of the reader.

Do you not want people to read what you write?


How does writing “lose” as “loose” help? For one, not all learn English spoken-first.


I always assume not everyone is an English speaker and let it go.


Ha. Non-native speaker here, although you wouldn’t be able to tell when talking to me, until you hear me confuse when to use this vs that, and lose vs loose. Some things my brain just refuses to remember.


Native English speaker here and my linguist wife constantly has to remind me that I use many propositions incorrectly, because my parents were non-native speakers and in their native language (Bahasa Melayu), those propositions were the same words.

For some reason I can't think of those propositions at the moment, but it's definitely prevalent when I'm speaking French and use the wrong proposition, only because I'd have used the wrong proposition in English.


Understandable. Most wives don't like it when their husbands proposition others.


I try to let it go, but this is my pet peeve.


It's fine, nothing to see. Just focus on the intended meaning not the underlying delivery. Mere words don't really impact communication. Right?


u r crct


And let’s not get started on it’s vs. its: a distinction that now seems irretrievably nerfed


people are from many places


In all of those places loose means something that isn't tight and lose something that you've displaced.

I think it would be correct to say people display varying command of the English language, which to me has never been a problem - as long as I can understand what you mean, it's all fine.


Ok. This was either brilliant or I have not woken up yet.


This is not the first time Pete Hegseth charged into a bar, started swinging his fists and screaming "don't you know who my father is", only to find his junk in a vise with no graceful way to get it out.


For some reason I thought you were doing a setup for a joke...

"The President of the US, the Secretary of Defense, Iranian Prime Minister walk into a bar..."


Hegseth gets drunk, Mojtaba preaches the benefits of abstaining from alcohol, and Trump trips because he didn't see the bar


Mythos is most certainly not hype. I think it might be the agent with the most agency as of today (the ability to get really difficult shit done on its own). A realization just struck me: guarding the model weights should be of utmost importance.

Barring any limitations of my understanding, the Mythos model weights are probably in the realm of a few TB. Access to the weights, a single beefy NVIDIA cluster, and a few intelligent folks is all it takes for any actor to start using Mythos for themselves.

Cost of infra < $5 million (guesstimate). Imagine someone pulling that off by gaining access to the weights - which would be a monumental challenge, but likely less complicated than re-acquiring enriched substances from the gulf nation under attack right now. It would be the heist of the century.


> not hype

Proceeds to write the hypiest comment possible. No substantial claims of why the model is not hype, just how dangerous it would be if the weights leaked and how cheap it would be for anyone to just start using it for EVIL if it ever did.


>pulling that off by gaining access to the weights

This was a point in the AI 2027 videos you see on YouTube: that model weights would be a subject of active attack by nation states, and that governments would start requiring AI companies to treat them like munitions when securing them.


I'm a crypto wars veteran, discovering the internet with the nerfed 40-bit version of Netscape


It is pretty obvious from the token speed that Opus is now the size that Sonnet or Haiku was a few versions ago. So Mythos is likely what was called Opus. They don't tell us the size, but they did confirm that the training run for Mythos was under the 10^26 FLOPs reporting requirement.

In an alternate universe, Opus 4.7 is Sonnet 5, and Mythos is released as Opus. Can you imagine how much praise would be heaped on Anthropic if Opus 4.7 was < half the price it is now?


> Glasswing

Fun fact: the model isn't quite the important part of Glasswing. Someone took the ideas and made their own open alternative, clearwing, where you can swap out models and find issues in code. I haven't had a chance to personally test it, but it makes a lot of sense to me.

https://github.com/Lazarus-AI/clearwing


They created the model specifically to play this game.


“Show me the incentives and I will show you the outcomes.” Charlie Munger


They said they designed it to be a better coding model. Something that has long been true: better software engineers are better vulnerability hunters as well. I think we are seeing that play out with Mythos.


Plot twist it gets acquired by the US govt.


If this happens it's not going to take the form of them getting "acquired", they're going to end up forced to become a defense contractor like Lockheed Martin or Raytheon where their primary customer is the USG and all of their sales require governmental approval.


And the absolute last group the government would ever approve access to would be "We the People".

I know it's not realistic at this point, but I really hope the Chinese labs will release models that run locally and are on par with the abilities of frontier models. That is, I hope the idea of frontier models goes away. Because if not, what we're looking at is a seriously bleak outlook with respect to economic freedom for anyone outside the 0.1%. We may even be looking at an out-and-out lack of economic viability for vast segments of the population.


Our best bet is competition stays healthy, and the model providers keep "releasing" their best to stay ahead. Even then, or even if every human was given equal tokens and access, we'll see crazy inequalities just because of how effective the tool is. The smart get smarter and more effective while the dumb will be swallowing down infinite memes and letting the LLMs do all their thinking.


I don't think anyone forced Lockheed Martin or Raytheon to become defense contractors.


it's a taste of what's to come, the anointed class with access to the latest and greatest model in exchange for favours or $$$$, and an underclass making do with the hobbled toy models.


It's like opening up an exclusive night club. Everyone is talking about it and wants in, even though most know nothing about what's actually inside.


Worth noting that Trump was the one who labeled them a supply chain risk for the horrible crime of setting really basic guardrails around usage. (And it's "lose" btw)


"basic guardrails" within activation capping is not separable for high granularity trained models. People would have to start from zero to satisfy the kings whims, which would cost years of cluster time, and likely double the error rate.

Governments are difficult customers for software firms, as most military folks get an obscure exemption from copyright law at work. Anthropic finding other revenue sources is a good choice, if and only if the product has actual utility (search is an area LLM are good at.) =3


turns out it was spelled "lusage" the whole time


Governments are sovereign: they tell people what to do (by making laws, by exercising a monopoly of violence, etc), and nobody tells them what to do. Governments also fight wars, which means lives depend on the government's ability to command.

Private companies make products. When those products were plowshares or swords or missiles, the company didn't really have a say over how they were used, and could be compelled by the government to supply them. Now that new cloud and AI products that increase government command abilities live on servers controlled by private companies, private companies think they can tell government what to do and not do. No government will accept that, because the essence of government is autocratic sovereignty: the sovereign commands and is not commanded.


In American law, companies have the choice of whether or not to do business with the government, outside of a few corner cases. There’s a process for forcing them, but it can’t just be because the leader says so.

In this particular case Anthropic had a contract stating what the military could and could not use their models for. The military broke that contract. Anthropic declined to sign a revised one.

This is within their rights, and more to the point, the government should absolutely not be allowed to unilaterally alter contracts they’ve already signed!

Predictability is the whole point. Undermining it is how you destroy your own economy.


That is allegedly not what happened. Anthropic’s CEO was happy to grant waivers on a case by case basis.

The problem is the branches of the government that Anthropic was doing business with found it infeasible to do this.

They had another problem. If one of their contractors used Claude to engineer solutions contrary to Anthropic’s “manifesto” would Claude poison pill the code?

Basically Anthropic wanted the angel's halo and the devil's horns, and the govt said pick one.


> That is allegedly not what happened. Anthropic’s CEO was happy to grant waivers on a case by case basis. The problem is the branches of the government that Anthropic was doing business with found it infeasible to do this.

That's not what the presidential announcement blacklisting Anthropic said. It said they're being punished for trying to require that the military follow their terms of service.


That’s the other pov (from the govt angle) - https://www.businessinsider.com/pentagon-official-details-ho...

The media is usually flush with defending Anthropic. And yes, the supply chain risk label is too broad. But there is another side to the story, and Anthropic isn’t as “innocent” as it is made out to be.


I've heard this POV before, I just re-read it again, and I genuinely do not understand which part of it you think shows Anthropic is anything but innocent. To me it seems pretty clear: Emil Michael heard that Anthropic was asking questions about how their system was used, and he thinks that attitude is an unacceptable security risk. He won't accept the use of systems that were developed based on "their constitution, their culture, their people" or "their own policy preferences". Anyone who would ask such questions might sabotage military operations if they don't like the answers, he argues, and I believe that he genuinely believes this.

So he'll only accept systems developed by people who understand, as Sam Altman promised to, that the US military is not to be questioned.


My impression was that Dario was happy to grant case by case exceptions. But Emil did not want that. I mean, why set up Claude at DoW where the goal is surveillance and targeting (possibly autonomous)?


>happy to grant case by case exceptions

Which makes more sense, the world isn't a black and white place with clear abstractions.


Sure, they have a "choice", except that no one turns down the kind of money the government has to offer, and if the company is public they are legally obligated to increase shareholder value.


> the essence of government is autocratic sovereignty

*was

Democracy was and is radical for putting the common people in charge of the government. The right to petition for redress of grievances is literally in the first amendment. Government is a social contract, enforced with state violence on one end and mob violence on the other.

If you want to return to autocratic rule, I hear North Korea is lovely this time of year.


More importantly, in the United States we have certain rights which cannot be abridged, even by a majority of the electorate through the government.


Except the politicians just ask their rich friends to do the things they aren't allowed to do and then act like there's nothing they can do.


And that makes autocracy better somehow? Democracies are designed to evolve. If government corruption is a problem, we as citizens have the power to change that. Laws can be passed to add controls, fund enforcement, require transparency.

Write to your reps and demand it. Call their offices and rattle their gates. If they don’t make it happen, vote in someone who will.


I never said autocracy is better. I hope you are right. I do vote for people who at least appear to lean towards my ideal. The problem is being surrounded by people who are indoctrinated to vote the opposite, and being let down by the few who do win. There do appear to be pockets where good things are happening on the local level, but at the national level it's a shit show.

We already have laws against a great number of things that are currently happening, but they are either enforced selectively or not at all. For many of what I would consider the most egregious violations, even when punishments are handed down they are so weak that they do nothing to deter the crime. Companies literally figure the fines for laws they know they are breaking into their budget and people keep pretending like the system works.


Not only that, but I feel there's a lot of validity to this meme from reddit: https://i.redd.it/jxfayl16q5wg1.jpeg .

Maybe not "completely out", but at least not having enough available capacity to release a model way bigger than Opus publicly.


'Anthropic is / isn't lying about Mythos's capabilities' is the less interesting conversation.

The more interesting one is:

   1. Assuming even incremental AI coding intelligence improvements
   2. Assuming increased AI coding intelligence enables it to uncover new zero day bugs in existing software
   3. Then open source vs closed source and security/patch timelines will all need to fundamentally change

Whether or not Mythos qualifies as (1), as long as (2) is true then it seems there will eventually be a model with improvements, which leads to (3) anyway.

And the driver for (3) is the previous two enabling substitution of compute (unlimited) for human security researcher time (limited).

Which begs questions about whether closed source will provide any protection (it doesn't appear so, given how able AI tools already are at disassembly?), whether model rollouts now need to have a responsible disclosure time built in before public release, and how geopolitics plays into this (is Mythos access being offered to the Chinese government?).

It'll be curious what happens when OpenAI ships their equivalent coding model upgrade... especially if they YOLO the release without any responsible disclosure periods.


> Which begs questions about whether closed source will provide any protection (it doesn't appear so, given how able AI tools already are at disassembly?)

Disassembly implies that you're still distributing binaries, which isn't the case for web-based services. Of course, these models can still likely find vulnerabilities in closed-source websites, but probably not to the same degree, especially if you're trying to minimize your dependency footprint.


You're still at the point that any known or unknown disclosure of your binary puts you at risk. At best it's a false sense of security.


> it doesn't appear so, given how able AI tools already are at disassembly?

If that's your concern, the shareware industry developed tools to obfuscate assembly even from the most brilliant hackers.


That's not true, they did do obfuscation but the main sneaky thing they did was to make hackers think that they had found all of the checks, and then hide checks that would only trigger half way through the game. That kind of obfuscation is also not relevant to security vulnerabilities.

AI is already superhuman at reading and understanding assembly and decompilation output, especially for obfuscated binaries. I have tried giving the same binary with and without heavy control flow obfuscation to the same model, and it was able to understand the obfuscated one just fine.


I'm really tired of these claims that Mythos is "nothing but PR hype". It should be at this point eminently clear that the people working at Anthropic believe the things they say about their models. And for Mythos in particular, at this point there are far too many people outside of Anthropic who have seen it and/or the vulnerabilities it has discovered for "it's nothing but hype" to be anything close to a sensible position. I'm not saying we should blindly believe them; they have often used more caution than was entirely warranted (this is, in my opinion, a good thing) but the idea that all of this around Mythos and Glasswing is nothing but marketing hype is nonsense. Might a disinterested 3rd party decide that they think the fire is smaller than Anthropic's smoke warranted? Yes, that's possible. But the idea that it's all smoke and no fire at this point deserves no respect whatsoever.


To be clear I’m not claiming that Mythos is _nothing_ but PR hype, merely that Anthropic is playing its cards really well, which is a claim independent of actual capabilities of their latest model.


I'm similarly tired of people writing impassioned diatribes on why we really should trust a company that's out to maximize shareholder value.

"It's so dangerous that we'll only release it mostly to the companies that have some financial stake in our company"

We don't owe anthropic anything, including benefit of the doubt. They're here to sell products, any other mission statement is a convenience for them.


The position doesn't matter. Nobody sane listens to what the orange or "the USA" says because it could be the complete opposite tomorrow. Which sadly is exactly the position where the orange wants to be. Free rein for him and nobody cares.


I think the Dutch would take issue with you throwing around "orange" like that.


If Alexander or any of his usurping ancestors has a problem then he can go ride a horse over a molehill. Oh, what, is that line a bit too soon? Tandem Triumphans!


I'm kind of surprised that C-suite folks fall for this marketing ploy when many of them are typically very close to the sales process in very high stakes areas. I guess it just shows you that anyone is susceptible to a well done grift. On second thought I'm thinking back through the history of C-suite decisions I've seen first and second hand and I'm not surprised at all.


> The whole artificial scarcity Anthropic created around Mythos / Glasswing is quite brilliant to be honest (I’m Not saying ethical, just brilliant). The commercial gains are one side of course.

You mean the obvious commercial losses caused by keeping an expensively created product effectively off the market altogether?

What the actual fuck is with people who come up with stuff like this?


I think Dario didn't get a Gmail invitation back in the day, and now he's taking it out on everyone.


I'd be okay with our military / NSA having the best model possible.

Now if only the NSA would vet key people in our government, there should be no reason a foreign entity can just hack the FBI director's personal GMAIL, the NSA should be trying to break into their accounts before our enemies do. It's ridiculous that they're not already doing this.


>Now if only the NSA would vet key people in our government

They probably did that for a while.

Sadly, they as an agency were un-vettable to the general public, and abused that position to create tons of blatantly unconstitutional programs that they tried to hide.


I agree. I know some people hate the surveillance stuff, but unfortunately we mostly hear only the bad of what it does; we never hear about the actual good some of these agencies do. I wish they'd release some sort of annual report, but how do you do that without telling your enemies that people are "trying" or being "caught" doing things? It's a pain in the butt.

There are truly evil people in this world, way worse than we probably realize. Our military is not perfect, our country is not perfect, no country or military is, but we generally do our very best to do what is right historically speaking. It's hard to see that if you get lost in the politics of things.


> we generally do our very best to do what is right historically speaking. It's hard to see that if you get lost in the politics of things.

or there's a much simpler explanation: the awful things we do very visibly (or simply casually declassify and admit to decades later¹) are a perfectly reasonable basis to condemn basically the entire history of this country and there's no reason to believe in some sort of political dark matter that balances the moral equation.

¹ for instance, if you were right, you'd think there'd be more widely-agreed success stories coming out like this, but no, it tends to be more in the vein of "we destabilized another democratically elected government because that's not the side we think should have won". i wonder what's up with that


We have asked ourselves that question repeatedly over the past year. While I don’t have a simple solution, I have some mental models that may help.

Overall, there are two knobs to tune, each with a few strategies:

1. reducing the number of times you have to switch context
2. reducing the cost of switching

Let’s start with 1:

- The easiest, of course, is to have fewer agents in parallel.
- Clustering interventions. When starting a new session, use plan mode or similar, have the agent interview you until it has a good idea of what to do, and don’t move away from the window until it’s ready to execute. Read the thoughts to stay on it without switching until you’re confident it understands your intent.
- Invest heavily in verifiability. That means making it easy to check whether the final code correctly and exhaustively captures your intent. Let it write specs first and update specs as necessary during implementation. Have righteous integration tests and “digital twin” mocks for external integrations etc. Have an adversarial prompt that reviews whether the code matches the specs.

Then reduce the cost of switching:

- I usually plan my work to have only one “heavy” task, and then 2-6 agents working on small tasks, ideally straight from tickets. My brain stays with the hard task; the easy ones should be in and out.
- Wait until all the easy ones need input, then do a round of those and go back to the hard task.
- Prompt the agent to give you a brief summary every time it stops (what the goal was, bullet list of what it did so far, what it needs from you).

Finally: be okay with staring at a spinner. Day dream. Listen to music. Enjoy that the robots are doing work for you. Don’t try to optimize every second by also checking emails, responding on Slack, or god forbid opening Hacker News. Just do one thing - code - and allow yourself to live in the terminal for an hour. Then take a break.


What is the latest way to show thoughts in Claude Code? I had to pin 2.1.68 since that seemed to be the last one with thinking shown (even though there wasn't anything about it in the following changelog), but I keep hearing that people using newer versions are still able to see it with some flag(s)?


Claude code hides thoughts - which used to be how I’d stay engaged. Now they’ve got it all hidden.

Otherwise, I love your advice. Thanks.

It’s ok for the mind to wander a bit, and we are lucky to be in a time during engineering where the ability to think has become a priority.


Hyperspell | YC F25 | San Francisco, ONSITE | $150k-220k + generous equity and benefits | hyperspell.com

I'm Manu, founder of Hyperspell. 14 Years ago I found my first job in tech through Who's Hiring. I take this job board seriously :)

Hyperspell is building Context & Memory for AI agents. We're at the intersection of "future of AI and agentic computing" and "this actually works". We're not just changing how our customers use AI in their products, but are constantly iterating on how we use AI to build Hyperspell, and what it means to be a software company in the decades to come.

We're hiring Forward-deployed (customer success), AI, and Platform Engineers: https://jobs.ashbyhq.com/hyperspell

Comment or reach out to founders@hyperspell.com with any questions!


I'm working on something similar, which is directly relevant to Hyperspell. Applied and reached out at the email!


I think this is really interesting... however I'm based out of Chicago. Would you consider a remote position with occasional travel?


Apply to Silkline!


We run an OpenClaw agent for our entire team — he lives in a group chat (although we have DMs too).

- Runs our standups, checks in with everybody EOD on blockers
- Already knows what we shipped on GitHub and Linear so it can focus on the work that's not tracked and summarize it in the morning for everyone
- Helps with debugging customer issues
- Keeps up with Twitter and competitors and lets us know if they launch new features

Besides, I'm honestly blown away by the social aspect of it. I was honestly pretty skeptical at first, but having an AI team mate is actually _fun_. There, I said it. Everybody on the team said they'd be sad if we took it away.

I'll do a write-up on our setup sometime this week, I hope others will find our approach to security posture and multi-tenant usage insightful.


In your experience, did you (or anyone on the team/company) feel that some non-tech people were not pulling their weight, for example project managers/directors who didn't seem to bring enough value? And if you did, have you found that using OpenClaw reduces the need for those positions?

Or has anyone else?


Holy s*t loaded question batman


Now if you have multiple teams each doing this and then have all those agents talk to each other and then report back to your team, you get "AI Hyperchat"[0], which may actually be a really good idea that has the potential to seriously improve intra-organizational communications (disruptively so). See also [1] for a VentureBeat article about the idea.

[0] https://ieeexplore.ieee.org/abstract/document/11105240

[1] https://venturebeat.com/orchestration/ai-agents-turned-super...


We did the same and I wrote (admittedly had AI write) about it.

https://speedscale.com/blog/building-speedy-autonomous-ai-de...


Thanks for sharing. Can you share an estimate of how many tokens it uses over time? Would love to know how much it costs in terms of money.


It all depends on the model and how much you use it, of course. We're running Opus 4.6 and on a light day it spends a dollar or two. This is just a few simple operations like "create a ticket for ..." and its regular heartbeat checks. The heaviest day I see is $110, and on that day we were basically talking to it and having it implement features all day long.


Which underlying model/s do you use to power it?


Would you like to share one small funny thing? I find these models anything but funny.


Fun is not the same as funny.


oh that's interesting, are you getting him to scrape twitter?


[flagged]


Out of curiosity, is it nonsense because you're a scrum master feeling threatened, or nonsense because automating rituals like those seen in SCRUM makes them less about communication and more about just doing the ritual itself?


It is nonsense because it's just nonsense to find a bot "funny" and to have the team say they'd be sad without it. It's total nonsense, if not just marketing.


Your comment is against HN guidelines: https://news.ycombinator.com/newsguidelines.html

In particular: "Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith."


I assume good faith and still find it nonsense


I love the vision, but it glosses over the most important difference between Infrastructure and Companies.

Infrastructure as code is prescriptive. The code is the source of truth, and the world gets created from it.

Company as code is descriptive. It is constantly catching up to meat-space, rather than creating it. Changes are gradual instead of instant roll-outs. Patterns change over time and only get documented later.

Making the company code prescriptive would require an insane amount of discipline that might be more stifling and restrictive than it is freeing.


Nailed it. I think about prescriptivism / descriptivism in terms of these archetypes:

- "Rule followers" think an org will be better off if everyone agreed on a set of rules to follow. At the boundaries, they will think about establishing new rules to clarify and codify new things. Charitably, I'd add that they might remove rules that are obsolete, but we all know this is not sufficiently true in practice: governments, for example, are much more likely to add new rules than to remove old ones.

- "Rule breakers" think that most rules are suggestions. At the boundaries, they will see rules other people are needlessly bound by, and translate those into strategic openings for whatever game they're playing. For better and for worse, start-up ecosystems are full of people like this.

Rule followers want to be told what's allowed, while rule breakers try to figure out what _should_ be allowed from first principles. At the extreme, they tug the world towards authoritarianism or towards anarchy.

This is obviously a spectrum, so everyone has both of these archetypes in them, albeit in different proportions (e.g. most people pay taxes, but almost no one drives the speed limit).


This goes back all the way to the beginnings of Psychology. William James, who is considered somewhat of a godfather of Psychology, argued that all feelings are bodily feelings; i.e. emotions are caused by bodily sensations. Your heart is not beating BECAUSE of anxiety, rather your beating heart IS anxiety. You don’t tremble because you’re afraid, you’re “afraid” (a complex emotion mediated by stories we have) because you tremble.

It’s a theory psychologists and philosophers still argue about.


So if my heart rate stays the same even though I feel anxious I am not anxious? I am thinking I am anxious and that I don't feel good, but I don't really notice any physical symptoms.

E.g. I am worried about upcoming deadlines, or whether I am going to make it, maybe it is not a direct fight or flight anxiety though, but what is it then, just stress?


cf. "The Map is not the Territory"


They're all great, but the 2012 "The Uncensored Picture of Dorian Gray" is the closest to the original script before the editor cut out things that he deemed... checks notes... "too gay".

It restores parts that were cut, and essentially banishes chapter 3 and some other digressions on art history that Wilde added as a literary beard to the footnotes (still there to read, but set in context).

It's not a huge difference honestly, but I believe Oscar Wilde would want you to read that version.



That’s enough of an excuse for me to reread it. Along with Room With A View, two books that made me laugh on every page.

