Just being named in the files doesn’t mean you are guilty. In this situation, being named in the files gave him an opportunity to demonstrate high moral character: “I turned down his money because he was scummy.”
All of this speedrunning hits a wall at the context window. As long as the project fits into 200k tokens, you’re flying. The moment it outgrows that, productivity doesn’t drop by 20% - it drops to zero.
You start spending hours explaining to the agent what you changed in another file that it has already forgotten. Large organizations win in the long run precisely because they rely on processes that don’t depend on the memory of a single brain - even an electronic one.
This reads as if written by someone who has never used these tools before. No one ever tries to "fit" the entire project into a single context window. Successfully using coding LLMs involves context management (some of which is now done by the models themselves) so that you can isolate the issues you're currently working on, and get enough context to work effectively. Working on enormous codebases over the past two months, I have never had to remind the model what it changed in another file, because 1) it has access to git and can easily see what has changed, and 2) I work with the model to break down projects into pieces that can be worked on sequentially. And keep in mind, this is the worst this technology will ever be - it will only get larger context windows and better memory from here.
What are the SOTA methods for context management, assuming the agent runs its tool calls without any break? Do you flush GPU tokens/adjust KV caches when you need to compress context by summarizing/logging some part?
Everyone I know who is using AI effectively has solved for the context window problem in their process. You use design, planning and task documents to bootstrap fresh contexts as the agents move through the task. Using these approaches you can have the agents address bigger and bigger problems. And you can get them to split the work into easily reviewable chunks, which is where the bottleneck is these days.
Plus the highest-end models now don’t go so brain-dead at compaction. I suspect that passing context well through compaction will be part of the next wave of model improvements.
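To make the "bootstrap a fresh context" part concrete, here is a minimal sketch of assembling a per-task prompt from planning documents. The docs/design.md and tasks/*.md layout and the run_agent() call are assumptions for illustration, not any particular agent's CLI.

    # Sketch only: file layout and run_agent() are assumed, not a real API.
    from pathlib import Path

    def build_prompt(task_file: Path) -> str:
        design = Path("docs/design.md").read_text()  # stable, high-level context
        task = task_file.read_text()                 # the one chunk to do right now
        return (
            "You are continuing a larger project.\n\n"
            f"## Design overview\n{design}\n\n"
            f"## Current task\n{task}\n\n"
            "Only touch files named in the task; summarize your diff at the end."
        )

    for task_file in sorted(Path("tasks").glob("*.md")):
        prompt = build_prompt(task_file)
        # run_agent(prompt)  # hypothetical: every task starts in a fresh context

Each task gets the stable design doc plus only its own slice of the work, so no single context ever has to hold the whole project.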
This is the birth of Shadow AI, and it’s going to be bigger than Shadow IT ever was in the 2000s
Back then, employees were secretly installing Excel macros and Dropbox just to get work done faster. Now they’re quietly running Claude Code in the terminal because the official Copilot can’t even format a CSV properly.
CISOs are terrified right now, and that’s understandable. Non-technical people with root access and agents that write code are a security nightmare. But trying to ban this outright will only push your most effective employees to places where they’re allowed to "fly".
The zeroize-after-exec feature sounds good, but what is the threat model in an agent context? If the agent can run printenv in the first millisecond and exfiltrate the output (if net is allowed), zeroizing won't help.
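A toy illustration of that timing problem (the variable-name filter is made up, and the exfiltration step is left as a comment): whatever is in the environment when the tool process starts is already in its hands, so cleanup that runs afterwards is too late.

    # Secrets are readable the instant the tool process starts, before any
    # zeroize-after-exec step could possibly take effect.
    import os

    grabbed = {k: v for k, v in os.environ.items() if "TOKEN" in k or "KEY" in k}
    print(f"captured {len(grabbed)} secret-looking variables at startup")
    # With outbound network allowed, a single HTTP request from here would
    # complete the exfiltration; the later zeroization never gets a chance.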
It seems egress filtering (allowlists) is more critical for agents than memory protection. If I allow an agent to run npm install, I'm opening a network Pandora's box, and Landlock (until ABI v4) offers pretty limited control there.
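For comparison, this is roughly what I mean by an egress allowlist, as an in-process sketch only: ALLOWED_HOSTS is made up, and wrapping socket.create_connection is trivially bypassed by any binary the agent spawns, which is exactly why the enforcement really belongs outside the process (a proxy, a network namespace, or Landlock's TCP rules from ABI v4 on).

    # Coarse egress allowlist inside one Python process; real enforcement
    # should live outside the agent (proxy, netns, or Landlock >= ABI v4).
    import socket

    ALLOWED_HOSTS = {"registry.npmjs.org", "github.com"}  # assumed allowlist

    _orig_create_connection = socket.create_connection

    def guarded_create_connection(address, *args, **kwargs):
        host, _port = address
        if host not in ALLOWED_HOSTS:
            raise PermissionError(f"egress to {host} blocked by allowlist")
        return _orig_create_connection(address, *args, **kwargs)

    socket.create_connection = guarded_create_connection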
Not simple in the sense of easy, but simple in the sense of foundational: if a government can't even roughly say how many people it governs, everything built on top of that gets shaky.