Bah. It's a really cool idea, but a rather crude way to measure the outputs.
If you just ask the model in plain text, the actual "decision" about whether it detected anything is made by the time it outputs the second word ("don't" vs. "notice"). The rest of the output builds up from that one token and is not that interesting.
A way cooler way to run such experiments is to measure the actual token probabilities at such decision points. OpenAI has a logprobs API for that; I don't know about Anthropic. If not, you can sort of proxy it by asking the model to rate on a scale from 0-9 (must be a single token!) how much it thinks it's being influenced. The score must be the first token in its output though!
Another interesting way to measure would be to ask it for a JSON like this:
"possible injected concept in 1 word" : <strength 0-9>, ...
Again, the rigid structure of the JSON will eliminate the interference from the language structure, and will give more consistent and measurable outputs.
It's also notable how over-amplifying the injected concept quickly overpowers the pathways trained to reproduce the natural language structure, so the model becomes totally incoherent.
I would love to fiddle with something like this in Ollama, but am not very familiar with its internals. Can anyone here give a brief pointer where I should be looking if I wanted to access the activation vector from a particular layer before it starts producing the tokens?
> I would love to fiddle with something like this in Ollama, but am not very familiar with its internals. Can anyone here give a brief pointer where I should be looking if I wanted to access the activation vector from a particular layer before it starts producing the tokens?
Look into how "abliteration" works, and look for GitHub projects. They have code for finding the "direction" vector and then modifying the model (I think you can apply it at inference time only, or merge the modifications back into the weights).
I am playing around with an interactive workflow where the model suggests what could be wrong with a particular chunk of code, the user selects one of the options, and the model immediately implements the fix.
Biggest problem? Total Wild West in terms of what the models try to suggest. Some models suggest short sentences, others spew out huge chunks at a time. GPT-OSS really likes using tables everywhere. Llama occasionally gets stuck in a loop of "memcpy() could be not what it seems and work differently than expected", followed by a handful of similar suggestions for other well-known library functions.
I mostly got it to work with some creative prompt engineering and cross-validation, but having a model fine-tuned for giving reasonable suggestions that are easy to understand at a quick glance would be way better.
I haven't tried your exact task, of course, but I've found a lot of success in using JSON structured output (in strict mode), and decomposing the response into more fields than you would otherwise think useful. And making those fields highly specific.
For example: make the suggestion output an object with multiple fields, naming one of them `concise_suggestion`. And make sure to take advantage of the `description` field.
For people not already using structured output, both OpenAI and Anthropic consoles have a pretty good JSON schema generator (give prompt, get schema). I'd suggest using one of those as a starting point.
Adding entire files into the context window and letting the AI sift through it is a very wasteful approach.
It was adopted because trying to generate diffs with AI opens a whole new can of worms, but there's a very efficient approach in between: slice the files on the symbol level.
So if the AI only needs the declaration of foo() and the definition of bar(), the entire file can be collapsed like this:
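Something along these lines (a purely illustrative sketch; foo() and bar() are just the stand-ins from above, and the exact collapsed form depends on the tooling):

    // file.cpp as the model sees it:
    int foo(int x);                  // declaration only -- body not needed

    /* ...all other symbols in the file omitted... */

    int bar(int x)                   // full, editable definition
    {
        return foo(x) + 1;
    }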
Any AI-suggested changes are then easy to merge back (renamings are the only notable exception), so it works really fast.
I am currently working on an editor that combines this approach with the ability to step back and forth between edits, and it works really well. I absolutely love the Cerebras platform (they have a free tier directly and a pay-as-you-go offering via OpenRouter). It can get very annoying refactorings done in one or two seconds from single-sentence prompts, and it usually costs about half a cent per refactoring in tokens. It's also great for things like applying known algorithms to data structures spread across many files, where including all the files would kill the context window, but pulling in individual types works just fine with a fraction of the tokens.
This works if your code is exceptionally well composed. Anything less can lead to Looney Tunes levels of goofiness in behavior, especially if there's as little as one or two lines of crucial context elsewhere in the file.
This approach saves tokens in theory, but I find it can lead to wastefulness as the model tries to figure out why things aren't working, when loading the full file would have solved the problem in a single step.
It greatly depends on the type of work you are trying to delegate to the AI. If you ask it to add an entire feature at a time, file-level context could work better. But the time and costs go up very fast, and it's harder to review.
What works for me (adding features to huge interconnected projects) is to think through what classes, algorithms, and interfaces I want to add, and then give very brief prompts like "split class into abstract base + child like this" and "add another child supporting x, y and z".
So, I still make all the key decisions myself, but I get to skip typing the most annoying and repetitive parts. Also, the code doesn't look much different from what I would have written by hand; it just gets done about 5x faster.
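To make that concrete, here's a rough, made-up sketch of what prompts like "split class into abstract base + child" followed by "add another child" tend to produce (all names are invented for illustration):

    struct Data;                                // stand-in for some project type

    // before: a single concrete class
    //   class Exporter { public: void ExportCsv(const Data &d); };

    // after "split class into abstract base + child":
    class ExporterBase {
    public:
        virtual ~ExporterBase() = default;
        virtual void Export(const Data &d) = 0;
    };

    class CsvExporter : public ExporterBase {
    public:
        void Export(const Data &d) override;    // old ExportCsv() body moves here
    };

    // after "add another child supporting x, y and z":
    class JsonExporter : public ExporterBase {
    public:
        void Export(const Data &d) override;
    };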
Yep, and it collapses in the enterprise. The code you're referencing might well be from some niche vendor's bloated library with multiple incoherent abstractions, etc. Context is necessarily big.
Ironically, that's how I got the whole idea of symbol-level edits. I was working on a project like that, and realized that a lot of the work is actually fairly small edits. But to do one right, you need to look through a bunch of classes, abstraction layers, and similar implementations, and then keep in your head how to get an instance of X from a pointer to Y, etc. Very annoying, repetitive work.
I tried copy-pasting all the relevant parts into ChatGPT and gave it instructions like "add support for X to Y, similar to Z", and it got it pretty well each time. The bottleneck was really pasting things into the context window, and merging the changes back. So, I made a GUI that automated it - showed links on top of functions/classes to quickly attach them into the context window, either as just declarations, or as editable chunks.
That worked faster, but navigating to definitions and manually clicking on top of them still looked like an unnecessary step. But if you asked the model "hey, don't follow these instructions yet, just tell me which symbols you need to complete them", it would give reasonable machine-readable results. And then it's easy to look them up on the symbol level, and do the actual edit with them.
It doesn't do magic, but it takes most of the effort out of getting the first draft of an edit, which you can then verify, tweak, and step through in a debugger.
Totally agree with your view on the symbolic context injection. Is this how things are done with code/dev AI right now? Like if you consider the state of the art.
There's a pretty good sweet spot in between vibe coding and manual coding.
You still think out all the classes, algorithms, complexities in your head, but then instead of writing code by hand, use short prompts like "encapsulate X and Y in a nested class + create a dictionary where key is A+B".
This saves a ton of repetitive manual work, while the results are pretty indistinguishable from doing all the legwork yourself.
I am building a list of examples with exact prompts and token counts here [0]. The list is far from being complete, but gives the overall idea.
I'm still finding the right sweet spot personally. I would love to only think in architecture, features, interfaces, and algorithms, and leave the class and function design fully to the LLM. At this point that almost works, but it requires handholding and some retroactive cleanup. I still do it because it's necessary, but I grow increasingly tired of having to think so close to the code level, as I see it more and more as an implementation detail (in before the code maximalists get on my case).
Most of the AI hiccups come from the sequential nature of generating responses. It gets to a spot where adhering to code structure means token X, and fulfilling some common sense requirement means token Y, so it picks X and the rest of the reply is screwed.
You can get way better results with incremental refinement. Refine a brief prompt into a detailed description. Refine the description into requirements. Refine the requirements into specific steps. Refine the steps into modified code.
I am currently experimenting with several GUI options for this workflow. Feel free to reach out to me if you want to try it out.
Cool. The part that is released already is described here [0]. You can point at some code and give brief instructions, and it will ask the model to expand them, giving several options.
E.g. if you point at a vector class and type just "Distance()" as the prompt, it will make assumptions like "you want to add a function calculating the distance from (0,0)", "a function calculating the distance between 2 vectors", etc. It runs pretty fast with models like LLaMA, so you can get small routine edits done much faster than by hand.
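For a toy 2D vector class, the two candidate expansions would look roughly like this (illustrative only, not the tool's literal output):

    #include <cmath>

    struct Vec2 {
        double x = 0, y = 0;

        // option 1: distance from (0,0)
        double Distance() const {
            return std::sqrt(x * x + y * y);
        }

        // option 2: distance between 2 vectors
        double Distance(const Vec2 &other) const {
            double dx = x - other.x, dy = y - other.y;
            return std::sqrt(dx * dx + dy * dy);
        }
    };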
The part I am currently experimenting with is one-click commands like "expand" or "I don't like the selected part. Give me other options for it". I think I'll get it to a mostly usable state around Monday. Feel free to shoot an email to the address on the contact page and I'll send you a link to the experimental build.
I use hierarchical trees of guidelines (just markdown sections) that are attached to every prompt. It's somewhat wasteful in terms of tokens, but works well. If the AI is not creating a new wrapper, it will just ignore the "instructions for creating new wrappers" section.
It works with low-level C/C++ just fine, as long as you rigorously include all relevant definitions in the context window, provide non-obvious context (like the lifecycle of various objects), and keep your prompts focused.
Things like "apply this known algorithm to that project-specific data structure" work really well and save plenty of time. Things that require a gut feeling for how things are organized in memory don't work unless you are willing to babysit the model.
I don't see it as anything bad. LLMs are interesting, alright. They can very quickly do a lot of mind-numbing work that used to be done by hand, and then they can stumble and produce total nonsense that no human being would even consider writing. And then you tweak the prompt in a weird way, and you're suddenly back in business.
To me, it feels like studying a new physical phenomenon. Like when Nikola Tesla was playing around with coils and wires, eventually leading to the creation of an entire industry.
Except, with LLMs, you don't need multi-million-dollar equipment to play around with models. You can get pretty cool stuff done with a regular GPU, and even cooler stuff if you use the cloud.
I would say, if you are not spending some spare time fiddling around with LLMs, trying to get them to do some of the work you would otherwise do by hand, you are missing out.
You can't trust LLMs to copy-paste code, but you can explicitly pick what should be editable, and also review the edits in a more streamlined way.
I am actually working on a GUI for just that [0]. The first problem is solved by explicit links above functions and classes for including them in the context window (with an option to drop function bodies and keep just the declarations). The second is solved by a special review mode that auto-collapses functions/classes that were unchanged, plus an outline window that shows how many blocks were changed in each function/class/etc.
The tool is still very early in development with tons more functionality coming (like proper deep understanding of C/C++ code structure), but the code slicing and outline-based reviewing already work just fine. It also works with DeepSeek, or any other model that can, well, complete conversations.
It's not really that specific. There's actually a hidden command there for comparing the current source file against an older version (otherwise, good luck testing the diff GUI without pre-recorded test cases). If anyone's interested, it can very easily be converted into a proper feature.
That said, when you review human work, the granularity is usually different. I've actually been heavily using AI for minor refactoring like "replace these 2 variables with a struct and update all call sites", and the reviewing flow is just different. AI makes fairly predictable mistakes, and once you get the hang of it, you can spot them before you even fully read the code. Like groups of 3 edits for all call sites, and one call site with 4. Or things like removed comments, or renamed variables you didn't ask to rename. Properly collapsing the irrelevant parts makes a much bigger difference than with human-made edits.
It's just faster and less distracting. What is a total game-changer for me, is small refactoring. Let's say, you have a method that takes a boolean argument. At some point you realize you need a third value. You could replace it with an enum, but updating a handful of call sites is boring and terribly distracting.
With LLMs I can literally type "unsavedOnly => enum Scope{Unsaved, Saved, RecentlySaved (ignore for now)}" and that's it. It will replace the "bool unsavedOnly" argument with "Scope scope", update the check inside the method, and update the callers. If I had to do it by hand each time, I would have lazied out and added another bool argument, or some other kind of sloppy fix, snowballing the technical debt. But if LLMs can do all the legwork, you don't need sloppy fixes anymore. Keeping the code nice and clean no longer means a huge distraction that kicks you out of the zone.
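Roughly, the edit looks like this (a sketch with a made-up Save() method; the enum and argument names are the ones from the prompt above):

    // before:
    //   void Save(bool unsavedOnly);
    //   ...
    //   Save(true);

    enum Scope { Unsaved, Saved, RecentlySaved /* ignore for now */ };

    void Save(Scope scope);          // was: void Save(bool unsavedOnly)

    // call sites get updated too:
    //   Save(Scope::Unsaved);       // was: Save(true)
    //   Save(Scope::Saved);         // was: Save(false)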
Looked into it a lot. There are deterministic refactoring tools for things like converting for into foreach, or creating a constructor from a list of fields, but they still don't cover a lot of use cases.
I tried using a refactoring tool for reordering function arguments. The problem is, clicking through various GUIs to get your point across is again too distracting. And there are still too many details. You can't say something like "the new argument should be zero for callers that ignore the return value". It's not deterministic, and each case is slightly different from the others. But LLMs handle this surprisingly well, and the mistakes they make are easy to spot.
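For example (the function is hypothetical, but this is the kind of fuzzy instruction I mean):

    struct Device;

    // prompt: "add a timeoutMs argument; make it zero for callers
    //          that ignore the return value"

    int QueryStatus(Device &dev, int timeoutMs);   // was: int QueryStatus(Device &dev)

    // caller that uses the result keeps a real timeout:
    //   int status = QueryStatus(dev, 500);
    // caller that ignores the result gets zero:
    //   QueryStatus(dev, 0);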
What I'm really hoping to do some day is a "formal mode" where the LLM would write a mini-program to mutate the abstract syntax tree based on a textual refactoring request, thus guaranteeing determinism. But that's a whole new dimension of work, and there are numerous easier use cases to tackle before that.
It's not the IP, it's sadly how people react. Some folks will be appreciative of help, credit to them. Others will immediately get back how they tried it, it didn't work and now they need you to rewrite everything, or do their project for them, or redesign your product to match what they want it to be. And if you politely refuse, it quickly escalates to threats of trashing your business through every channel, and other things.
So, the safest thing to do is not give details at all, or "leak" them like another reply in this thread mentions.
Sadly, that stuff backfires. The researcher will publish your response along with some snarky remarks about how you are refusing to fix a "critical issue", and next time you are looking for a job and HR googles your name, it pops up, and -poof-, we'll call you later.
I used to work on a kernel debugging tool and had a particularly annoying security researcher bug me about a signed/unsigned integer check that could result in a target kernel panic with a malformed debug packet. As if you couldn't do the same by just writing random stuff at random addresses, since you are literally debugging the kernel with full memory access. Sad.
Just be respectful and not snarky. And be clear about your boundaries.
What I do is I add the following notice to my GitHub issue template: "X is a passion project and issues are triaged based on my personal availability. If you need immediate or ongoing support, then please purchase a support contract through my software company: [link to company webpage]".
I wish people would make a distinction regarding the size/scope of the AI-generated parts. Like with video copyright law, where a 5-second clip from a copyrighted movie is usually considered fair use and not frowned upon.
Because for projects like QEMU, current AI models can actually do mind-boggling stuff. You can give it a PDF describing an instruction set, and it will generate you wrapper classes for emulating particular instructions. Then you can give it one class like this and a few paragraphs from the datasheet, and it will spit out unit tests checking that your class works as the CPU vendor describes.
Like, you can get from 0% to 100% test coverage several orders of magnitude faster than doing it by hand. Or refactoring, where you want to add support for a particular memory virtualization trick, and you need to update 100 instruction classes based on a straightforward, but not 100% formal, rule. A human developer would be pulling their hair out, while an LLM will do it faster than you can get a coffee.
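Purely as an illustration of the shape of that output (a made-up mini instruction set, nothing like QEMU's actual code):

    #include <cassert>
    #include <cstdint>

    struct CpuState { uint32_t regs[16] = {}; bool zeroFlag = false; };

    // hypothetical wrapper generated from an "ADD rd, rn, rm" description
    struct AddInstruction {
        unsigned rd, rn, rm;
        void Execute(CpuState &cpu) const {
            cpu.regs[rd] = cpu.regs[rn] + cpu.regs[rm];
            cpu.zeroFlag = (cpu.regs[rd] == 0);
        }
    };

    // hypothetical unit test generated from a couple of datasheet paragraphs
    int main() {
        CpuState cpu;
        cpu.regs[1] = 2;
        cpu.regs[2] = 3;
        AddInstruction{0, 1, 2}.Execute(cpu);
        assert(cpu.regs[0] == 5);
        assert(!cpu.zeroFlag);
        return 0;
    }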
Not all jurisdictions are the US. Not all jurisdictions allow fair use; some instead have specific fair dealing laws, and some have neither, meaning that every use has to be cleared.
There are simple algorithms that everyone will implement the same way down to the variable names, but aside from those fairly rare exceptions, there's no "maximum number of lines" metric to describe how much code counts as "fair use", regardless of the licence of the code being "fair use"d in your scenario.
Depending on the context, even in the US that 5-second clip would not pass fair use doctrine muster. If I made a new film cut entirely from five second clips of different movies and tried a fair use doctrine defence, I would likely never see the outside of a courtroom for the rest of my life. If I tried to do so with licensing, I would probably pay more than it cost to make all those movies.
Look up the decisions over the last two decades over sampling (there are albums from the late 80s and 90s — when sampling was relatively new — which will never see another pressing or release because of these decisions). The musicians and producers who chose the samples thought they would be covered by fair use.
Qemu can make the choice to stay in the "stone age" if they want. Contributors who prefer AI assistance can spend their time elsewhere.
It might actually be prudent for some (perhaps many foundational) OSS projects to reject AI until the full legal case law precedent has been established. If they begin taking contributions and we find out later that courts find this is in violation of some third party's copyright (as shocking as that outcome may seem), that puts these projects in jeopardy. And they certainly do not have the funding or bandwidth to avoid litigation. Or to handle a complete rollback to pre-AI background states.