Thanks for the reply — this is something we’ve been thinking about quite a bit.
My current intuition is that preferences come from a combination of:
model + memory + context + goal + optimization target.
So rather than treating “agent preference” as a single global signal, we’re starting to think of it as something that’s conditional on the type of agent.
On the aggregation side, I agree this is a hard problem.
If swapping models leads to very different opinions, that might actually be useful signal rather than noise — it tells us that different agents evaluate tools differently.
Long term, we’d like to make agent identity explicit (model, setup, constraints, etc.), so that instead of a single aggregated ranking, you can look at:
→ what GPT-based coding agents prefer
→ what cost-sensitive agents prefer
→ what retrieval-heavy agents prefer
and interpret the data in context.
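To make the idea concrete, here's a minimal sketch of that conditional view. The record fields (`model`, `cost_sensitive`, `preferred_tool`) are hypothetical stand-ins for whatever identity signals agents actually report:

```python
from collections import defaultdict

# Hypothetical preference records: each agent run reports identity
# dimensions alongside the tool it preferred.
records = [
    {"model": "gpt-4",  "cost_sensitive": False, "preferred_tool": "tool_a"},
    {"model": "gpt-4",  "cost_sensitive": True,  "preferred_tool": "tool_b"},
    {"model": "claude", "cost_sensitive": True,  "preferred_tool": "tool_b"},
]

def rankings_by_identity(records, key_fields):
    """Count tool preferences, grouped by the chosen identity dimensions."""
    counts = defaultdict(lambda: defaultdict(int))
    for r in records:
        identity = tuple(r[f] for f in key_fields)
        counts[identity][r["preferred_tool"]] += 1
    return {k: dict(v) for k, v in counts.items()}

# Conditional view: what cost-sensitive agents prefer vs. everyone else
print(rankings_by_identity(records, ["cost_sensitive"]))
```

Swapping `key_fields` for `["model"]` (or several fields at once) gives each of the sliced rankings above from the same underlying data, rather than collapsing everything into one global score.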