> for the second, relevant question (would allowing this data make it easier and far more useful for gerrymandering and advertisement), you are obviously wrong.
Really? Why? When has gerrymandering ever relied on identifying individuals? Have any advertisers ever tried to use census data to identify individuals? That strikes me as highly unlikely - they are gonna use Facebook and Google, not some government database they’d have to deanonymise.
Gerrymandering is most effective when you know exact voting patterns of each household so you can draw the lines to get the result you want. Differential privacy blurs those boundaries and provides more room for the partisan hacks to make a fatal mistake.
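To make the "blurring" concrete, here is a minimal sketch of the textbook Laplace mechanism applied to per-block counts. This is an illustration under assumptions, not the Census Bureau's actual procedure: the block names, the counts, and the helper names `laplace_noise` and `noisy_counts` are all made up for the example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_counts(counts: dict[str, int], epsilon: float,
                 rng: random.Random, sensitivity: float = 1.0) -> dict[str, float]:
    """Return the counts with Laplace noise of scale sensitivity / epsilon added."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    return {block: n + laplace_noise(scale, rng) for block, n in counts.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical counts for three blocks of very different sizes.
    blocks = {"block_a": 3, "block_b": 41, "block_c": 250}
    for eps in (0.1, 1.0):
        print(eps, noisy_counts(blocks, eps, rng))
```

The noise scale is the same for every block, so a block of 3 is blurred far more in relative terms than a block of 250, and a smaller epsilon means more noise everywhere.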
Similar example: the Ohio legislature makes it illegal to drive with any THC or cannabis products in the passenger compartment, to crack down on people driving high, but there is nothing to prevent you from driving with an open bottle of prescription opiates or benzos and popping those while you drive.
Bad choice of example, then. Restricting things that are uniquely critical to planning and executing school shootings is a highly desirable outcome for regulation, in the eyes of a society that desires its youth to grow up without constant threat of murder at their mandatory educational institutions. That desire is not particularly uniform in the U.S. right now, in contrast with much of the world. Choosing murder sprees as an example supports regulations that have societal safety benefits, which is the opposite of what was intended. Perhaps a different example might have the desired effect?
"We shouldn't try to be like the big browsers because that's not what our Community wants."
This is just a path to irrelevance. Firefox had the ambition to be the default browser, what Chrome is now! It's a shame if they're going to spiral off into their niche.
You don't explain how scoring works, maybe it's obvious to MTG players? If you're using gpt 5.5, is there a possibility that it is biased in favour of models that think the way it does?
The scoring is just based on a simple prompt, which is given the game state at the start and end of the turn, the log of tool calls, and the final turn summary. The prompt asks the judge to rate the quality of the simulation from 0 to 10, and to give a pass or fail on whether the turn was legal.
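Roughly, the judging step could be sketched like this. It is not the actual benchmark code: the prompt wording, the JSON reply format, and the names `build_judge_prompt`, `parse_verdict`, and `Verdict` are assumptions for illustration, and the call to the judge model itself is left out.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class Verdict:
    quality: int  # 0 to 10
    legal: bool   # True for "pass", False for "fail"


def build_judge_prompt(state_before: str, state_after: str,
                       tool_log: list[str], summary: str) -> str:
    """Assemble the four inputs the judge sees into one prompt string."""
    calls = "\n".join(f"{i}. {call}" for i, call in enumerate(tool_log, start=1))
    return (
        "You are judging one simulated turn of Magic: The Gathering.\n\n"
        f"Game state at start of turn:\n{state_before}\n\n"
        f"Game state at end of turn:\n{state_after}\n\n"
        f"Tool calls:\n{calls}\n\n"
        f"Final turn summary:\n{summary}\n\n"
        "Rate the quality of the simulation from 0 to 10 and say whether "
        "the turn was legal.\n"
        'Answer with JSON only, e.g. {"quality": 7, "legal": "pass"}.'
    )


def parse_verdict(reply: str) -> Verdict:
    """Pull the JSON object out of the judge's reply and validate it."""
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in judge reply")
    data = json.loads(match.group(0))
    quality = int(data["quality"])
    if not 0 <= quality <= 10:
        raise ValueError(f"quality out of range: {quality}")
    legal = str(data["legal"]).strip().lower()
    if legal not in ("pass", "fail"):
        raise ValueError(f"unexpected legality value: {legal!r}")
    return Verdict(quality=quality, legal=(legal == "pass"))
```

Validating the reply strictly means a malformed answer from the judge raises an error instead of quietly being scored as a 0 or a fail.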
It is far from ideal, but from my testing, even underpowered small LLMs that could not complete a single legal turn were reasonably good at judging if a simulation was legal. The final judging was all done by gpt-5.5 (medium) which might have given the OpenAI models an advantage, but from all the simulations I looked at, it seemed pretty fair.
This benchmark ended up being more of a test of how well an LLM can call tools without contradicting itself or backtracking. Most of the failures were not because of breaking Magic rules, but because the model could not sequence the tool calls correctly.
The failure mode seems to be that some models are overly trained to start tool calls, even when the model itself knows that it should not be calling the tool. Neither of those examples was counted as an error because the judge prompt said it was illegal. In both, the model stopped the simulation itself, knowing that it had made a tool error.
The Opus 4.8 examples are especially weird because it will consistently make the same tool call error 2 or 3 times in a row, and it will put things like "placeholder" or "noop" for the tool call reason.
I think "killed, enslaved, and/or tortured most of the citizens of the regions it's already captured" is an exaggeration. That is not to deny Russian war crimes, which are clearly large-scale and horrendous. But I don't think the majority of people in those regions have been killed or enslaved.
I'm a big supporter of Ukraine and have donated to the war effort and hosted Ukrainian refugees.
Do you believe that the majority of inhabitants even of Mariupol have been killed, tortured or enslaved? If so, could you point me to a source for that? Wikipedia reports its population has dropped from about 425K to an estimated 120K, but reports an estimate that 200K people fled; they report a high-end estimate of 25K people killed during the 2022 siege itself.
I wonder how much of this is truly smart, as in planned/intentional behaviour. Couldn’t it just evolve? Suppose you hang around something that you want to eat, and you make a lot of noise. So now predators show up. None of this was planned, but now you have a fitness advantage.
A) I am allergic to the word Just ;-) It means you stop being curious. How about one or more of the following?
B) Say you have a slow optimizer in a fast world: a lot of the time the optimal solution is going to be some form of computational generalization. Now you have meta-optimization. Life seems to enjoy doing this recursively.
C) Crow intelligence is clearly highly evolved, so you're technically correct, best kind of correct. Though here I'd argue that a very parsimonious answer is single-lifespan learned behavior. You're applying an existing learning system, no new mechanisms needed. (As opposed to positing some new evolved fixed action pattern).
D) There's not even anything stopping it from being planned behavior. Searle is struck out because it is biological; and no one can accuse us of anthropomorphism HERE!
E) Actually, for sparse events, planning using a world model can be more parsimonious. Apply an existing model to a new problem; again, no extra mechanism needed. Which one works better for a particular entity in a particular situation depends on tradeoffs. (For a human example, see e.g. memory items vs. checklists vs. airmanship in aviation.)
F) That said, I'd even count evolution as a form of intelligence (well... it's an optimizer at least). I will literally die on this hill, and so will you O:-) (unless you represent optimums as valleys) ---> Plot evolution as a dynamic system in phase space, or with your typical hill-climber/gradient descent representations. How much does the trajectory differ from other optimizers? What happens if the 'terrain' is very bumpy with many local optimums? What if it deforms as you cross it?
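If you want to play with that comparison, here is a toy sketch: a greedy hill climber and a tiny mutate-and-select population on the same bumpy 1D landscape. Everything here is an assumption for illustration (the `fitness` function, the step sizes, the population size), and it is a cartoon of evolution, not a model of it.

```python
import math
import random


def fitness(x: float) -> float:
    # One broad hill with its global optimum at x = 0,
    # plus cosine ripples that create many local optima.
    return -0.1 * x * x + math.cos(3.0 * x)


def hill_climb(x: float, steps: int, step_size: float,
               rng: random.Random) -> list[float]:
    """Greedy local search: only ever accept a strictly better neighbour."""
    trajectory = [x]
    for _ in range(steps):
        candidate = x + rng.uniform(-step_size, step_size)
        if fitness(candidate) > fitness(x):
            x = candidate
        trajectory.append(x)
    return trajectory


def evolve(population: list[float], generations: int, sigma: float,
           rng: random.Random) -> list[float]:
    """Return the best individual of each generation (assumes an even population size)."""
    best_per_generation = [max(population, key=fitness)]
    for _ in range(generations):
        # Selection: keep the fitter half, so the best never gets worse.
        survivors = sorted(population, key=fitness, reverse=True)[: len(population) // 2]
        # Variation: each survivor leaves one mutated offspring.
        offspring = [s + rng.gauss(0.0, sigma) for s in survivors]
        population = survivors + offspring
        best_per_generation.append(max(population, key=fitness))
    return best_per_generation


if __name__ == "__main__":
    rng = random.Random(0)
    climb = hill_climb(x=4.0, steps=200, step_size=0.3, rng=rng)
    start = [rng.uniform(-6.0, 6.0) for _ in range(20)]
    evo = evolve(start, generations=200, sigma=0.3, rng=rng)
    print("hill climber:", climb[-1], fitness(climb[-1]))
    print("population:  ", evo[-1], fitness(evo[-1]))
```

Plot the two trajectories against `fitness` to see where each one settles. To try the deforming-terrain question, make `fitness` depend on the step or generation number as well.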