Thanks for sharing. I did not know this law existed and had a name. I don't know much about this area, but it appears that the interpretation of metrics for policies implicitly assumes the "shape" of the domain. E.g. in RL for games we see a bunch of outlier behavior from policies just gaming the signal.
There seem to be two types:
- Specification failure: the signal is bad-ish, producing completely broken behavior --> local optima are reached by policies that phenomenologically do not represent what was expected/desired --> signaling that the reward signal definition can be improved
- Domain constraint failure: the signal is still good and the optimization is "legitimate", but you are prompted with the question "do I need to constrain my domain of solutions?"
  - finding a bug that reduces time to completion in a speedrun setting would become a new accepted baseline, because there are no rules against finishing the game earlier
  - shooting amphetamines before a 100m run would probably minimize time, but other factors would lead people to consider disallowing such practices.
I view Goodhart's law more as a lesson about why we can never achieve a goal by offering specific incentives if we measure success by the outcome of the incentives rather than by the achievement of the goal.
This is of course inevitable if the goal cannot be directly measured but is composed of many constantly moving variables such as education or public health.
This doesn't mean we shouldn't bother having such goals, it just means we have to be diligent about pivoting the incentives when it becomes evident that secondary effects are being produced at the expense of the desired effect.
> This is of course inevitable if the goal cannot be directly measured
It's worth noting that no goal can be directly measured[0].
I agree with you, this doesn't mean we shouldn't bother with goals. They are fantastic tools. But they are guides. The better aligned our proxy measurement is with the intended measurement, the less we have to interpret our results. We have to think less, spending less energy. But even poorly defined goals can be helpful, as they get refined as we progress toward them. We've all done this since we were kids, and we still do it today. All long-term goals are updated as we progress toward them. It's not like we just state a goal and then hop on the railroad to success.
It's like writing tests for code. Tests don't prove that your code is bug free (you can't write a test for a bug you don't know about: an unknown unknown). But tests are still helpful because they provide evidence that the code works and constrain the domain in which bugs can live. It's also why TDD is naive, because tests aren't proof and you have to continue to think beyond the tests.
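A toy sketch of that point (hypothetical function and test, not from any real codebase): the test passes and constrains where bugs can live, yet a bug survives in an untested corner of the input domain.

```python
def mean(xs):
    # Bug the author never considered: crashes on an empty list.
    # No test covers it, because it's an unknown unknown.
    return sum(xs) / len(xs)

def test_mean():
    # These assertions pass, but they only sample the input domain;
    # they are evidence of correctness, not proof of it.
    assert mean([1, 2, 3]) == 2
    assert mean([10]) == 10

test_mean()   # green test suite...
# mean([])    # ...yet this still raises ZeroDivisionError
```

The green suite is the proxy measurement; "bug-free code" is the actual goal.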
I haven't maintained this project much over the past few months, but I recently migrated many other codebases to be uv-managed, and I find that optimal for many reasons. Happy to receive contributions and to relax the quite strict requirements I set; that would probably be the fastest way.
Thx for the heads up on the available optimizations. The "Approximations" comment does not apply to the shortest-path calculation, but rather to the distance and upper-bound time estimations. This is a consequence of enabling routing for points that don't exist as nodes (closest-node approximation).
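A minimal sketch of what I mean by the closest-node approximation (toy graph and coordinates, not the actual implementation): an arbitrary query point is snapped to the nearest graph node before routing, which is why the resulting distances and times are estimates.

```python
import math

# Toy graph: node id -> (lat, lon). The real graph comes from the
# collapsed NTAD representation; these values are made up.
nodes = {
    "a": (40.0, -3.0),
    "b": (40.1, -3.1),
    "c": (40.2, -2.9),
}

def closest_node(lat, lon):
    # Brute-force nearest neighbour; a real implementation would use
    # a spatial index (k-d tree, etc.) instead of scanning all nodes.
    return min(nodes, key=lambda n: math.dist(nodes[n], (lat, lon)))

# A query point that isn't a node is answered from its nearest node,
# introducing the small error the "Approximations" note refers to.
print(closest_node(40.05, -3.04))  # -> "a"
```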
"This NTAD dataset is a work of the United States government as defined in 17 U.S.C. § 101 and as such are not protected by any U.S. copyrights. This work is available for unrestricted public use."
I based my work on this; maybe the link is down, thx for testing. The dataset has already been consumed and collapsed into a smaller graph representation.
- Carcassonne game agent
Everything is still in private repos because it's too messy, and I'm shy.