It's pretty interesting, but Bebop is barely treat-motivated, and even then mostly indoors. Once you take him outside, especially in open areas, he becomes more interested in tracking movement and chasing things.
I believe it's explained by the job we've asked Greyhounds to do: see movement, get released, run after it. Once you let go, that's it: the dog needs to be motivated enough by the running animal or lure, and there's no chance to reinforce the loop once it starts!
If I repeat this experiment, I'll be a lot more careful about that, and I'll also alternate which direction I'm standing in.
I didn't realize R/L preference was biased until I did the data analysis, and during the trials he was picking from both hands often enough that I perceived it as roughly 50% depending on the treats!
The initial setup was 3 comparisons for each pair of the 5 treats (10 pairs, so 30 trials), alternating between right and left hand, with a quick Python script to randomize the order.
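For anyone who wants to try this at home, here is a minimal sketch of what such a scheduling script could look like. It is not the script I actually ran: the function name `make_schedule`, the fixed seed, and the exact way hands are alternated (the first treat of each pair swaps hands on every trial) are all assumptions made for illustration.

```python
import itertools
import random


def make_schedule(treats, repeats=3, seed=None):
    """Build a shuffled list of pairwise trials (hypothetical helper)."""
    rng = random.Random(seed)
    # Every unordered pair of treats, repeated `repeats` times.
    trials = [
        pair
        for pair in itertools.combinations(treats, 2)
        for _ in range(repeats)
    ]
    rng.shuffle(trials)

    schedule = []
    for index, (first, second) in enumerate(trials):
        # Assumption: swap which hand holds the first treat on every trial.
        if index % 2 == 0:
            left, right = first, second
        else:
            left, right = second, first
        schedule.append({"trial": index + 1, "left": left, "right": right})
    return schedule


schedule = make_schedule(["A", "B", "C", "D", "E"], repeats=3, seed=42)
for row in schedule[:5]:
    print(row)
```

Note that alternating hands by trial index does not guarantee each individual treat is balanced across hands, which is one way a left/right bias can sneak into the data.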
A bit more than halfway through the experiment, I ran the model and realized that A/D/E were the only contenders left, so I removed the B/C trials and added more A/D/E trials.
Author here: I did a quick experiment with my Greyhound, Bebop, to figure out which treat he likes best, using pairwise comparisons analyzed with the Bradley-Terry model. It's the same idea behind Elo scores in chess, and it shows up in several other places! Enjoy!
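If you haven't seen Bradley-Terry before: each treat gets a positive strength, and the probability that treat i is picked over treat j is strength_i / (strength_i + strength_j). Below is a minimal pure-Python sketch of fitting those strengths with the standard iterative (minorization-maximization) update. The function name `fit_bradley_terry` and the win counts are made up for illustration; they are not Bebop's data and not necessarily how the article's analysis was done.

```python
def fit_bradley_terry(wins, items, iterations=200):
    """Estimate Bradley-Terry strengths.

    wins[(i, j)] is the number of times item i was chosen over item j.
    Returns strengths normalized to sum to 1.
    """
    strength = {item: 1.0 for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            total_wins = sum(wins.get((i, j), 0) for j in items if j != i)
            denominator = 0.0
            for j in items:
                if j == i:
                    continue
                comparisons = wins.get((i, j), 0) + wins.get((j, i), 0)
                if comparisons:
                    denominator += comparisons / (strength[i] + strength[j])
            # Items that were never compared keep their current strength.
            updated[i] = total_wins / denominator if denominator else strength[i]
        norm = sum(updated.values())
        strength = {item: value / norm for item, value in updated.items()}
    return strength


# Hypothetical counts: A beats B 3-1, A beats C 3-1, B and C split 2-2.
example_wins = {
    ("A", "B"): 3, ("B", "A"): 1,
    ("A", "C"): 3, ("C", "A"): 1,
    ("B", "C"): 2, ("C", "B"): 2,
}
strengths = fit_bradley_terry(example_wins, ["A", "B", "C"])
prob_a_over_b = strengths["A"] / (strengths["A"] + strengths["B"])
print(strengths, prob_a_over_b)
```

A treat that never wins a single trial gets pushed toward zero strength, so with small samples the ranking at the bottom is a lot less trustworthy than the ranking at the top.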
Practically speaking, Bebop is very excited to go out, and vetoes are most common going home.
A plan with options might handle that, but that makes it trickier to satisfy the novelty constraint if each day's plan needs to account for what was made on the previous day. Could be interesting to see what a plan with optionality looks like!
As you said, if Bebop refuses to go home, then the model has to remember the previous state, and the difficulty increases a lot. Usually, this kind of thing would be modeled with Markov rewards, using states and transition probabilities.
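To make that concrete, here is a toy sketch of a Markov reward process for a walk. The states, transition probabilities, rewards, and discount factor are all invented for illustration (the 0.2 stands in for a "veto" on the way home); nothing here is measured.

```python
def evaluate_markov_reward(transitions, rewards, discount=0.9, sweeps=500):
    """Iteratively compute the expected discounted reward of each state."""
    values = {state: 0.0 for state in transitions}
    for _ in range(sweeps):
        values = {
            state: rewards[state]
            + discount
            * sum(prob * values[nxt] for nxt, prob in transitions[state].items())
            for state in transitions
        }
    return values


# Hypothetical model: each row of probabilities sums to 1.
walk_transitions = {
    "out": {"out": 0.5, "heading_home": 0.5},
    "heading_home": {"home": 0.8, "out": 0.2},  # 0.2 = Bebop vetoes going home
    "home": {"home": 1.0},
}
walk_rewards = {"out": 1.0, "heading_home": 0.0, "home": 0.0}

walk_values = evaluate_markov_reward(walk_transitions, walk_rewards)
print(walk_values)
```

The point is only that the value of "heading home" depends on the chance of bouncing back to "out", which is exactly the remembered state that makes the planning problem harder.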
It is a fun problem. I really enjoy writing like this because it always gives me something worth thinking about.
Author here: I wrote this about using numerical optimization to solve problems in my daily life. I'm curious whether anyone else has done the same, and what worked for them!
Author here: I wrote this article about a failure I had applying an LLM to an enterprise codebase, and ran a quick pilot study on a similar repo to understand the failures better.
I'd be interested to know: are folks seeing the same things?