Hacker News | lottaFLOPS's comments

related research that was also announced this week: https://www.textquests.ai/


They seem to be going for a much simpler route of just giving the LLM a full transcript of the game with its own reasoning interspersed. I didn't have much luck with that, and I'm worried it might not be effective once we're into the hundreds of turns because of inadvertent context poisoning. It seems like this might indeed be what happens, given the slowing of progress indicated in the paper.
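To make the trade-off concrete, here is a rough Python sketch of the "full transcript with reasoning interspersed" prompt described above, next to a bounded variant that keeps only the most recent turns. Everything in it is an assumption for illustration: the `Turn` fields, the `GAME`/`THOUGHT`/`ACTION` labels, and the `windowed_prompt` helper are hypothetical names, not taken from the paper. The windowed version is just one possible way to limit how much stale reasoning the model re-reads, not something either side of this thread reports using.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    observation: str  # text the game printed this turn
    reasoning: str    # the model's own notes for this turn
    action: str       # the command the model issued


def render_turn(index: int, turn: Turn) -> str:
    """Format one turn with the model's reasoning placed between game text and action."""
    return (
        f"[turn {index}]\n"
        f"GAME: {turn.observation}\n"
        f"THOUGHT: {turn.reasoning}\n"
        f"ACTION: {turn.action}"
    )


def full_transcript_prompt(turns: list[Turn]) -> str:
    """Simple route: every turn so far goes into the prompt, so it grows without bound."""
    return "\n\n".join(render_turn(i, t) for i, t in enumerate(turns, start=1))


def windowed_prompt(turns: list[Turn], keep_last: int) -> str:
    """Hypothetical alternative: keep only the last `keep_last` turns, noting what was dropped."""
    if keep_last <= 0:
        raise ValueError("keep_last must be positive")
    start = max(0, len(turns) - keep_last)
    body = "\n\n".join(
        render_turn(start + i, t) for i, t in enumerate(turns[start:], start=1)
    )
    if start:
        return f"[{start} earlier turns omitted]\n\n{body}"
    return body
```

With hundreds of turns, `full_transcript_prompt` keeps feeding every earlier guess back to the model, which is where the context-poisoning worry comes from. `windowed_prompt` caps that at the cost of forgetting early clues.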


Very interesting how they all clearly suck at it. Even with hints, they can't understand the task enough to complete the game.


that's a great tracker. How often is the leaderboard updated?


it’s rolling out to users on all tiers, so no need to wait. I tried it and saw outputs from many others. it’s good. very good


ChatGPT requires logging in with an email. I hesitated on that.

That's why I prefer to wait.


You can create e-mail addresses for single use, even temporary ones.


New people learn new things every day :)

https://xkcd.com/1053/


I really appreciated how this captures the varied perspectives on remote vs. in-office work in a nuanced way.

I also think you hit the nail on the head in explaining the incentives and default behavior we’ve seen from executives. Before the pandemic, I heard the same explanation for open office plans, despite intense dislike from many: “well, around 50% don’t like this, but we can’t please everyone.”

The only solution I can think of is not mandating a single solution. Allowing individuals/teams to choose what works best for them is more complicated and risky, but also offers the greatest potential. I’m not sure there’s a way to make execs feel more comfortable with this approach though. It is risky, and it requires a lot of trust.

I’m hopeful as more companies take the leap to flexible remote policies, the results will speak for themselves. If the fears among the more conservative/risk-averse execs aren’t realized, opinions may soften. At the same time, I’m expecting there will be enough anecdotes to serve as “evidence” for any staunchly-held perspective.

