LifeIsBio · on July 4, 2023

About a year ago I did some work collecting interesting blogs from HN users and shared it here:

https://news.ycombinator.com/item?id=32291993

LifeIsBio · on June 23, 2023

It is exciting! As others in the thread are saying, the costs associated with individuals with severe rare diseases are also very high.

Here’s a recent attempt at quantifying the costs across all rare diseases:

https://chiesirarediseases.com/assets/pdf/chiesiglobalraredi...

LifeIsBio · on May 21, 2023

That’s exactly what happened. :)

LifeIsBio · on May 20, 2023

This is a reference to: https://news.ycombinator.com/item?id=36012360

LifeIsBio · on May 20, 2023

Here's a thread where I fed all of his questions to ChatGPT-4.

https://news.ycombinator.com/item?id=36014796

It seems like his graduate student did him a great disservice by feeding the questions to GPT-3.5.

throwme_123 · on May 20, 2023

This should be the top comment.

Not only by providing the correct SotA, but also noting that the graduate student, probably at an expensive University, was so "cheap" as not to buy the cheap tools for their research. Imagine physicists from the 1900s working without tools and not being able to do experiments because "we would have to buy radium so let's try with free iron that I have instead". "Radioactivity is not a thing".

rahimnathwani · on May 20, 2023

Yes, totally, especially given this was written only a month ago!

  The student referred me to a recent arXiv paper 2303.12712 [cs.CL] about GPT-4, which is apparently behind a paywall at the moment but does even better than the system he could use (https://chat.openai.com/).

I wonder whether the graduate student considered paying the $20 and/or asking Knuth to pay.

LifeIsBio · on April 28, 2023

The game “20 questions” is probably the hardest I’ve seen chatGPT fail.

What’s interesting about the game is that, at first pass, there’s no ambiguity. All questions need to be answered with “Yes” or “No”. But many questions asked during the game actually have answers of “it depends”.

For example, I was thinking of “peanut butter” and chatGPT asked me “Does it fit in your hand?” as well as “Is it used in the kitchen?”. Given my answers, chatGPT spent the back half of its questions on different kitchen utensils. It never once considered backing up and verifying that there wasn’t some misunderstanding.

I played three games with it, and it made the same mistake each time.

Of course, playing the game via text loses a lot of information relative to playing IRL with your friends. In person, the answerer would pause, hum, and otherwise demonstrate that the question asked was ambiguous given the restrictions of the game.

Regardless, it was clear that chatGPT wasn’t accounting for ambiguity.

DonaldPShimoda · on April 28, 2023

> It never once considered backing up and verifying that there wasn’t some misunderstanding.

Of course not; ChatGPT doesn't "consider". It doesn't think, it doesn't know. It can't identify that there was a misunderstanding of its own volition.

All ChatGPT does is use a (very sophisticated!) statistical analysis to generate text that conforms to an expectation of what a human response to a similar prompt might look like. It has been trained well insofar as it is able to produce responses that seem like a human may have written them, but it doesn't reveal cognitive processes like "reconsidering" because it doesn't have any.

schrodingerscow · on April 28, 2023

Wow never heard this comment before

DonaldPShimoda · on April 29, 2023

Comments of that nature will continue so long as there are people who don't understand how language models work (or choose to misrepresent them).

tjr · on April 28, 2023

20-some years ago, I had this "20 questions" handheld electronic game that was eerily good at winning. I imagine it was a bunch of well-programmed tables of data, but in any case, it's certainly possible for a machine to do well at this game.

I think the more we see ChatGPT do things like "oh, I know this game -- I'm going to run a 20-year-old 20 Questions subroutine that is not part of my neural network language model to generate responses", it will become even more impressive.

helen___keller · on April 28, 2023

> I think the more we see ChatGPT do things like "oh, I know this game -- I'm going to run a 20-year-old 20 Questions subroutine that is not part of my neural network language model to generate responses", it will become even more impressive.

Agreed. Incidentally I’ve built a little toy version of a runtime for exactly this purpose - there’s a translation layer that’s given a bunch of available “APIs” (fed through the LLM context), and breaks down a high level goal into a structured series of API calls.

The runtime parses these API calls and natively executes some (e.g. run a program, write to the file system), while others result in LLM invocations.

I’m sure OpenAI and crew are way ahead of me here, of course. I’m excited to see what the future holds in this field.
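
For anyone who wants to picture that kind of runtime, here is a minimal Python sketch. It is not the commenter's actual code: the plan format (a JSON list of steps), the API names `write_file`, `read_file`, and `ask_llm`, and the `fake_llm` stand-in are all assumptions made up for illustration. Some steps run natively and others get forwarded to a model.

```python
import json
from typing import Callable


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in: a real runtime would send the prompt to a model.
    return f"[LLM output for: {prompt}]"


def run_plan(plan_json: str, llm: Callable[[str], str] = fake_llm) -> list:
    """Parse a plan (JSON list of {"api": ..., "args": {...}}) and run each step."""
    files = {}  # in-memory stand-in for the file system

    def write_file(args):
        files[args["path"]] = args["text"]
        return f"wrote {len(args['text'])} chars to {args['path']}"

    def read_file(args):
        return files[args["path"]]

    # Steps the runtime executes natively (assumed API names).
    native = {"write_file": write_file, "read_file": read_file}

    results = []
    for step in json.loads(plan_json):
        api, args = step["api"], step.get("args", {})
        if api in native:
            results.append(native[api](args))
        elif api == "ask_llm":
            # Steps that result in an LLM invocation.
            results.append(llm(args["prompt"]))
        else:
            raise ValueError(f"unknown API: {api}")
    return results
```

In a real setup the plan itself would come from the model, after it has been shown the list of available APIs in its context. Rejecting unknown API names is the important part, since the model can always emit a call that does not exist.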

JohnFen · on April 28, 2023

The first AI-style program I ever wrote (about 25 years ago. Yes, I'm old) played 20 questions, but it would "learn" from prior games, so the more you played, the better it performed.

It got extremely good after a few hundred games.
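
The comment doesn't say how that program worked, so the following is only a guess at the general idea: the classic "animal guessing" approach, where the game is a binary tree of yes/no questions and every lost game adds one new question. The names `Node`, `play`, `answer_fn`, and `learn_fn` are made up for this sketch.

```python
class Node:
    """A question (internal node) or a final guess (leaf)."""

    def __init__(self, text, yes=None, no=None):
        self.text = text
        self.yes = yes
        self.no = no

    def is_guess(self):
        return self.yes is None and self.no is None


def play(root, answer_fn, learn_fn):
    """Play one game and return True if the program guessed correctly.

    answer_fn(question) -> bool gives the player's yes/no answer.
    learn_fn(wrong_guess) -> (new_item, question, answer_for_new_item)
    is called only after a wrong guess, to teach the program.
    """
    node = root
    while not node.is_guess():
        node = node.yes if answer_fn(node.text) else node.no

    if answer_fn(f"Is it {node.text}?"):
        return True

    # Wrong guess: turn this leaf into a question that separates
    # the old guess from the item the player was thinking of.
    new_item, question, new_is_yes = learn_fn(node.text)
    old_leaf, new_leaf = Node(node.text), Node(new_item)
    node.text = question
    node.yes, node.no = (new_leaf, old_leaf) if new_is_yes else (old_leaf, new_leaf)
    return False
```

Starting from a single guess such as `Node("a cat")`, the tree gains one distinguishing question per lost game, which is one plausible way a program could get steadily better over a few hundred games.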

smolder · on April 28, 2023

Yeah, ChatGPT could integrate Akinator[0] and trivially be great at the game. Without the help, though, it's a good, revealing benchmark for the LLM's ability.

[0] https://en.akinator.com

nr2x · on April 28, 2023

LLMs, for the foreseeable future, function most reliably as a user interface layer for other systems. I use GPT to “translate” natural language down into the API calls that get real data and it works great. I’d never trust it beyond that.

marcosdumay · on April 28, 2023

Did you train it with "this phrase means this command" examples? How do you make it use your custom API? (Or are you not using a custom API?)

nr2x · on April 28, 2023

Basically yeah, just a pretty detailed set of prompts and then “turn the next message into an api call” and it basically works perfectly.

When I first heard the term “prompt engineer” I rolled my eyes, but now that I’ve gotten into it I see it’s really an art form.
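
A rough Python sketch of the "turn the next message into an api call" pattern, for illustration only. The endpoint names (`get_weather`, `get_stock_price`), the JSON reply shape, and the helper names are all hypothetical, and the actual model call is left out. The sketch covers only the two ends: building the prompt, and checking the model's reply before anything real gets called.

```python
import json

# Hypothetical endpoints and the parameters each one requires.
ALLOWED = {
    "get_weather": {"city"},
    "get_stock_price": {"ticker"},
}


def build_prompt(user_message: str) -> str:
    """Describe the available endpoints, then ask for a JSON-only reply."""
    lines = [
        f"- {name}({', '.join(sorted(params))})"
        for name, params in sorted(ALLOWED.items())
    ]
    return (
        "Turn the next message into an API call. Reply with JSON only, "
        'shaped like {"endpoint": "...", "params": {...}}.\n'
        "Available endpoints:\n" + "\n".join(lines) + "\n"
        "Message: " + user_message
    )


def parse_call(model_reply: str):
    """Validate the model's reply; raise ValueError on anything unexpected."""
    call = json.loads(model_reply)  # malformed JSON raises a ValueError subclass
    if not isinstance(call, dict):
        raise ValueError("reply is not a JSON object")
    endpoint, params = call.get("endpoint"), call.get("params", {})
    if endpoint not in ALLOWED:
        raise ValueError(f"unknown endpoint: {endpoint!r}")
    if not isinstance(params, dict) or set(params) != ALLOWED[endpoint]:
        raise ValueError(f"wrong parameters for {endpoint}: {params!r}")
    return endpoint, params
```

Keeping the model at the translation step and validating its output against a fixed list fits the "never trust it beyond that" stance: the real data still comes from the API, not from the model.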

rjbwork · on April 28, 2023

"Green Glass Door" also completely stumped it. It just could not deduce that the trick was semantic at the word representation level, rather than something related to the object that the word describes.

What's funny about 20 questions is that Akinator has been absolutely slaying it for like 20 years now.
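
To make the "word representation level" point concrete: in the version of Green Glass Door I'm assuming here, something can pass through the door only if its name contains a doubled letter, and the object itself is irrelevant. Under that assumption the whole trick is a few lines of Python (the function name is made up):

```python
def passes_green_glass_door(name: str) -> bool:
    """True if the name has two identical letters in a row (assumed rule)."""
    text = name.lower()
    # Compare each character with its neighbour; spaces never count.
    return any(a == b and a.isalpha() for a, b in zip(text, text[1:]))


for thing in ["green", "glass", "door", "window", "kitten", "cat"]:
    print(thing, passes_green_glass_door(thing))
```

A check like this only ever looks at spelling, which is exactly the level a model tends to miss when it reasons about what the words mean.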

ryukafalz · on April 28, 2023

What happens if you answer with something approximating the hemming and hawing rather than a straight yes or no? You can encode that into text, it's just less common outside of very informal chat conversations.

6gvONxR4sf7o · on April 29, 2023

I just did a 20-questions with it, and was surprised by how badly GPT-4 did. Then for fun, I turned it around and had me be the guesser. It's weird and surreal to play 20-questions when you know that the clue-giver doesn't have an answer in their mind (or more literally, there isn't a single answer in any stateful form while you play), but is instead just eventually saying "yes that's what I was thinking of" when it's statistically appropriate.

numtel · on April 30, 2023

With the code execution plugin, one could theoretically ask chatgpt to generate a salted hash of their answer at the start that's revealed at the end to prove it was correct.

Without any plugins, chatgpt happily returned SHA hashes and salts when I asked it to play rock paper scissors this way. The only trouble was, the hashes were totally wrong.
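
For reference, the commit-and-reveal idea is short when real code computes the hash. This is a minimal sketch using Python's standard library. The function names and the choice of SHA-256 over salt plus answer are assumptions, not anything a plugin is known to do.

```python
import hashlib
import hmac
import secrets


def commit(answer: str):
    """Return (digest, salt). Publish the digest now, keep the salt secret."""
    salt = secrets.token_hex(16)  # random salt so the answer can't be brute-forced
    digest = hashlib.sha256((salt + answer).encode("utf-8")).hexdigest()
    return digest, salt


def verify(digest: str, salt: str, answer: str) -> bool:
    """At the end of the game, check the revealed salt and answer."""
    expected = hashlib.sha256((salt + answer).encode("utf-8")).hexdigest()
    return hmac.compare_digest(expected, digest)


digest, salt = commit("peanut butter")
print(verify(digest, salt, "peanut butter"))  # True
print(verify(digest, salt, "spatula"))        # False
```

The digest is shared before the first question and the salt and answer only at the end. A language model writing out a "hash" token by token is not running this computation, which would explain why the hashes came back wrong.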

stainablesteel · on April 28, 2023

i love your example, i wonder if this kind of game can be implemented in future training scenarios

we as humans understand ambiguity so much easier because we learn to speak and interact before we write, and writing ambiguity is way less obvious if you've never experienced it

eternalban · on April 28, 2023

I'm not sure I would think "food" when someone says they "use [it] in the kitchen". You "use" food? (Used in cooking != used in kitchen, imo)

JohnFen · on April 28, 2023

I use food (including peanut butter) in cooking. I cook in the kitchen. Therefore peanut butter is a thing I use in the kitchen. Seems correct and proper to me.

The ambiguity as I see it is that the kitchen isn't the only place I use peanut butter. I've eaten it (which I think counts as "using") in other rooms. I've even made peanut-butter sandwiches (properly "using" it) in the living room before.

version_five · on April 28, 2023

That's his whole point. It's possible to consider it technically correct, but it's a red herring.

eternalban · on April 28, 2023

Well, the alleged point is challenged. If, while playing this game, the questioner must constantly verify that the other party is using the language properly, you'll exhaust that 20 q limit rather quickly.

- is it used in the kitchen?

- yes.

- [well, kitchen appliances, here we go ..] is it ..?

...

- [aha. meat intelligence no speak proper English?] Is this thing you use in kitchen edible?

- Oh, yeah.

- [oh dear. we can not let meat machines govern this planet...]

smolder · on April 28, 2023

I use peanut butter as an ingredient for sandwiches, usually in my kitchen.

eternalban · on April 28, 2023

Yes. You use edible things in preparing or cooking food (which may happen in the kitchen). 'Use' maps to food prep (the act) but never to prep location. Only in cases where the thing has both general edible and food preparation usage -- "I use honey extensively in the kitchen" for example -- do "use" and "edible" make sense.

yorwba · on April 28, 2023

But peanut butter has general edible and food preparation usage quite similar to honey, doesn't it? You can spread it on a slice of bread to eat directly or use it as a baking ingredient, but you probably wouldn't eat it by the spoonful straight from the container. (Or maybe that's how people usually eat peanut butter, I kind of don't want to know.)

eternalban · on April 28, 2023

guilty as charged: spoon + jar = happy mouth.

DreamyCrab · on April 28, 2023

Yes, I do.

idiocrat · on April 28, 2023

"He saw that gas can explode."

This ambiguous sentence got stuck in my head some 30 years ago, back when AI was popular.

There was a research paper discussing the issue of ambiguity.

DougMerritt · on April 28, 2023

Right -- although many things that are ambiguous in text are disambiguated in actual speech, so the problems that arise with audio speech are not wholly the same as with text.

A classic example is the word "record", which has first syllable stress as a noun, but second syllable stress as a verb. "I bought a RECord" vs "Please reCORD the music".

(in the dominant American dialect; I don't recall about other dialects/countries)

idiocrat · on April 28, 2023

An interesting reprint from 2003:

https://www.drdobbs.com/parallel/understanding-natural-langu...

"Computers still cannot understand natural language as well as young children can. Why is it so hard?"

Source: AI Expert, May 1987

LifeIsBio · on March 13, 2023

I haven't seen anyone mention Anvil[1] yet, but it lets you "Build web apps with nothing but Python." and is a lovely tool that I've successfully used for a handful of side projects.

But as someone who feels most at home with Python, I always love to see new competition in this space.

[1] https://anvil.works/

anakaine · on March 13, 2023

EDIT: My following statement about self hosting is incorrect. You can, in fact, self host.

This looks wonderful, but the inability to self host is a killer from the solo developer point of view. Being limited to 50,000 database rows on the free account isn't ideal.

meredydd · on March 13, 2023

Anvil has self-hosting! Just "pip install anvil-app-server" :)

https://anvil.works/open-source

(I'm a founder)

anakaine · on March 13, 2023

Well, I stand corrected! I was looking through the mobile site and didn't spot it. This makes me want to look a bit harder.

When self hosting, do you gain or miss out on paid features?

amlozano · on March 13, 2023

I totally thought Anvil had self-hosting. I was seriously considering it for my next project. Now not so much.

anakaine · on March 13, 2023

One of the founders corrected my statement. Apparently you can self host.

abraxas · on March 13, 2023

Nice! I haven't seen this one before. Will definitely take a look. Thanks for posting.

LifeIsBio · on Nov 9, 2022

I’ve run into this exact situation multiple times. Searching the whole history would be revolutionary.

emidoots · on Nov 9, 2022

FWIW you can do this with Sourcegraph today, searching over both diffs (code changes)[0] and commit messages[1]:

[0] https://sourcegraph.com/search?q=context:global+repo:%5Egith...

[1] https://sourcegraph.com/search?q=context:global+repo:%5Egith...

(I work there, just commenting on my own though. we're all pretty happy to have competition, more awareness of code search, etc.)

LifeIsBio · on Nov 3, 2022

I've found that people have wildly different definitions of "systems engineering", but this one is lovely. And also, very close to my own. ;)

https://jessimekirk.com/blog/whats_a_systems_engineer/

LifeIsBio · on Nov 2, 2022

I've been using Resh[0] for the past 6 months or so. A rich and queryable shell history is a massive boost in day-to-day productivity. The syncing described here is a pretty cool feature.

[0]: https://github.com/curusarn/resh

ddworken · on Nov 2, 2022

In addition, hiSHtory also supports fish for anyone who uses fish!