Hacker News | Geniuzz's comments

> Algorithms written in "pseudo-code", aka a higher level language without type information, are far more readable to a human, and thus likely an LLM too.

What’s the basis of this claim? LLMs are trained on many, many more lines of real code than pseudo-code.

Also, I agree: anecdotally, self-correction is a key benefit of static types. If there is a mistake, it is caught at compile time and not at runtime.


It seems clear to me from first principles.

Humans are trained on human language. LLMs are trained on human language.

Thus something that is easier for a human to understand is likely easier for an LLM to understand.

That higher level language with well named variables reads more comprehensibly than code:VERB with:PREPOSITION types:NOUN, intermixed:ADJECTIVE, stems:VERB from:PREPOSITION first:ADJECTIVE principles:NOUN too:ADVERB


For models as complex as these I'm not confident we can apply arguments from first principles; we could just as easily argue that type information is helpful, from first principles. What is much more useful is empirical evidence, and AutoCodeBench [1] found that LLMs are most proficient in Elixir (dynamic) followed by Kotlin (static), with Rust and PHP at the bottom. So it would seem like, as of publication, typing style doesn't really matter!

[1] https://autocodebench.github.io/


As far as the AI is concerned, it's more like

Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo

versus

Buffalo:PN buffalo:N Buffalo:PN buffalo:N buffalo:V buffalo:V Buffalo:PN buffalo:N

I think the second one makes much more sense.


In the rare case that all your concepts use the exact same descriptive word, you are probably right!

The majority of the time you can infer the type from reading well-written code (to the extent that the shape of the type matters in the context of that piece of code).


If the type can be inferred by the reader it should be inferred by the type system and at least be available to the LLM as a query. But we're also talking about dynamic languages in which type cannot be inferred until runtime. What's the type of x?

x = y + z

Well that depends on the types of y and z, which themselves may depend on the types of other operands, which themselves may not be known until the program actually runs. All that inference takes a lot of thinking, which takes tokens, which cost money. Why not just write the types down? Although we call these things "inference engines" they're really pattern matching explicit tokens, so it's better to actually write down the types so they can be pattern matched than to figure them out at inference time.
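To make the `x = y + z` point concrete, here is a minimal Python sketch. The function names `add_untyped` and `add_typed` are made up for illustration. It only shows that the same unannotated expression produces different types depending on runtime values, while an annotated version carries its intended types in the source text. Note that Python does not enforce annotations at runtime; a static checker such as mypy is what would act on them.

```python
from typing import get_type_hints


def add_untyped(y, z):
    # No annotations: the type of x is only known once y and z have values.
    x = y + z
    return x


def add_typed(y: int, z: int) -> int:
    # Annotations state the intended types in the source itself.
    # (Hypothetical example; Python will not reject other types at runtime.)
    x: int = y + z
    return x


# The same unannotated expression yields a different type per call.
print(type(add_untyped(1, 2)).__name__)      # int
print(type(add_untyped("a", "b")).__name__)  # str
print(type(add_untyped([1], [2])).__name__)  # list

# The annotated version exposes its types without running the body.
print(get_type_hints(add_typed))
```

In the untyped case, a reader (human or model) has to trace where `y` and `z` come from; in the typed case the answer is written next to the names.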


You are basically rehashing the false beliefs of the codeless programming camp. Human language that is 99% correct earns a speech writer a standing ovation, while software that is 99% correct leaves its maker paying a cyber ransom.


I don’t understand the negative sentiment here.

What’s the problem?

Do you think less money should be going into VC?

Just some numbers: ~1.5M housing units are built in the US each year, at an approximate cost of $300k - $400k each. That is $450B to $600B going into housing unit construction every year.

On the other hand, VC has maybe $1T AUM in the US. Maybe 10%-20% of that is deployed every year? So $100B to $200B.
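For anyone who wants to check the back-of-the-envelope math, a quick Python sketch follows. Every input is one of the rough guesses quoted above (not sourced data), and the variable names are just illustrative.

```python
# Rough figures from the comment above; all are estimates, not sourced data.
units_per_year = 1_500_000
cost_low, cost_high = 300_000, 400_000

housing_low = units_per_year * cost_low    # annual housing spend, low end
housing_high = units_per_year * cost_high  # annual housing spend, high end

vc_aum = 1_000_000_000_000
vc_low = vc_aum * 10 // 100   # 10% of AUM deployed per year
vc_high = vc_aum * 20 // 100  # 20% of AUM deployed per year

# Range of VC deployment relative to housing construction spend.
ratio_low = vc_low / housing_high
ratio_high = vc_high / housing_low

print(f"Housing: ${housing_low // 10**9}B to ${housing_high // 10**9}B")
print(f"VC deployed: ${vc_low // 10**9}B to ${vc_high // 10**9}B")
print(f"VC / housing: {ratio_low:.2f} to {ratio_high:.2f}")
```

Under these guesses, annual VC deployment comes out at somewhere between about a sixth and a bit under half of annual housing construction spend.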

What is wrong with that ratio? Could there be better solutions to make more housing cheaper? (fewer regulations, more efficient permitting, etc.)

Moving money from VC to housing without a first-principles approach to what problem you're solving, and how, seems silly.


The problem is they’re pouring insane amounts of money into non-problems. I use Git every day. There’s no problem with Git. Real problems that people suffer with every day, like healthcare and housing and even defense, are doing so pitifully, and we’re spending $17M on improving Git? If you don’t see the ridiculousness, you really are in a bubble.

