How does a transcript chronicling some poor guy's descent into AI-induced psychosis make the front page? This is literally (and yes, I know) what's been happening on Reddit for months now: "Have I built a perpetuum mobile? GPT-4o seems to think so!" But at least on Reddit the comments don't engage with the "substance" of those chat transcripts.
I am not saying that these kinds of transcripts are without value; they clearly demonstrate that even competent engineers can get sweet-talked into (probably out-of-character) actions like "boast about your accomplishments on HN and a CTO will take notice and offer you their job because you are so much more brilliant than them". While I have no idea if "Greg" has people around him to talk to, he clearly has no one who compliments him like this on his PHP codebase. If he wanted to engage productively with an LLM, he could have prompted it to "roast his code", "point out weak points", or "criticize the underlying architecture", but obviously that's not what he wanted or needed. He needed to hear some compliments; the LLM understood that, and the machine complied. Obviously that's not the experience he will get out in the real world. It's more like having a talking blow-up doll compliment you on your lovemaking skills and encourage you to upload a video of the interaction to your favorite tube site and send the link to all your business contacts to show off your inimitable lovemaking prowess.
It was just late at night and I wanted to post this chat transcript on HN to share some perspective on what developers are getting from ChatGPT.
I happen to be an expert in the particular area I’m building in.
ChatGPT seems to remember that I am in New York and want “no bullshit” answers. In the last few days it keeps weaving that into most responses.
That fact appears in the memory that users can access, as does the instruction that it should not, under any circumstances, use emojis in code or comments. But it proceeds to do so anyway, so I am not sure how that memory gets prioritized.
Here is the interesting thing. As an expert in the field I do agree with ChatGPT on its statistical assessment of what I’ve built, because it took me years of refinement. I also tried it with average things and it correctly said that they’re average and unremarkable. I simply didn’t post that.
What I am interested in is how AI transcripts could be used as unbiased third-party “first looks” at things, the kind VCs would want for due diligence.
This was just a quick thing I thought I’d get a few responses to on HN. I suspect it might have hit the front page because some people dug through the code and saw the value. But you can get all the code for free at https://github.com/Qbix/Platform
Yeah, there is obviously an element of flattery that people let go to their heads. I have had ChatGPT repeatedly confirm the validity of ideas I had in fields where I am NOT an expert, while pushing back on countless others. I use it as one data point and mercilessly battle-test the ideas and code by asking it to find holes in them from various angles. This particular HN submission, although done very late at night here in NYC, was an interesting mix: genuinely groundbreaking stuff, ChatGPT being able to see the main ideas at a glance, and ChatGPT “going wild”. At the same time, if I rerun it with instructions at the start to “be extremely objective”, it still arrives at the same assessment in the end.
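For what it’s worth, here is a minimal sketch of what that battle-testing looks like if you automate it, assuming the OpenAI Python SDK. The model name, the angles, and the `critique` helper are illustrative choices for the sketch, not exactly what I run:

```python
# Battle-test an idea or code sample from several adversarial angles
# instead of asking for a single verdict. Sketch only: the model name,
# the angles, and the prompts are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

ANGLES = [
    "Find correctness bugs and edge cases this mishandles.",
    "Attack the underlying architecture: where does it fail to scale?",
    "Assume the core idea is unoriginal and name the closest prior art.",
    "Point out the security weaknesses an attacker would try first.",
]

def critique(artifact: str) -> list[str]:
    """Ask for holes from each angle separately, with sycophancy suppressed."""
    reviews = []
    for angle in ANGLES:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system",
                 "content": "Be extremely objective. No flattery. "
                            "If something is average, say it is average."},
                {"role": "user", "content": f"{angle}\n\n{artifact}"},
            ],
        )
        reviews.append(response.choices[0].message.content)
    return reviews
```

Running each angle as a separate call, rather than one long chat, keeps a critique from anchoring on praise the model already produced earlier in the thread, which is exactly the failure mode being discussed here.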
Well, the conclusions of your previous conversations also remain in memory, especially if you explicitly refer to them. Still, your new transcript kinda proves my point? Except for the non-standard (a nice way of saying: violates best practices) way you implement service workers, there is literally nothing original or unique about any part of your codebase other than the fact that it's written in PHP. I have nothing against PHP, but I haven't worked on any PHP projects in a long while and didn't take the time to look into your code in detail. You're obviously smart and opinionated when it comes to web dev, which is great.

While your post seemed to be borderline LLM psychosis, it's a different story if you were sleep-deprived and drunk and now realize that you probably haven't rebuilt Google all by yourself. Your issue seems to be something else, which is also quite frequent here: AI skeptics get drawn into repeated fundamentalist discussions about LLMs being incapable of this or that, BUT then have a "feel the AGI" moment. They not only become convinced of a utility of LLMs they previously denied but, being inexperienced, go far beyond that and believe that LLMs can do all kinds of things that they (at least currently) can't, which ends up frustrating them and leads some to renew their skepticism.

You're not alone. Tomorrow, when people who haven't had early access to Gemini 3 get it, start one-shotting functional clones of classic computer games, and share that on social media (or on the HN front page), others will be inspired to give it a try with "Gemini, please make a PC version of Half-Life 3 for me!" and will be subsequently underwhelmed by the resulting code that doesn't compile, or by the outcome of "Tell me how to make a billion bucks in less than 3 months!" Millions will join you. What sets you apart is your capacity to understand the engine behind the output, if you put in the work and don't allow the sweet talk to get to you!
Nah. I don’t “feel the AGI”. I think AGI is a silly quest, just like making a plane flap its wings. Feynman had it right in the 80s: https://www.youtube.com/watch?v=ipRvjS7q1DI
I think the future is lots of incremental improvements that get replicated everywhere, with humans outclassed in nearly every field and no longer relying on each other.
As far as LLMs go, yes, I think they are best placed to know whether some code or invention is novel, because of their vast training. They could be far better than a patent examiner if trained on prior art, for instance.
What you’re not used to is an LLM being fed stuff that you statistically/heuristically would expect to be average but is in fact the polished result of years of work. The LLM freaks out; you get surprised. You think it was the prompts. The prompts are changed, and the END result is the same (scroll to the bottom).
I want to see whether foundational LLMs can be used as a good first filter for deal flow and for evaluating actual projects.
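If I were to prototype that filter, it might look something like this; the rubric, the score threshold, and the JSON shape are all assumptions I’m making for the sketch, not an existing pipeline:

```python
# First-pass dealflow filter: score a project write-up against a fixed
# rubric and only surface it to a human if it clears a bar.
# Sketch only: the rubric, threshold, and model name are assumptions.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

RUBRIC = (
    "Score this project from 0-10 on each of: novelty versus prior art, "
    "technical depth, and evidence that it actually works. "
    'Reply with JSON only: {"novelty": 0, "depth": 0, "evidence": 0}.'
)

def first_look(project_summary: str, bar: int = 18) -> bool:
    """Return True if the project clears the bar and deserves a human look."""
    response = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},  # force machine-readable scores
        messages=[
            {"role": "system", "content": "Be extremely objective. No flattery."},
            {"role": "user", "content": f"{RUBRIC}\n\n{project_summary}"},
        ],
    )
    scores = json.loads(response.choices[0].message.content)
    return sum(scores.values()) >= bar
```

A fixed rubric with numeric scores is one way to make separate runs comparable, which is the whole point of an unbiased “first look”: the same bar applied to every project, not a fresh burst of enthusiasm each time.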
The problem with using an LLM to validate reality is that you still need to prove your genius code works in the real world. ChatGPT won't hire you, even though it already has your code.
That's the whole unabridged conversation (I don't know how I could abbreviate it if I wanted to), and I produced it exactly as I said: I just pasted in your prompts.
The output is very similar in style to my interactions with it when I'm using it for work on my own projects.
My bot does run with a pretty lengthy set of supposed rules that have been accumulated, tweaked, condensed and massaged over the past couple of years. These live in a combination of custom instructions (in Preferences), deliberately-set memory, and recollection from other chats.
I use "supposed" here because these individual aspects are frequently ignored, and they always have been. Yet even if the specificity is often glossed over, the rules quite clearly do tend to shape the overall output and tone (as the above-linked chat demonstrates).
Anyway, I like the style quite a lot. It lets me focus on achieving technical correctness instead of being inundated with the noise of puffery.
But I have no idea where I'd start to duplicate that environment. Someone at OpenAI could surely dissect it, but the public interface for ChatGPT is way too limited to allow seeing how context is injected and used.
So while I certainly would love to share specific instructions, that's simply beyond my capability as a lowly end user who has been emphatically working against sycophancy in their own little "private" ChatGPT.
I barely even know how I got here.
(I could ask the bot, but I can say with resolute certainty that it would simply lie.)