
They’ll just find a way to have $0 of profit. You have nothing to tax.

I just want to say please no. Comment chains where two people are arguing are the worst. Negative value. I'm always confused about how they happen here; are people just refreshing their comments page waiting for a reply?

> Negative value.

You understand the irony here? Is the issue that other people are in flame wars or that you open with negative sentiment comments and people respond in kind?

I appreciate your comment because it's an opportunity for me to test that the extension works correctly, surfacing your comment with a little notification. Thank you. :)


Internet validation is worth gold to the mentally destitute.

You must not leave the house. Emergency services responding to ODs is commonplace in SF. It happened at least once per week outside my office. Walgreens (while they were still open) ran audio ads in the store encouraging you to buy narcan.

> It sits next to the doctor helping them focus on you by transcribing the session, it doesn't do anything the doctor can't and definitely doesn't do anything the doctor SHOULD

You said the transcript isn’t available, only the notes/summary. The notes are what the doctor should do; the AI should only transcribe for the doctor’s review.

https://news.ycombinator.com/item?id=47895868


You said you evaluate the error rate every month. How can you do that if you don’t have the recording or transcript?

It’s not about creativity. The incentive to produce drops to zero when an LLM is just going to slurp it up and regurgitate it without some form of compensation (notoriety, money, whatever).

Whichever shitty model they’re using for search is so much better than the free offerings from the other companies. It’s not even close. It’s not going anywhere.

And this will get you like $1M at 45? You can’t retire on that.

$1.8M-$2.2M. Assumes a 6%-7.5% annual return. Does not include employer contribution. Provides $72k-$88k/yr of income, assuming you take Social Security at 67, your continued gains exceed your draw, and your fund perpetuates until you die.

If you retire at 45 won't that significantly impact social security?

It just means you draw ~$2,500/month instead of ~$3,800/month. That turns your $77k/yr income into $107k/yr, but more importantly it helps your retirement account keep growing so it outlives you.
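For what it's worth, the figures in this subthread line up with the standard 4% safe-withdrawal rule; a quick sketch (assuming that rule, which the comments don't name explicitly):

```python
# Sketch of the withdrawal math above: 4% safe-withdrawal rate,
# plus a reduced Social Security benefit taken early.
for egg in (1.8e6, 2.2e6):
    print(f"${egg / 1e6:.1f}M nest egg -> ${egg * 0.04:,.0f}/yr at 4% withdrawal")

# Mid-range example from the thread: a $77k/yr draw plus ~$2,500/month
# of reduced Social Security.
portfolio_draw = 77_000
social_security = 2_500 * 12          # ~$30k/yr
print(f"total: ${portfolio_draw + social_security:,}/yr")  # -> total: $107,000/yr
```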

You can't live on $40,000 a year?

What about property taxes, the occasional $40k visit to the ER for a few stitches?

Does that happen often to you?

No - my hospital visits are £0 ;)

How close is your net worth and age to a million at 45?

Pretty bang on actually.

And how big is your dick too?


Bang average

I definitely could. An American maybe couldn't.

Where can I see the actual prompts and follow ups you fed each model?

So the prompts are tuned and adjusted on a per-model basis. If you look at the number of attempts, each model receives a specific prompt variation. This honestly isn't as much of an issue these days because SOTA models' natural language parsing (particularly in the multimodal ones) has eliminated a lot of the byzantine syntax requirements of the SD/SDXL days.

The template prompt seen in each comparison gets adjusted by a guided LLM whose system prompts are tuned for rewriting prompts. The goal is to foster greater diversity while preserving intent, so the image model has a better chance of getting the image right.

As for your suggestion of posting all the raw prompts, that's actually a great idea. Too bad I didn't think of it until you suggested it. If you multiply it out, there are 15 distinct test cases against 22 models at this point, each with an average of about 8 attempts, so we're talking about thousands of prompts, many of which are scattered across my hard drive. I might try to do this as a future follow-up.


Shouldn’t every model get the same prompt? Seems a bit weird, especially when you can’t see the prompts that were used.

The goal isn’t the prompt itself. The test is whether a prompt can be expressed in such a way that we still arrive at the author's intent, and, of course, to do so in a way that isn't unnatural.

The prompts, despite their variation, are still expressed in natural language.

The idea is that if you can rephrase the prompt and still get the desired outcome, the model demonstrates a kind of understanding; however, more variation attempts are penalized correspondingly: this is treated as a failure of steering, not of raw capability.

An example might help - take the Alexander the Great on a Hippity-Hop test case.

The starter prompt is this: "A historical oil painting of Alexander the Great riding a hippity-hop toy into battle."

If a model fails this a couple of times (multiple seeds), we might use a synonym for hippity-hop; it was also known as a space hopper.

Still failing? We might try to describe the basic physical appearance of a hippity-hop.

Thus, something like GPT-Image-2 scored much higher on the compliance component of the test, requiring only a single attempt, compared with Z-Image Turbo, which required 14 attempts.
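For the curious, the retry loop described above can be sketched roughly like this. The prompt variants come from the Alexander the Great example; `generate_image` and `judge_matches_intent` are hypothetical stand-ins, since the actual evaluation harness isn't shown in the thread:

```python
# Hypothetical sketch of the rephrase-and-retry loop described above.
PROMPT_VARIANTS = [
    # Starter prompt, as given in the test case.
    "A historical oil painting of Alexander the Great riding a "
    "hippity-hop toy into battle.",
    # Synonym swap: hippity-hops are also known as space hoppers.
    "A historical oil painting of Alexander the Great riding a "
    "space hopper into battle.",
    # Last resort: describe the toy's basic physical appearance.
    "A historical oil painting of Alexander the Great bouncing into "
    "battle astride a large rubber ball with a grip handle.",
]

def run_test_case(generate_image, judge_matches_intent, seeds_per_variant=3):
    """Return attempts used; more attempts means a lower steering score."""
    attempts = 0
    for prompt in PROMPT_VARIANTS:
        for seed in range(seeds_per_variant):
            attempts += 1
            if judge_matches_intent(generate_image(prompt, seed)):
                return attempts
    return attempts  # exhausted every variant: a compliance failure
```

Under this scoring, a model that succeeds on the starter prompt (1 attempt) beats one that needs the physical description (many attempts), matching the GPT-Image-2 vs. Z-Image Turbo comparison above.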


Why would you use an LLM for OCR?

Because if it's multimodal, it's oops-all-transformers, and they're pretty much best in class for OCR now, afaik?

Yep, it's pretty damn good compared to classic OCR, even the more lightweight engines I can run locally. The cards just vary too much over time.

Because apparently that's what programming is and can only be these days...
