Hacker News | CraigRood's comments

Windy.app looks good visually, but once you start using it, the UX is all over the place. I always find it frustrating.


Interesting! I used it heavily for years as a sailor and found it intuitive. Clearly, YMMV. :)


It's less about painting a picture yourself; arguably there is little to no value there. OpenAI et al. sell the product of creating pictures in the style of their training material. I see this as direct competition with Studio Ghibli's right to produce their own material with their own IP.


I agree with this. I don't know how to create artistic styles by hand or using any creative software for that matter. All the LLM tools out there gave me the "ability" and "talent" to create something "good enough" and, in some cases, pretty close to the original art.

I rarely use these tools (I'm not in marketing, game design, or any related field), but I can see the problem these tools are causing to artists, etc.

Any LLM company offering these services needs to pay the piper.


My thought is that whilst LLM providers can say "Sorry", there is little incentive to do so, and it would expose the reality that they are not very accurate, nor can they be properly measured. That said, there clearly are use cases where, if the LLM can't reach a certain level of confidence, it should defer to the user rather than guess.


This is actively being worked on by pretty much every major provider. It was the subject of that recent OpenAI paper on hallucinations. It's mostly caused by benchmarks that reward correct answers but don't penalize bad answers any more than simply not answering.

For example, most current benchmarks have a scoring scheme of:

  1    - correct answer
  0    - no answer or incorrect answer

But what they need is something more like:

  1    - correct answer
  0.25 - no answer
  0    - incorrect answer

You need benchmarks (particularly those used in training) to incentivize the models to acknowledge when they're uncertain.
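A tiny sketch of why the binary scheme fails (the scoring function and scenarios are illustrative, not from any real benchmark): under 1/0 scoring, a model that guesses on everything never does worse than one that honestly abstains, so training has no reason to reward uncertainty.

```python
# Hypothetical sketch: how the two scoring schemes rank a model that
# always guesses vs. one that abstains when uncertain.

def score(answers, correct_reward=1.0, abstain_reward=0.0, wrong_reward=0.0):
    """answers: list of 'correct' | 'wrong' | 'abstain' outcomes."""
    rewards = {"correct": correct_reward, "abstain": abstain_reward, "wrong": wrong_reward}
    return sum(rewards[a] for a in answers)

# Ten hard questions: a guesser gets 3 right by luck and 7 wrong...
guesser = ["correct"] * 3 + ["wrong"] * 7
# ...while an honest model answers the 3 it knows and abstains on the rest.
honest = ["correct"] * 3 + ["abstain"] * 7

# Binary scheme: wrong answers cost nothing, so guessing never loses.
assert score(guesser) == score(honest) == 3.0

# Scheme that rewards abstention: honesty now strictly wins (4.75 vs 3.0).
assert score(honest, abstain_reward=0.25) > score(guesser, abstain_reward=0.25)
```

With any positive reward for abstaining (and none for wrong answers), the incentive flips toward acknowledging uncertainty.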


I'm confused by your statement as the article suggests you can get a boarding pass via email.


I don't think users understand the risks. I'm broadly accepting of mechanisms that protect end users. People's entire lives are managed through these small devices. We need much better sandboxing, almost a separate 'VM' for critical apps such as banking and messaging.


The whole notion of "Vibe Coding" was to accept the output regardless and prompt forward. Anything else is moving the goalposts. If you can't accept the outputs and you need an in-depth knowledge of code then these LLMs are not ready for this task.


Awesome post and fun read given I'm a PureGym member myself.

I 'got around' the PIN/QR madness after a week by getting a key fob. Now I never have to open the app...

The attendance API looks to be worth playing with! Nice bonus.


I run Gitea too. Seeing what is happening over at GitHub solidifies my decision.

I'm not too concerned over my public-facing repos; Amazon and OpenAI seem to love 'em! And I have ultimate control over my private repos (nothing juicy). I can't say I trust Microsoft not to do something I don't like at some point in the future.

Edit: I should say I wish Phabricator got more love; that was a great tool!


My private repos would contaminate the Copilot LLM for life. Haha.

Have fun Microsoft getting that foul apple out of your system again.


I was playing with it yesterday and every single session gave me factually incorrect information.

Speed and ease of use is one thing, but it shouldn't be at the cost of accuracy.


If you are trying to get facts out of an LLM, you are using it wrong. If you want a fact, the LLM should use a tool (e.g. web search, RAG, etc.) to retrieve a document that contains the fact (a Wikipedia page, documentation, etc.), then parse that document for the fact and return it to you.
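The flow described above can be sketched roughly as follows (all names here are illustrative, not any real provider's API; the hard-coded corpus stands in for a real search or RAG backend):

```python
# Minimal sketch of a tool-grounded fact lookup: route the question
# through a retriever instead of letting the model answer from memory.

def retrieve(query):
    # Stand-in for a real tool such as web search or a RAG index;
    # a tiny hard-coded corpus plays the role of Wikipedia here.
    corpus = {
        "capital of france": "Wikipedia: Paris is the capital of France.",
    }
    return corpus.get(query.lower().strip(), "")

def answer_with_tool(question):
    doc = retrieve(question)
    if not doc:
        # No grounding document found: defer to the user instead of guessing.
        return "I couldn't find a source for that."
    # In a real system the LLM would parse the document for the fact;
    # here we simply return the retrieved text with its provenance.
    return doc

print(answer_with_tool("capital of France"))
```

The key design point is the fallback branch: when retrieval comes back empty, the system says so rather than hallucinating an answer.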


These tools are literally being marketed as AI, yet they present false information as fact. 'Using it wrong' can't be an argument here. I would rather the tool were honest about its confidence levels and offered mechanisms to research further, then fed that fact back into the 'AI' for the next step.


Thing is, even with users who don't use up their quota, these AI companies are still losing money. This isn't a case of the small users subsidising the large ones. The true costs of AI have yet to be revealed.

