The other day I asked AI to one-shot an implementation of hyperbolic trig functions for double-double floats.
I provided a repo (mine) that already implemented double-double arithmetic, trigonometry, and logarithms/exponentials, with plenty of tests.
It produced something that looked good. It had tests, it followed the style of the existing code base, etc. But it was full of shit and outright lies.
After I reviewed it to fix deficiencies, I don't think there was anything left of the original.
I had much more success the previous week using an AI to rubber duck the algorithms to implement trig.
I am incredibly sceptical that just adding more loops, with less critical thinking and review, to brute force our way to a solution is a good idea.
If you’ve worked on a code base built by more people than just you, you don’t understand all of it and you don’t have full control. Part of being an experienced engineer is understanding how to deal with that effectively at scale.
It's an app that uses NFC or, if needed, reads a QR code and does a web request (i.e. needs internet).
Neither Google nor Apple will block that, or take a cut; and it's already available in multiple markets.
This is about taking stuff that already works in one or two countries, designing a similar system that works across countries, and mandating that all banks under ECB supervision implement it.
Digital Markets Act. Also, Apple nearly lost their payment monopoly in Germany, as powerful banks lobbied for a law forcing them to open up. It was passed, but then the banks didn't want to use it. If I had to guess, Apple offered them preferential conditions to avoid setting a precedent.
Lack of negotiation power. Less control over Android than Apple has over iOS.
Google keeps self-sabotaging Android Pay. They lacked market power so cellular carriers blocked it hoping to advance their own payment ecosystem (ISIS). Google changes the payment brand every few years, and fragments it into two separate apps or combines them. It's rather like their messaging strategy.
Just this month I've burned through 80% of my Copilot quota of Claude Opus 4.6 in a couple of days to get it to help me with a silly hobby project: https://github.com/ncruces/dbldbl
It did help. The project had been sitting for 3 years without trig and hyperbolic trig, and in a couple days of spare time I'm adding it. Some of it through rubber ducking chat and/or algorithmic papers review (give me formulas, I'll do it), some through agent mode (give me code).
But if you review the PR written in agent mode, the model still lies to my face, in trivial but hard to verify ways.
Like adding tests that say cosh(1) is this number at that OEIS link, and both the number and the OEIS link are wrong, but obviously tests pass because it's a lie.
I'm not trying to bash the tech. I use it at work in limited but helpful ways, and use hobby stuff like this as a testbed precisely to try to figure out what they're good at in a low stakes setting.
But you trust the plausible-looking output of these things at your own peril.
But if you set up CI, you can pull up the mobile site on your phone, chat with Copilot about a feature, then ask it to open a PR, let CI run, iterate a couple of times, then merge the PR.
All the while you're playing Wordle and reading the news on the morning commute.
It's actually a good workflow for silly throw away stuff.