I think agreement has value here too. An LLM that's starting to get a bit sycophantic will rephrase your ideas in a few different ways, and seeing the different presentations is helpful for reconsideration.
From my own experience chatting with LLMs: reading the responses definitely does help with thinking, even when you can see obvious flaws or hallucinations. It gives you something to think about, which a rubber duck can't do.
> The hilarious part, though, is that it's not the AI that's working around the rules. That's the scenario that's been in science fiction, but it's not what's happening. It's the human users making use of our agency to get the AI agents to work around the rules. Despite calling them "agents", current AI agents don't seem to be able to do that particular something. Yet, at least.
Well, yes. Until people are putting the LLMs into actual mechanical robots, "agency" boils down to flipping bits in memory or storage (even if they're ones that humans consider really important, e.g. because they represent a bank ledger) or convincing humans to take action. One can only "work around the rules" to the extent that one can "work".
But even in Asimov's books, at least some of the scenarios involved humans misleading the robots to use them as pawns in a greater scheme.
Many, many years ago I was asked to implement a filter like that for usernames. I said right away that it wasn't going to work well, but I did implement it.
Next internal build, the CEO can't create an account. With his real name.
It worked exactly to spec; I added a debug print and showed everyone the "bad word" it tripped on. The idea was promptly rethought.
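For anyone curious how a filter like that trips on a real name, here's a minimal sketch, assuming a plain substring check. The actual filter's logic and word list aren't given, so the blocklist and the names below are made up and kept deliberately mild:

```python
import re
from typing import Optional

# Hypothetical blocklist, for illustration only.
BLOCKLIST = ["ass", "hell"]


def naive_match(username: str) -> Optional[str]:
    """Return the first blocklisted word found anywhere in the name, else None."""
    lowered = username.lower()
    for word in BLOCKLIST:
        if word in lowered:
            return word
    return None


def whole_word_match(username: str) -> Optional[str]:
    """Same check, but only when the word is not surrounded by other letters."""
    lowered = username.lower()
    for word in BLOCKLIST:
        pattern = rf"(?<![a-z]){re.escape(word)}(?![a-z])"
        if re.search(pattern, lowered):
            return word
    return None


# The "debug print": show which word each (hypothetical) name trips on.
for name in ["Cassandra", "Mitchell", "Alice"]:
    print(f"{name!r}: naive={naive_match(name)!r}, "
          f"whole_word={whole_word_match(name)!r}")
```

Whole-word matching clears the embedded cases, but it does nothing for a name that is itself on the list, so it only narrows the problem.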
Now I'm trying to figure out which word that would be, but yeah.
That reminds me of a bug that my boss's boss found. We tried everything, and my boss at the time forced us to deploy something, anything, and call it fixed. Then someone else saw it half a year later, and I finally figured out the root cause and fixed it (localStorage vs sessionStorage). My boss acted like he didn't know what I was talking about, but I could hear it in his voice. I didn't press too hard; I just pushed the real fix out. It was basically a "client-side" bug: a gift card balance saved in localStorage never updated, so I changed it to sessionStorage. Not quite the CEO, but the guy below the CIO finding a bug can worry just about anyone.
In my case, the regex would have been for a friend to filter reddit or discord slurs, so not as awful.
Two of my co-workers have the last names Dyck and Cox. I've seen others whose last name is literally Dick. And let's not forget the famous actor Dick Van Dyke who strikes out twice on most filters. I've heard several other names from other ethnicities that were straight up "slurs" by some people's standards. The only thing harder than matching a slur is deciding what words count as slurs.
I think I'm not getting something here. Like, sure, the refused prompt "review the code for security issues" could be interpreted as an attempt to discover weaknesses in a running system to exploit them. But we don't generally assume humans are doing something wrong if they are "reviewing code for security issues", and would commonly see no problem with asking each other to do so.
The problem is that a patch to fix a security issue quite often also shines a spotlight on the issue being fixed. Fixing a part of something like this super complicated Project Zero post might not give much of a clue as to what the issue was or how to exploit it: https://projectzero.google/2021/12/a-deep-dive-into-nso-zero...
But that's the exception. Most fixes to security issues point a finger directly at the issue and make it relatively obvious how to exploit, and it generally doesn't take long to figure out from there what you might get out of it.
This has been a problem for a long time, but AIs have made it even worse. It is now cost-effective for a well-resourced attacker to simply monitor the patch stream of an important project like the Linux kernel or nginx and pass every single patch through an AI with the question "Is this a vulnerability and if so how would I exploit it?" It has seriously complicated the process of getting fixes to people before the attackers have a chance to exploit the issues, just as AIs have also been increasing the rate at which serious security issues are found and need to be patched. Previously maintainers could at least sneak a patch in under an innocuous commit message and have a reasonable chance of it being lost in the churn, but now that door is increasingly closed to them as well.
And this is for the case when a security fix lands in the stream of a project and someone externally is watching it with no context. If you also get the complete stream of Mythos finding and fixing the bug it is even easier.
So, yes, any security vulnerability that Mythos will "fix" is also one that it first has to find, and the guardrails are useless if you can just instruct Mythos to "fix" it. On the flip side, suppose Mythos won't fix security bugs, and we project that out to all other models matching this behavior. That creates a world in which the good guys can't secure their code, but the bad guys, who will one way or another get around the guardrails (if by nothing else, simply by stealing the model and modifying it to suit their needs), will be able to break this code that we're not being "allowed" to secure. Since fixing vulns requires finding them first, there isn't a way to "fix" this. Any model that can fix vulns must, by necessity, be able to find them. And it is the fixing we really need to be spread far and wide to secure the world's code.
>pass every single one through an AI with the question
Unfortunately this will just involve said teams running their patches through an AI first, before they're put in the main branch. For businesses it will probably be fine, but it would get very expensive for open source projects.
The webpage linked is an example of everything I wish people would stop doing in web design.
Fortunately, at the bottom there is a link to the "technical documentation" (https://squeezlabs.github.io/handcrank/) which is vastly improved (aside from being light-mode-only and linked from a dark-mode-only marketing page). It also gives me much more interesting information (specifically: models that can apparently run acceptably on a Pi 5).
Please let me read your content with a scrollbar that works the way scroll bars are supposed to, rather than turning everything into a weird slide show where you don't actually know when the next slide is coming. Please let me just click on buttons that look like links to more information, without JavaScript.
Why can't technical people appreciate that we, the silent majority, love having our scroll hijacked? I can't remember the last time I used a scroll bar to navigate a website, but using it to navigate between choppy JavaScript keyframes fills me with joy.
Scroll animations, post-grid floating voids, bouncy house dampening, hyper rounded... everything. These are the 50s Chevy fins of today.
I've enjoyed working with some great designers over the years, Stanford D-School and even wild-raised. All the good ones intuitively steered clear of trends destined to be era-stamp tropes. They'd say, "I can already hear the ghosts of design-future mocking me: 'That's so early-AI' and 'Yo, the mid-20s called and wants their bento grid back.'"
Thirty years ago, Apple made a translucent green ADB "keypad" which had a small LCD display (perhaps only two lines of text?) – marketed towards academics, it allowed students to learn touch-typing without the distractions of an entire computer.
Once you were happy with your touch-typed document, you then plugged the "keypad" directly into your Mac's ADB (keyboard/mouse) port... and the thing would sit there and manually re-type your composition into the computer's texteditor.
----
Education needs such "reduced tech" to return to teaching. Think of this one as a "more advanced typewriter" – although I own a few of those, too, and they're fantastic for pure composition.
I don't use them, but that is surprising! I would program one of my theoretical phone's physical side buttons to handle PgDn/PgUp [†] – similar to my old Kindle's layout. Do phones still have side volume buttons (e.g.)?
[†] Thanks for the better styling, than my former Page_Up &c
No doubt non-technical people have a different UX experience than tech nerds, but I have seen plenty of "normal" people curse at artsy, fluffy design that made their familiar navigation skills useless. Nobody likes having their time wasted.
I agree this type of web design sucks. It's been common for more than a decade - I remember Apple getting criticized for using this on the product page for the old "trash can" Mac Pro in 2013, and it was already widely used back then.
However, it seems pretty clear to me they did this in service of a joke - you have to "crank" your scroll wheel to get to the content, just like you have to crank this device. I think it's funny...
Great prop for a Black Mirror episode about AI use in a post-apocalyptic world. Everywhere you go, all you hear is brrrrr..brrr..brrrr followed by people mumbling.
Totally agree on the atrocious landing page. The technical one is much better, although the power supply circuit, by using a resistive balancer and a linear regulator, wastes a fair amount of power for nothing.
yea i can't stand this. im not so boomer i want every webpage to be like. times new roman white background and just using <p></p> and bulleted lists, but idk i cant even put a finger on what im not enjoying here. think it's possibly using scrolling as a way to try and force me to read through stuff. jokes on them, i can't read. not giving me the agency to click around into info that interests me drives me nuts, chances are im just gonna keep scrolling at 1000mph and eye scan until i see what im looking for virtually zero chance im going to sit through the experience of every carefully designed scroll-slide they've tried to present to me here.
Alright, I'll be the boomer and say that's what I want every webpage to be like. If you want to customize it you can bring your own CSS or download someone else's. The modern web is a nightmare of user-hostile time-thieving behavioral manipulation and our brains would be better off without it.
is another noted LLM-ism.