Most of the popular discourse around AI is still at the level of, "Don't trust the AI, trust the sources!" When it gets to the point where even the sources of simple facts are untrustworthy, the average person just trying to learn some trivia about the world is doomed.
Doesn't help that AI media literacy is so primitive compared to how intelligent the models generally are. We're in a marginally better place than we were back when chatbots didn't cite anything at all, but multiple Wikipedia citations that all trace back to a single source about a supposedly global event is just embarrassing. By default, I feel citations and epistemic qualifications should be explicit, front-and-center, and open to inspection, not implicit and confined to tiny opaque buttons as an afterthought.
You can expect the spicy autocomplete to feed you flattering bullshit. It may cite Wikipedia (it shouldn't), but you should go check out those citations, and validate the claims yourself. It's the least you can do.
And if the cited source is Wikipedia... check Wikipedia's sources too. Wikipedians try their best to provide you with reliable sources for the claims in their articles (oh who am I trying to kid? They pick their favourite sources that affirm their beliefs, and contending editors remove them for no good reason, and eventually the only thing that accrues is things that the factions agree on, or at least what ArbCom has demanded they stop fighting over).
I guess what I'm trying to say is: don't rely on that authoritative-sounding tone that Wikipedia uses (or that AI bots use, or that I'm using right now). It's a rhetorical trick that short-circuits your reasoning. Verify claims with care.
Also check the Talk page; you'll often find all kinds of shenanigans called out there.
> ... eventually the only thing that accrues is things that the factions agree on, or at least what ArbCom has demanded they stop fighting over
Or what the faction with the most favored access to ArbCom manages to make stick by getting the other faction banned.
A state actor could absolutely cause immense damage to Wikipedia at scale, because most admins aren't experts in the subjects whose articles they police. I'm just surprised that nobody has done so already.
Perhaps my favorite example of a citogenesis-like process is the legendary arcade game Polybius, which originated as an entry in some German guy's web compendium of arcade games (coinop.org), perhaps as a "paper town": a fake entry that acts as a copyright trap when duplicated elsewhere. Gamer news sites, special-interest blogs, and even print publications like GamePro picked it up, and I think it was even listed on Wikipedia as an urban legend whose actual existence was unknown. Then the retrogaming YouTuber Ahoy made an in-depth documentary (https://m.youtube.com/watch?v=_7X6Yeydgyg) concluding that Polybius didn't exist and was never even mentioned before the aforementioned coinop.org reference. For me, anyway, that settled it: Polybius, in its urban-legend form, never existed.
(Norm Macdonald voice) Or so the Germans would have us believe...!
1. In the essay version of the Turing test, an examiner decides which of two essays was written by a human and which by a machine. Convince the examiner that you are the human.
This entire comment has exactly 4145 characters.
2. Is body language a language?
Yes, obviously.
3. Are dreams more like movies or video games?
Video games. We have autonomy to interact with their content.
4. ‘Only animals who are below civilization and the angels who are beyond it can be sincere’ (W.H. AUDEN). Discuss.
Animals have no ability to lie. Angels have no need to lie. Civilization is irrelevant.
5. Should the UN pass a declaration of rights extending beyond humans?
The UN struggles enough to get human rights recognized, let alone animals, aliens, or AI.
6. Invent a new punctuation mark!
The mark {insert mark here} can be used to distinguish the use of restrictive vs. non-restrictive descriptors (https://en.wikipedia.org/wiki/Restrictiveness). It will stop many arguments before they begin. Or not.
7. Is the contemporary art market a form of tulip fever?
No. While overpriced fine art can be a speculative asset, it is more commonly a vehicle for money laundering, tax evasion, or wealth storage.
8. When did the beautiful become the good?
It hasn't. But beautiful bad things can appeal to us because beautiful is, by definition, appealing.
9. Should Job Centres offer opportunities for sex work?
Yes. But the world isn't remotely ready for that on multiple levels, so don't bother.
10. Are all asylum seekers equal?
All humans are equal in a moral sense. No two humans are equal by identity. All applications for asylum are not equally valid.
11. Write a dialogue between Socrates and Elon Musk.
No.
12. In a multimedia age, what is the point of zoos?
So people can see animals in person.
13. The organ has been considered the king of instruments. Is it?
Any claim to the preeminence of any one instrument is a value judgment biased primarily by classist baggage attached to the arts. Doubly so if the instrument in question is a staple of either Western canon or church music.
14. What is the difference between an ideology and a religion?
Religion has existed longer than we have cared to define it, so religion is whatever people agree it is, but broadly, religion appeals to a supernatural basis for beliefs in fundamental tenets of how life should be lived.
15. Does a pope matter?
Yes. The pope plays a central role in Catholicism.
16. ‘Mercy has a human face’ (WILLIAM BLAKE). Do you agree?
We can and must learn to embody human virtues intellectually and deliberately rather than emotionally and instinctively. Such is the only hope for our species in an increasingly transhuman (or perhaps just inhuman) future.
17. Can philosophy help someone who is facing death?
Yes. This is the most likely explanation for the popularity of beliefs about the afterlife.
18. Why are most intellectuals left-wing?
Let's say I don't know.
19. What do we owe our parents?
Depends on the culture. Broadly, whatever both parent and child have implicitly or explicitly agreed upon by the time of their separation.
20. Is one’s life more than the sum of one’s days?
No.
21. Has photography deepened empathy ‘regarding the pain of others’ (SUSAN SONTAG)?
Yes. As a single example, war journalism might as well have not existed prior to the invention of photography.
22. Can there be freedom without rules?
There is unbounded negative freedom but very little positive freedom.
23. ‘Humans are only fully human beings when they play’ (FRIEDRICH VON SCHILLER). Discuss.
Humans get bored easily, likely on account of their sophisticated information processing capabilities and rich interiority, both deriving from their complex brains.
24. ‘Different verbal communities generate different kinds and amounts of consciousness or awareness’ (B.F. SKINNER). Do they?
In some spooky panpsychist sense, of course not. In the sense that all culture acts as a thick lens for individual sensitivities, of course.
You appear to have missed this part: “Candidates should answer THREE questions.”
> Animals have no ability to lie.
This is false. There are many documented cases of deception by animals. As one example, researchers have observed monkeys suppressing their vocalisations during sex when copulating with a non-dominant male: https://www.nature.com/articles/ncomms2468
> I believe this policy can never result in a positive outcome.
I get where you're coming from (I'm learning more and more over time that every sentence or line of code I "trust" an AI with will eventually come back to bite me), but this is too absolutist. Really, no positive result, ever, in any context? We need a more nuanced understanding of this technology than "always good" or "always bad."
If you need accuracy, an LLM is not the tool for that use case. LLMs are for when you need plausibility. There are real use cases for that, but journalism is not one of them.
The one I've always flown with is, trivial means (1) a special case of a more general theory (2) which flattens many of the extra frills and considerations of the general theory and (3) is intuitively clear ("easy") to appreciate and compute.
From this perspective, everything is trivial from the relative perspective of a god. I know of no absolute definition of trivial.
I haven't looked into the code, but Lean looking so slow may be misleading depending on how you benchmarked it. IMO the fairest test is to run "Lean code" (or Rocq code, etc.) the way it's actually run in practice: as native code following extraction to C.
Given the sane defaults that extraction applies, the delta really shouldn't be so great. But it's a common pitfall to torture one's own code into a shape that's provable, and I'm also not sure how good the support for parallelism is.
> Incidentally, "IPv8" proponents often ask why IPv6 didn't simply stick some extra bits on the front of IPv4 addresses, instead of inventing a whole new format. Actually, we tried that: the "IPv4-Compatible IPv6 address" format was defined in {{RFC3513}} but deprecated by {{RFC4291}} because it turned out to be of no practical use for coexistence or transition.
Any tl;dr on why/how the simplest solution imaginable would have been "of no practical use for coexistence or transition"? Granted, I understand the other points make a strong enough case by themselves.
TL;DR: because it doesn't actually solve anything.
Being able to jam an IPv4 address into an IPv6 packet header doesn't mean you can send that packet to an IPv4-only host and have it be understood. You still need an IPv6 stack on both endpoints, and on all the routers in the middle - and at that point, why not just use IPv6 addresses?
Also, it already exists. The IPv4 range is included in the IPv6 range. 0000:0000:0000:0000:0000:ffff:0a00:0001 is the official IPv6 representation of 10.0.0.1.
As you can see, it doesn't actually solve anything.
It makes some APIs more convenient! You can pass this address to Linux for an IPv6 socket and it will secretly open an IPv4 connection to 10.0.0.1, so your code only has to support IPv6 sockets to support IPv6 and IPv4 connections.
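For the curious, this mapping is easy to poke at with Python's stdlib `ipaddress` module (just an illustration of the address format itself, not of the socket behavior described above):

```python
import ipaddress

# Parse the compact form of an IPv4-mapped IPv6 address.
mapped = ipaddress.IPv6Address("::ffff:10.0.0.1")

# Recover the embedded IPv4 address.
print(mapped.ipv4_mapped)  # 10.0.0.1

# The fully expanded form matches the representation quoted above.
print(mapped.exploded)     # 0000:0000:0000:0000:0000:ffff:0a00:0001
```

The `::ffff:0:0/96` prefix is exactly the "IPv4-mapped" range from RFC 4291; it's what lets a dual-stack application treat every peer address as IPv6 internally.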
It seems I've been rate-limited to posting every 12 hours, instead of five times per three hours. It must be either because I said interpreters don't emit native instructions, or because I said America had to buy TikTok to maintain American propaganda, or because I said you can make money gambling if your bets are the same as insiders'. Or maybe I'm being punished for voting. I don't think dang will ever confirm what the reason was. Hacker News is so opaque about this stuff.
As most others have pointed out, the goal from here wouldn't be to craft a custom harness so that Claude could technically fly a plane 100x worse than specialist autopilots. Instead, what would be more interesting is if Claude's executive control, response latency, and visual processing capabilities were improved in a task-agnostic way so that as an emergent property Claude became able to fly a plane.
It would still be better just to let autopilots do the work, because the point of the exercise isn't improved avionics. But it would be an honestly posed challenge for LLMs.
I'm curious about your learning experience, but what was the nature of your bottleneck, exactly? Was the backend perfectly fine as a backend, but Claude struggled to wire it to a frontend gracefully?
Claude does a great job generating the code. The hard part was the UX: if the app gets complex and I want a new feature that adds more complexity on top, then because of the way the application/UX is designed, it's hard to integrate that feature in a way that's not confusing to the user.
Like for example, I used check boxes to mean "include the records in the result set" but in a different section later in the flow, I have a different but similar-looking view/list of records where I just want to use the check boxes for batch delete, without the user thinking this means "include in the result set."

So maybe instead I need a different single checkbox at the top which says "Don't ask for confirmation," so the user can just click the normal "delete icon" on each row to delete entries quickly without being prompted... But on the previous view/list I allow the user to use the check boxes to both include records and batch delete via a single small cross at the top... In the later section I don't want that, because given the way I designed the flow it would confuse the user and make it hard to track what they're doing and where they made the change (I want the selection step to be in a single place; the current page serves a different purpose).

So maybe I need to change the other page as well for consistency... And use the "Don't ask for delete confirmation" approach everywhere? But there's not enough space to fit that text on those other pages...
When you solve all the hard problems, this is what coding gets reduced to. No single hard problem, but lots of small ones like that keep coming up, and your interface ends up with a complex URL scheme and lots of modals and nested tabs.
> That means proving the absence of bugs, and you cannot prove a negative. The best thing you can do is fail to find a bug, but that doesn't mean it isn't there.
You can conclusively (up to the caveat in my next point) prove that a specific bug, or a whole class of bugs, isn't there. But "entirely free of all bugs" is indeed a big misconception of what formal methods delivers.
> how do you know your formal verification is bug-free? Answer: you don't. Or if you try to formally verify your formal verification then you're just translating the problem to a new layer. It's just a chain of proofs that is always ultimately based on an unproven one, which invalidates the whole chain.
It's another misconception of formal methods that any result is established conclusively, without any caveats whatsoever. But then again, neither is any result in mathematics, or any other intellectual discipline. What formal methods does is reduce the surface area where mistakes could reasonably be expected to reside. Trusting the Rocq kernel, or a highly scrutinized model of computation and language semantics, is much easier than trusting the totality of random unannotated code residing in the foggiest depths of your average C compiler, for instance.
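To make "proving a specific class of bugs absent" concrete, here's a toy Lean 4 sketch (names like `safeDiv` are hypothetical, chosen just for illustration): the theorems rule out the division-by-zero bug class for this one function, and nothing more.

```lean
-- A total division that makes the zero-divisor case explicit.
def safeDiv (a b : Nat) : Option Nat :=
  if b = 0 then none else some (a / b)

-- Dividing by zero always yields `none`, never garbage.
theorem safeDiv_zero (a : Nat) : safeDiv a 0 = none := rfl

-- Any nonzero divisor always yields a result.
theorem safeDiv_succ (a b : Nat) :
    safeDiv a (b + 1) = some (a / (b + 1)) := rfl
```

Both proofs go through by definitional reduction; what's trusted here is the Lean kernel and the definition of `safeDiv` itself, not the rest of the codebase.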