"And all existing proof languages are hopelessly mired in the obtuse and unapproachable fog of research debt created by the culture of academia."
Yes. As I wrote 40 years ago:
"There has been a certain mystique associated with verification. Verification is often
viewed as either an academic curiosity or as a subject incomprehensible by mere
programmers. It is neither. Verification is not easy, but then, neither is writing reliable
computer programs. More than anything else, verifying a program requires that one have
a very clear understanding of the program’s desired behavior. It is not verification that is
hard to understand; verification is fundamentally simple. It is really understanding
programs that can be hard."[1]
What typically goes wrong is one or more of the following:
1) The verification statements are in a totally different syntax than the code.
2) The verification statements are in different files than the code.
3) The basic syntax and semantics of the verification statements are not checked by the regular compiler. In too many systems, it's a comment to the compiler.
4) The verification toolchain and compile toolchain are completely separate.
None of this is theory. It's all tooling and ergonomics.
Then, there's
5) Too much gratuitous abstraction. The old version was "everything is predicate calculus". The new version is "everything is a type" and "everything is functional".
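As a minimal sketch of what avoiding problems 1-4 looks like, the specification can live in the same file and the same syntax as the code, checked by the ordinary toolchain. Here plain Python assertions stand in for a real contract system (SPARK, Dafny, and similar tools go much further than this):

```python
def binary_search(xs, target):
    """Return the index of target in sorted xs, or -1 if absent."""
    # Precondition, in the language itself, checked by the normal runtime:
    assert xs == sorted(xs), "precondition: input must be sorted"
    lo, hi = 0, len(xs) - 1
    while lo <= hi:
        # Loop invariant: if target is present, it lies in xs[lo..hi].
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return mid
        elif xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1
```

The point is not that runtime asserts are verification; it is that the annotations share the code's syntax, file, and toolchain, so they cannot silently rot the way out-of-band specifications do.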
I am sure we can do better than predicate logic or types, but the code needs to change also. Code should be more abstract and reusable. If possible, it should be a subset of the logic. There is no point in verifying the same stuff in JavaScript, Java, Rust, Swift, ...
Furthermore, something that looks simple might need a lot of abstraction to become provable. Something might be simple to write down in code, but coming up with a notion of correctness for it, and actually proving it, might require something significantly more complicated.
Infatuation with Hoare-Logic is part of the problem (that's what the paper you link uses, I think?). Thinking that verifying programs is easier than doing proper math on the computer is a dead end. First get mathematicians to actually like doing proofs with the help of a computer. THEN you might have a chance of tackling more practical applications like program verification without using an insane amount of resources.
Verification is largely a fool's errand[1]. The juicy opportunity is tooling that aids in constructing programs such that they necessarily have the desired properties. Predicate transformer semantics are up to the task. It's basically just applied lattice theory, which is a very well understood field of math and within the learning capacity of any competent programmer.
Edit: Automated verification does however attract a lot of research money as the latest in a long list of fads promising the bean counters that they can hire cheap idiot programmers instead of expensive smart ones. I don't mean to be dismissive though, the automatic verification research is genuinely interesting and they have accomplished impressive things, for example[1].
Automated verification is not the latest in a long list of fads. First, it is not a fad, and second, it has been around for a long time (check for example [0], which dates from 1961).
I agree with you that constructing programs such that they necessarily have the desired properties is the way to go. But we will disagree in how to go about that. Predicate transformer semantics is just another name for Hoare-Logic, and is too narrowly focused on the program code itself. It is basically the opposite of correct by construction! Rather, I would like to see programs as natural outflows / consequences of verified mathematical theories.
> Predicate transformer semantics is just another name for Hoare-Logic,
Sir Tony is certainly highly influential, but Hoare triples are considerably more basic than predicate transformers. That's not to understate their importance: having a solid base to build on is essential.
> and is too narrowly focused on the program code itself. It is basically the opposite of correct by construction!
Please provide some explanation so that this doesn't read as an absurdity.
What I mean by my comment is that typically with Hoare-style verification, you add invariants to the program AFTER it has been constructed. So in this sense it is not correct by construction, but rather verified after construction. Instead, under correct by construction I would understand an upfront proof about some abstract function, and then the concrete code generated for this function will be automatically correct.
> What I mean by my comment is that typically with Hoare-style verifications, you add invariants to the program AFTER it has been constructed.
Ah, perhaps some people misuse predicate transformers that way, I don’t know. I do know, however, that that’s not how their inventors used them or meant for them to be used. The basic idea is that you define the postcondition you’d like to establish before writing any code. Then you derive a chain of predicate transformers such that the final precondition is suitable. A common choice is the predicate that is true for all states, but for partial correctness another might be used. Since any given predicate transformer has a syntactic representation, the result is the program text. As you can see, the result is correct by construction.
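The mechanics of working backwards from the postcondition can be sketched in a toy weakest-precondition calculator. This is a hypothetical illustration of my own, using naive textual substitution for assignment; real treatments handle loops, aliasing, and nondeterminism:

```python
import re

def wp_assign(var, expr, post):
    """wp(var := expr, Q) is Q with var replaced by expr (Dijkstra's assignment rule)."""
    return re.sub(rf"\b{re.escape(var)}\b", f"({expr})", post)

def wp_seq(stmts, post):
    """wp(S1; S2, Q) = wp(S1, wp(S2, Q)): fold backwards from the postcondition."""
    for var, expr in reversed(stmts):
        post = wp_assign(var, expr, post)
    return post

# Derive the precondition guaranteeing y > 0 after:  x := x + 1; y := x * 2
pre = wp_seq([("x", "x + 1"), ("y", "x * 2")], "y > 0")
# pre == "((x + 1) * 2) > 0", i.e. x >= 0 over the integers
```

In the methodology described above you would run this reasoning by hand, choosing each statement so that the precondition it demands is one you can establish; the program text falls out of the derivation.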
I recommend reading Dijkstra’s A Discipline of Programming for pragmatic examples of using predicate transformers to derive correct programs. If you want a more rigorously formal treatment of the same subject there’s Dijkstra and Scholten’s Predicate Transformers and Program Semantics.
Edit: Here[1] is a nontrivial example, although it assumes some familiarity and doesn’t spell everything out. When you get to the acknowledgments section it’s clear why.
Yes, I will concede that of course PTS can be used in a "first the invariants, then the code" way, and so my distinction doesn't make much sense here.
Nevertheless, PTS is for verifying imperative programs, and, whenever possible, I prefer reasoning about more elegant and simple mathematical objects instead.
I find a lattice of functions from a predicate to another predicate to be about as simple as possible for describing the problem space. I’ve still got a lot to learn, but given that state is a reality for the computing automata we can actually build I like an abstraction that captures it directly.
I have found this to be an interesting exchange of ideas and I appreciate your taking the time to answer my queries. Thanks!
> First get mathematicians to actually like doing proofs with the help of a computer.
Many are already using type-theoretic and HOL theorem provers. What threshold do we need to reach? Isn't Lean HoTT? Isn't that pretty much good enough (modulo UI and tooling)? I ask the latter question because that was the original promise of HoTT, but I haven't kept up to date on Lean, so I'm asking.
No. Lean is good old-fashioned Martin-Löf type theory. HoTT is that type theory + univalence + higher inductive types. Lean actually has proof irrelevance, which is incompatible with HoTT.
But the good news is you don't need HoTT to verify software. Type theory is already quite capable of it, despite what others in this thread would like you to believe.
That's great stuff, which shows what can in principle be done with this technology. But why is a team necessary to formalise Scholze's ideas? Note that Scholze himself doesn't touch Lean. In my opinion, the threshold is reached when Scholze himself sits down DURING the development of his ideas to interact with the proof assistant and develop+verify his ideas.
Most mathematicians are doing work that is, frankly, formally unsound. There's a huge culture of hidden assumptions in most mathematical fields.
Not to mention that the syntax is literally unparseable. For example, what does this mean?
sin(x) + cos(x)
Most mathematicians would say it's the sum of the sine of x and the cosine of x. But it parses fine as the sum of the product of s, i, and n(x), and the product of c, o, and s(x). Obviously that kind of ambiguity is unacceptable in a formal procedure syntax like that used by programming languages. That's just a simple example, traditional math is full of this kind of thing.
> But it parses fine as the sum of the product of s, i, and n(x), and the product of c, o, and s(x).
No it doesn't, because math lexes greedily (and is also context-sensitive anyway). 'sin' is a symbol just like 'x'. There's no ambiguity in your example.
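The greedy-lexing point can be made concrete with a toy tokenizer (hypothetical, purely for illustration): under maximal-munch lexing, a run of letters is one token, so `sin` can never be read as `s`, `i`, `n`.

```python
import re

# Longest-match ("maximal munch") lexing: letters clump into one symbol.
TOKEN = re.compile(r"[A-Za-z]+|\d+|[()+\-*/]")

def lex(s):
    """Return the token stream, skipping whitespace."""
    return TOKEN.findall(s)

lex("sin(x) + cos(x)")
# → ['sin', '(', 'x', ')', '+', 'cos', '(', 'x', ')']
```

Whether `sin(x)` then means application or multiplication is a separate, context-sensitive question, but the lexing itself is unambiguous.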
It's a weird example, but you've definitely got a point. Of course, an ambiguity like the one in your example never causes a problem, because this text is parsed by humans, not by machines!
> The threshold is mathematicians opting to use these tools themselves for their work. There are not many mathematicians using them so far.
The kinds of theorems that we need to prove in software are much more elementary than what mathematicians are proving. I think software engineering can benefit from it long before mathematicians start doing cutting-edge math in it.
Not if you do software the right way. Try to prove correctness of a CAD program, for example. Or of a graphics card implementation. Or ...
Furthermore, I don't think cutting-edge math needs a much different approach from cutting-edge software. You need to be able to express your thoughts succinctly, and have the tools to reason about them.
It is often said that software verification is different because there is much more to verify, but on a more shallow level. I instead think software is just not done at the right level of abstraction.
Software is at the same time more and less than math. More, because in addition to understanding a topic, you also need an implementation, which has additional issues like speed and memory usage, battery life, etc. Less, because if you do a nice implementation, nobody is asking about its correctness, or how well you understood the topic in the first place. For software today, a nice implementation is much more important than a correctness proof.
> Try to prove correctness of a CAD program, for example. Or of a graphics card implementation. Or ...
You don't need to verify the entire program for formal verification to be useful. You can adopt it incrementally.
The most common bogus argument I hear against formal verification is that it's impractical to come up with a spec or proof for the entire program, so we might as well not even bother with formal verification at all.
Formal verification is just very costly and has diminishing returns. Let's take a CAD program. Which aspects of it would you formally verify? If you are going for the easy parts, those can already be dealt with nicely with static typing and testing, essentially push-button automated verification. If you are going for the interesting parts, you will be doing math, essentially.
> Let's take a CAD program. Which aspects of it would you formally verify?
Any large program will contain some smaller components with relatively well-defined behavior. CAD is not my specialty, so I can't really comment on what algorithms are used in that domain. Forgetting about fancy algorithms for a moment, just having a more expressive type system will allow you to express invariants in your code like the fact that array indices are within the relevant bounds, that you never try to pop an empty stack, etc.—everyday programming issues.
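For instance, in a dependently typed language those everyday invariants become part of the types themselves. A minimal Lean 4 sketch (the names are my own):

```lean
-- An index of type `Fin xs.length` carries a proof it is in bounds,
-- so out-of-range access is a type error, not a runtime crash.
def safeGet (xs : List Nat) (i : Fin xs.length) : Nat := xs.get i

-- A stack whose size lives in its type: `pop` simply cannot be
-- applied to an empty stack, because `Stack 0` never matches `Stack (n + 1)`.
inductive Stack : Nat → Type where
  | nil  : Stack 0
  | push : Nat → Stack n → Stack (n + 1)

def pop : Stack (n + 1) → Nat × Stack n
  | .push x s => (x, s)
```

No separate annotation language, no external checker: the compiler's type checker enforces the invariants.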
For a more concrete example, lately I've been using Coq to formally verify critical properties about a certain type of graph-like data structure I'm using in a system I'm building.
> If you are going for the easy parts, those can already be dealt with nicely with static typing and testing, essentially push-button automated verification.
Most engineers are already writing tests and using static types. Yet, we still have buggy programs.
And just to be clear, the kind of formal verification we're talking about is based on static typing. It's just a more expressive type system than what most programmers are used to.
> If you are going for the interesting parts, you will be doing math, essentially.
You are doing some form of math, but not the kind of cutting edge math that mathematicians do—which was my original point. You are not going to run into the kinds of tricky problems that mathematicians run into with theorem proving software, like universes being too small etc. Most data in software engineering is finite and reasoning about it involves little more than arithmetic and induction (which is just out of reach for mainstream type systems, but not for the kind of type systems used in proof assistants).
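The "arithmetic and induction" flavor can be seen in a minimal Lean 4 proof of a standard textbook fact (`0 + n = n` is not definitional in Lean, since addition recurses on its second argument, so induction is genuinely needed):

```lean
-- Structural induction on how n was built: the base case and the
-- successor case, using the induction hypothesis `ih : 0 + k = k`.
theorem zero_add' (n : Nat) : 0 + n = n := by
  induction n with
  | zero => rfl
  | succ k ih => rw [Nat.add_succ, ih]
```

Most proofs about finite data in software follow this shape, just with larger inductive types than `Nat`.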
First, theorem proving is NOT the same as an advanced form of static typing. This is a misunderstanding mostly pushed by computer scientists. Instead of propositions as types, I advocate a more practical form of types, based on Abstraction Logic [0, 1].
Second, yes of course, you can carve out components and concentrate on those. If you can find opportunities for this, great! You will still have buggy programs in which you use those components, to copy your argument.
Third, data may be finite, but reasoning about it is often done better in an infinitary context. After all, x^2 + x - 3 is also a finite expression, and much easier to understand than most software. So what? You will find a lot of interesting mathematics done with polynomials, some of it cutting-edge. Saying your software doesn't need cutting-edge math is just limiting yourself and your software. Chances are you will be doing some new (=cutting-edge) math if you try to verify new things. And yes, I run into problems with universes all the time actually, because this is relevant for modular formalisations. It's best to just have a single mathematical universe!
> First, theorem proving is NOT the same as an advanced form of static typing.
This Hacker News post is about a theorem prover based on dependent types. That's the context for our discussion.
> You will still have buggy programs in which you use those components
No one is disagreeing with this claim. But eliminating some bugs is better than nothing, even if you don't eliminate all bugs. You and the other commenters repeating this strawman are doing a lot of harm to people trying to socialize their research.
> Chances are you will be doing some new (=cutting-edge) math if you try to verify new things.
Citation needed. Most software is not doing anything interesting at all from a mathematical perspective, just shuffling data around. But either way the point is moot—Martin-Löf type theory (which is what this "magmide" thing seems to be based on) can do arbitrarily fancy math if needed (which is rarely).
I've been verifying bits of software for about 10 years, and I've never needed to invent new math to do it (though I would be happy if I ever did!).
Well, if you created a new data structure not known before, and proved theorems about it, that's new math. If you copied a well-known data structure, and prove theorems about it, that's not new math.
What do you think mathematicians do? They just examine certain things rigorously and with utmost scrutiny. These things are simpler than things appearing in real life. Software interfaces with real life, so cutting-edge math is really a subset of what's needed for software. That is obvious to me, but I don't have a citation. You can cite me, if you want to.
Finally, dependent types as it is done today in Coq and Lean etc. is not nice enough a logic to attract mathematicians. The reason for that is that it is not nice enough a logic, full stop. So why would it be nice enough for computer scientists? Oh, because your problems are simpler, so you don't need a nice logic?
Saying that Martin-Löf type theory can do arbitrarily fancy math is both false and right. Just as saying that anything can be programmed in assembler is both false and right. Yeah, with enough focus and discipline you probably can, but who would want to?
> At this point, however, you may be feeling that type theory sounds very complicated. Lots of different ways to form types, each with their own rules for forming elements? Where is the simplicity and intuitiveness that we expect of a foundational theory?
It's just not there, Mike. Type theory is neither simple nor intuitive, and it doesn't make a good foundation.
Type theory certainly has advantages, but they get lost in esoteric texts like Mike's. There are two simple advantages over set theory within first-order logic: general variable binding, and that things are kept separate, for example that a boolean is not a number. I show how to do general variable binding without static types in [0], and how to still keep things separate in [1]. All without having to divide the mathematical universe into static types.
I am aware of Lean and mathlib. It's the group around Kevin Buzzard which is doing mathlib, and it is for sure a great effort. Being practical, they latched onto the theorem proving facilities that are currently available, Lean certainly being one of the best at this moment. But just because they can make type theory work for them, with much effort, doesn't mean that it is the right way to do formal math. They are still a drop in the ocean compared to all the mathematicians who have never touched a proof assistant. I find it telling that in the Liquid Tensor Experiment Peter Scholze didn't touch Lean himself, but worked through this group as intermediaries, who mediated the type theory details for him.
The state of proof assistants is such that currently almost all of them are based on some form of type theory. There are exceptions like Mizar and Metamath (Zero), which are based on first-order logic. Type theory seemed to be the only way to have proper general variable binding, via lambdas, and that is the main reason for its popularity today. Type theory is a particular mathematical theory, though, and not particularly well suited to most things mathematicians like to do, like forming completions or subsets of types/sets. Type theory also cannot deal with undefinedness properly. Of course, there are methods of working around these issues like coercions and option types, but they are cumbersome.
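The option-type workaround for undefinedness looks like this in a minimal Lean 4 sketch (names are my own):

```lean
-- Division by zero has no good answer, so the result is wrapped in Option.
def safeDiv (m n : Nat) : Option Nat :=
  if n = 0 then none else some (m / n)

-- Every use site must now unwrap or thread the Option along,
-- which is exactly the cumbersomeness the workaround brings.
def divTwice (a b c : Nat) : Option Nat :=
  match safeDiv a b with
  | none   => none
  | some q => safeDiv q c
```

Monadic `do`-notation hides some of this plumbing, but the partiality still infects every signature downstream.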
Until recently I also thought that a minimum of type theory is necessary, to get general variable binding. For example, I tried to embed set theory within simply-typed higher-order logic ([2]).
But since last year I know that you don't need type theory for variable binding (see [0])! Of course, (dependent) types are still useful, but now you can work with them in a much more flexible way, without having to divide the mathematical universe into separate STATIC and A PRIORI chunks labelled by types (see [1], but that needs to be updated with the understanding gained from [0]).
If type theory works for you, fine. But I know there is a better way.
You might also want to check out nuPRL. It uses a completely different approach (computational types). It might be a better match for what you are talking about.
I've heard of nuPRL, it seems to be also based on type theory, with special emphasis on constructivity. It's basically the same as Coq, foundationally. At a summer school Bob Constable once said that he would refuse to fly in a plane with software which had been verified formally using classical logic, instead of constructive logic... Well, I wouldn't mind an intuitionistic verification, but I'd definitely take even "just" a classical one.
No, it is actually quite different. nuPRL starts with an untyped programming language, and you then prove that an untyped expression has a certain behaviour. The behaviour is called a Type, but it is fundamentally a very different idea from Martin-Löf type theory (IMHO). They do say that it is MLTT, and in principle they are right, but MLTT is as powerful as set theory, so that is true of anything mathematically. Lean, for example, supports non-constructive mathematics. But it is still based on a type theory. Anyways … YouTube has a great talk about the ideas: https://www.youtube.com/watch?v=LE0SSLizYUI
It literally is. Martin-Löf type theory is the mathematical foundation used by most proof assistants today. The type checker is the prover: if your code type checks, then you have proven a theorem. I highly recommend learning more about the Curry-Howard correspondence and dependent types. It might just blow your mind.
See my other answers. Curry-Howard is interesting, but making it the foundation of theorem proving is a choice (in my opinion, not a very good one), not a necessity, as most Curry-Howard fans seem to think.
That's silly. Whether it's a good or bad metatheory for theorem proving depends entirely on one's goals and preferences. Most mathematicians use a type theory over something else, so it's hard to even take your view seriously.
On top of that, you seem very opinionated for someone who was just confused about the topic of this thread. I have no idea why you're waging a holy war over this but maybe take a break.
Most mathematicians don't even use formal logic. For sure they don't use type theory! You seem to be the one who is silly/confused here. If you want to lift your confusion, read the [0] link I gave above.
All of those programs can be proven correct today using Coq/HOL/Lean etc.; see CompCert, seL4, etc. The math computer scientists need for correctness proofs is very different from what mathematicians care about. Look at the mathlib project, done by mathematicians in Lean. It is pretty much useless if you want to prove code correct.
> The threshold is mathematicians opting to use these tools themselves for their work. There are not many mathematicians using them so far.
No, you didn't answer the question at all. How many is many? How many will be enough for you? It is being used by mathematicians and for some pretty important things.
There was a version of Lean that supported HoTT, and I think it helped in making Lean popular. But that support has been dropped in Lean 3, and Lean 4 does not support it either. Lean 4 itself seems to be a radical rewrite, and libraries written for Lean 3 do not work in Lean 4.
> 5) Too much gratuitous abstraction. The old version was "everything is predicate calculus". The new version is "everything is a type" and "everything is functional".
Total functional programming with types is literally the same as writing proofs in intuitionistic logic, in a technical sense (the Curry-Howard correspondence). This isn't a new fad. It's a deep result that was known more than half a century ago.
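In that reading, writing a well-typed total program just is writing a proof. A one-line Lean 4 illustration:

```lean
-- Under Curry-Howard, this function *is* a proof that A ∧ B implies B ∧ A:
-- the pattern match is ∧-elimination, the anonymous constructor ∧-introduction.
def and_swap {A B : Prop} : A ∧ B → B ∧ A :=
  fun ⟨a, b⟩ => ⟨b, a⟩
```

The type checker accepting this definition is the same event as the theorem being proved.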
[1] http://www.animats.com/papers/verifier/verifiermanual.pdf