> In C++, static checkers/analyzers are separate tools. You could choose to requ...

duneroadrunner · on Nov 17, 2016

Well, like I said in the other comment, you guys could fix that by unbundling the static checker in the Rust compiler and making it applicable to (a subset of) C++ code as well :)

So then would you agree with the notion that (a practical subset of) C++ combined with a static analyzer could be just as safe and fast as Rust if, hypothetically, there existed an enthusiastic community comparable to Rust's? Or are there intrinsic technical issues? Or syntax issues?

Also, let me throw this notion at you: Rather than disallow code that can't be verified to be (memory) safe, the compiler could instead inject runtime checks that would be optimized out using the same analysis that the static checker uses.

That is, instead of requiring that the code be fast and safe or it won't compile, it becomes: If your code is not clearly, intrinsically safe then it will have runtime checks that will slow it down. And the compiler could list any runtime checks that it wasn't able to optimize out.

The reason I suggest this is that memory safety is just the enforcement of certain invariants. There's no reason why we couldn't let the programmer define additional, application specific invariants and have the build process treat them the same way it treats memory access invariants.

So for example, when a user defines a class, it could have a standard member function called "assert_object_invariants()" or something, that the programmer can define. Then anytime a (non-const?) member function is called, the compiler can insert runtime asserts at the beginning and end of the member function call. And again the compiler can tell you when those runtime asserts aren't optimized out. Wouldn't that make sense? I haven't really thought it through.
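A minimal sketch of what that could look like when done by hand in today's C++, assuming a hypothetical `InvariantGuard` helper and a made-up `Account` class (neither comes from any existing library). The guard calls `assert_object_invariants()` when a non-const member function is entered and again when it returns. The proposal would have the compiler insert these calls automatically and report which ones it could not optimize out; here they are written explicitly.

```cpp
#include <cassert>

// Hypothetical RAII helper: runs the object's invariant check on entry
// (constructor) and again on exit (destructor) of the enclosing scope.
template <typename T>
class InvariantGuard {
public:
    explicit InvariantGuard(const T& obj) : obj_(obj) { obj_.assert_object_invariants(); }
    ~InvariantGuard() { obj_.assert_object_invariants(); }
    InvariantGuard(const InvariantGuard&) = delete;
    InvariantGuard& operator=(const InvariantGuard&) = delete;

private:
    const T& obj_;
};

// Made-up example class whose invariant is "the balance is never negative".
class Account {
public:
    explicit Account(long balance) : balance_(balance) { assert_object_invariants(); }

    // The user-defined invariant. The counter exists only so the sketch can
    // show how often the check runs; a real build would not need it.
    void assert_object_invariants() const {
        ++checks_run_;
        assert(balance_ >= 0 && "Account invariant violated");
    }

    // Non-const member function: checked at the beginning and at the end.
    bool withdraw(long amount) {
        InvariantGuard<Account> guard(*this);
        if (amount < 0 || amount > balance_) return false;
        balance_ -= amount;
        return true;
    }

    long balance() const { return balance_; }
    unsigned checks_run() const { return checks_run_; }

private:
    long balance_;
    mutable unsigned checks_run_ = 0;
};
```

Each call to `withdraw` costs two checks here. Whether an optimizer could prove them redundant is exactly the open question in the proposal.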

Manishearth · on Nov 17, 2016

> Well, like I said in the other comment, you guys could fix that by unbundling the static checker in the Rust compiler and making it applicable to (a subset of) C++ code as well :)

The problem is that you still need extra annotations. Namely lifetime annotations (or something similar relating between borrows -- either that, or use a lot of elision which can be crippling). On top of that, the programming style Rust encourages is not the same as the one you tend to see in C++ codebases, and programming in the C++ style will lead to code that doesn't compile.

> Rather than disallow code that can't be verified to be (memory) safe, the compiler could instead inject runtime checks that would be optimized out using the same analysis that the static checker uses.

This might be more tractable (and is an interesting idea). But that optimizer would be hard to write.

> So then would you agree with the notion that (a practical subset of) C++ combined with a static analyzer could be just as safe and fast as Rust

I think this is what the new ISOCPP core guidelines are trying to do? Though they don't go far enough in preventing memory unsafety IIRC (this may have changed).

duneroadrunner · on Nov 17, 2016

> The problem is that you still need extra annotations. Namely lifetime annotations

Well, the idea is not to have the static analyzer verify typical C++ code. Just some practical subset. So for example I think it's quite practical to write C++ code that uses only "scope" pointers (basically pointers to objects on the stack) and (not-null) refcounting pointers, that intrinsically don't outlive their targets. Lifetimes would be implied by the types. So wait, what more does Rust's static analyzer give us again? Does it somehow remove the need for refcounting heap objects?

> the programming style Rust encourages is not the same as the one you tend to see in C++ codebases, and programming in the C++ style will lead to code that doesn't compile.

I have no problem with that. I have no attachment to the "traditional" C++ programming style.

> This might be more tractable (and is an interesting idea). But that optimizer would be hard to write.

Why? The static analyzer has an opinion on whether or not a program is safe. The optimizer just wants to know if it still thinks it's safe when you remove a runtime check.

> I think this is what the new ISOCPP core guidelines are trying to do? Though they don't go far enough in preventing memory unsafety IIRC (this may have changed).

The ISOCPP core guidelines approach is to recommend the use of C++'s intrinsically dangerous elements in a way that is "usually safe", but not always, and rely on their static analyzer to catch bugs. So the question becomes, what do you do in the many cases where the static analyzer doesn't know if it's safe or not. You can try to redesign your code so the static analyzer can understand that it's safe. But that's often very inconvenient or has a performance cost. Often the most practical (safe) solution is to resort to something like SaferCPlusPlus.

Manishearth · on Nov 17, 2016

> So wait, what more does Rust's static analyzer give us again? Does it somehow remove the need for refcounting heap objects?

Refcounting is rarely needed because most sharing is done via "borrows", which usually work via scope-tied "references" which may point to either the stack or the heap.

Implementing and enforcing local scope pointers in C++ via static analysis is not hard. Making it possible to thread borrows through APIs and annotate things with the borrowing semantics (which is what makes Rust avoid refcounting or even allocation costs) requires a bit more work.

> I have no attachment to the "traditional" C++ programming style.

Right, but at this point you have a very weird looking subset of C++ that can't seamlessly integrate with other libraries, and can't be translated to from regular C++ without significant human intervention -- why not just use Rust?

> Why? The static analyzer has an opinion on whether or not a program is safe. The optimizer just wants to know if it still thinks it's safe when you remove a runtime check.

I guess I misunderstood your proposal. This sounds doable. But, again, you'd be using a weird subset of C++ that doesn't seamlessly integrate, and you're just better off using Rust at this point.

Instead of trying to port Rust's guarantees to C++ it makes more sense to use the same principles to organically build on top of C++, in a different way. IMO this is sort of what ISOCPP is trying to do, but they're not quite there yet, and trying to find a compromise between making the language too different and making it safe is hard.

> So the question becomes, what do you do in the many cases where the static analyzer doesn't know if it's safe or not. You can try to redesign your code so the static analyzer can understand that it's safe.

This is always going to be a problem regardless of the static analyzer. You have to design it to reject these cases. Rust does this too; there are some edge cases where you need to design around the borrow checker (though usually this doesn't incur additional cost, and the most common of these are going to be addressed). If designing low level abstractions like vectors and stuff (or doing FFI), Rust gives you an escape hatch ("unsafe"), which has a couple of checks disabled and can be used to write the code you need (verifying safety of a program then just requires verifying that these blocks of code are sound and do not rely on any invariants that can be broken by code outside of them).

duneroadrunner · on Nov 17, 2016

> > Why? The static analyzer has an opinion on whether or not a program is safe. The optimizer just wants to know if it still thinks it's safe when you remove a runtime check.

> I guess I misunderstood your proposal. This sounds doable. But, again, you'd be using a weird subset of C++ that doesn't seamlessly integrate, and you're just better off using Rust at this point.

My proposal is sort of language independent. I'm just suggesting a better way to address the code safety/correctness issue might be with runtime asserts, because it's more general. Some of the runtime asserts (like the ones regarding memory safety) will be automatically generated by the compiler, and others would be user defined (but compiler placed). And the static analyzer (I guess "the borrow checker" in Rust) would be repurposed to strip out the unnecessary runtime checks. And the compiler/optimizer would tell you which runtime asserts it was unable to optimize out. (Presumably good Rust code would result in all the memory runtime asserts being optimized out.)

This allows for programs that are not just memory safe, but "application invariant" safe as well. Right? I mean it's not really a totally new concept, I guess it's kind of "design by contract" or whatever, but with a slight performance bent because the optimizer tells you what runtime checks it's having trouble getting rid of. And maybe there would be a way to indicate that you expect the optimizer to be able to get rid of certain runtime checks, and instruct it to generate a warning (or error) if it doesn't. I'm just sayin'...

pcwalton · on Nov 17, 2016

I don't think it works. All of the "runtime asserts" require bookkeeping. That bookkeeping ends up being worse in terms of performance than what you have with a GC.

It's hard to beat a modern, tuned GC.
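To make the bookkeeping concrete, here is a rough sketch of a runtime-checked version of the "one writer or many readers" rule. `CheckedCell`, `SharedRef` and `ExclusiveRef` are hypothetical names invented for this illustration. Every cell carries a counter and a flag, and every borrow pays for a check and an update. The sketch says nothing about how that cost compares to a tuned GC; it only shows where the extra state and work come from.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Hypothetical cell that enforces "one writer or many readers" at runtime.
template <typename T>
class CheckedCell {
public:
    explicit CheckedCell(T value) : value_(std::move(value)) {}

    // Read-only borrow: allowed unless an exclusive borrow is active.
    class SharedRef {
    public:
        explicit SharedRef(CheckedCell& cell) : cell_(cell) {
            if (cell_.exclusive_) throw std::logic_error("already mutably borrowed");
            ++cell_.shared_;
        }
        ~SharedRef() { --cell_.shared_; }
        SharedRef(const SharedRef&) = delete;
        SharedRef& operator=(const SharedRef&) = delete;
        const T& operator*() const { return cell_.value_; }

    private:
        CheckedCell& cell_;
    };

    // Mutable borrow: allowed only when no other borrow is active.
    class ExclusiveRef {
    public:
        explicit ExclusiveRef(CheckedCell& cell) : cell_(cell) {
            if (cell_.exclusive_ || cell_.shared_ > 0) throw std::logic_error("already borrowed");
            cell_.exclusive_ = true;
        }
        ~ExclusiveRef() { cell_.exclusive_ = false; }
        ExclusiveRef(const ExclusiveRef&) = delete;
        ExclusiveRef& operator=(const ExclusiveRef&) = delete;
        T& operator*() const { return cell_.value_; }

    private:
        CheckedCell& cell_;
    };

private:
    T value_;
    unsigned shared_ = 0;     // bookkeeping: number of active readers
    bool exclusive_ = false;  // bookkeeping: is a writer active?
};

// A writer requested while a reader is alive is rejected at runtime.
bool conflicting_borrow_is_rejected() {
    CheckedCell<int> cell(1);
    CheckedCell<int>::SharedRef reader(cell);
    try {
        CheckedCell<int>::ExclusiveRef writer(cell);
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}

// Borrows that do not overlap in time are fine.
int sequential_borrows() {
    CheckedCell<int> cell(1);
    {
        CheckedCell<int>::ExclusiveRef writer(cell);
        *writer = 42;
    }
    CheckedCell<int>::SharedRef a(cell);
    CheckedCell<int>::SharedRef b(cell);
    return *a + *b;
}
```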

duneroadrunner · on Nov 17, 2016

> Right, but at this point you have a very weird looking subset of C++

It's a little weird looking at first glance, but ultimately it's not really that weird. The main unfamiliar thing is that objects that are going to be the target of a (safe) pointer need to be declared as such. So

    {
        std::string s1;
        auto s1_ptr = &s1;
    }

becomes

    {
        mse::TXScopeObj<std::string> s2;
        auto s2_ptr = &s2;
    }

s2 acts just like a regular string. It's just wrapped in a (transparent) type that overloads the & (address of) operator so that s2_ptr is a safe pointer. (For example, in this case s2_ptr cannot be retargeted or set to null).
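For readers wondering how a wrapper like that can stay transparent, here is a minimal sketch of the general technique. It is not SaferCPlusPlus's actual implementation: `ScopeObj` and `ScopePtr` are hypothetical stand-ins, and this simplified version does nothing to stop the pointer from escaping its scope. It only shows the overloaded `&` returning a pointer type that cannot be null or retargeted.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical non-null, non-retargetable pointer.
template <typename T>
class ScopePtr {
public:
    explicit ScopePtr(T& target) : target_(&target) {}
    ScopePtr(const ScopePtr&) = default;
    ScopePtr& operator=(const ScopePtr&) = delete;  // cannot be retargeted
    T& operator*() const { return *target_; }
    T* operator->() const { return target_; }

private:
    T* const target_;  // built from a reference, so never null
};

// Hypothetical transparent wrapper: behaves like T, but taking its address
// yields a ScopePtr instead of a raw pointer. Intended for stack objects
// only, since T may not have a virtual destructor.
template <typename T>
class ScopeObj : public T {
public:
    using T::T;  // inherit T's constructors
    ScopePtr<T> operator&() { return ScopePtr<T>(*this); }
    ScopePtr<const T> operator&() const { return ScopePtr<const T>(*this); }
};

static_assert(!std::is_copy_assignable<ScopePtr<std::string>>::value,
              "ScopePtr must not be retargetable");

// Mirrors the interop example in this thread, using the stand-in types.
std::string scope_demo() {
    std::string s1 = "abc";
    ScopeObj<std::string> s2("def");
    auto s2_ptr = &s2;         // a ScopePtr<std::string>, not a std::string*
    std::string s3 = s1 + s2;  // s2 is usable where a std::string is expected
    s3 += *s2_ptr;
    *s2_ptr = s1;              // writes through the safe pointer into s2
    return s3 + s2;
}
```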

> that can't seamlessly integrate with other libraries,

Sure it can, that's the point. For example:

    {
        std::string s1 = "abc";
        mse::TXScopeObj<std::string> s2 = "def";
        auto s2_ptr = &s2;
        std::string s3 = s1 + s2; // s2 totally works where an std::string is expected
        s3 += *s2_ptr;
        *s2_ptr = s1; // and vice versa
    }

> and can't be translated to from regular C++ without significant human intervention --

Umm, it could be automated, but you would need a tool that can recognize object declarations. But modern C++ code is mostly safe already. I mean you're supposed to try to avoid pointers in favor of standard containers and iterators. So just replace your "std::vector"s with "mse::mstd::vector"s and your "std::array"s with "mse::mstd::array"s and you're mostly there.

> why not just use Rust?

My impression is that Rust has been evolving a lot. Is the language stable now? Is it time to jump in? Has it vanquished D as the successor to C++? Are we happy with Rust's solution for exceptions?

Even if Rust is the future, and the future is here, I'm still stuck with existing C++ projects. And I'd feel better if they were (at least mostly) memory safe. There must be others in the same boat.

lmm · on Nov 17, 2016

> It's a little weird looking at first glance, but ultimately it's not really that weird.

Readability is important for maintainable code. And safe coding patterns tend to involve a lot of sum types (which you can model in C++ with the visitor pattern, but it's significant overhead in code length and possibly even at runtime), and a fair amount of generics (which are cumbersome in C++, and the error reporting is awful). If you're not going to get the existing tool/library infrastructure either way, so you're just evaluating on their merits as languages, I don't think you'd ever want to pick C++ over Rust.

> modern C++ code is mostly safe already.

I've been hearing that for about a decade now (and I suspect the only reason it isn't longer is that I wasn't programming before then). And yet we still see bugs, all the time. Not subtle bugs, but stupid, obvious bugs.

> Is the language stable now?

Yes, as of 1.0.

> Is it time to jump in? Has it vanquished D as the successor to C++? Are we happy with Rust's solution for exceptions?

Yes.

> Even if Rust is the future, and the future is here, I'm still stuck with existing C++ projects. And I'd feel better if they were (at least mostly) memory safe.

My belief is that no amount of whack-a-mole is going to make those projects memory-safe, and none of the linters/checkers/dialects is ever going to reach a point where it offers actual guarantees. If it were possible it would have happened by now. The only way you're going to get to memory safety is by rewriting those projects, bottom to top (which is probably what you'd have to do to use one of these C++ dialects anyway). If you want to do the migration gradually (and you should!) Rust has pretty good interop.

Manishearth · on Nov 17, 2016

> The main unfamiliar thing is that objects that are going to be the target of a (safe) pointer need to be declared as such.

Your proposal was to take Rust's static analysis and make it work with C++. It's clear you don't know Rust. Why are you so confident about what kind of effect that would have on the language? Rust is not "like C++ but with more static analysis", it's a very different language. A lot of the safety that modern C++ gets you is something that Rust gets you, using different mechanisms.

> Sure it can, that's the point. For example:

This example seems to be a SaferCPlusPlus example? I'm talking specifically about your proposal to take Rust's static analysis and use it on C++. That isn't what SaferCPlusPlus seems to be doing. It seems like you might be talking about something else? The general applicability of safety based static analysis? I'm not arguing with that.

> My impression is that Rust has been evolving a lot. Is the language stable now?

Still evolving, just like C++ is, but is stable now. Has been for more than a year.

> Are we happy with Rust's solution for exceptions?

I am. Most folks in the Rust community are. There are no missing pieces now, though.

> Has it vanquished D as the successor to C++?

No, and that's subjective, and your C++-with-Rusts-static-analysis will not be in a different boat.

> I'm still stuck with existing C++ projects. And I'd feel better if they were (at least mostly) memory safe.

That's my point. The amount of work to convert existing C++ code to something that satisfies a static analyzer using Rust's exact set of invariants is just as much as the work required to convert to Rust. You won't be able to just throw a new static analyser at C++ code and stuff will magically work. It will require significant refactoring and effort. Nor will your code be able to easily talk with other C++ libraries.

> Umm, it could be automated

No, "human intervention" I said. It can't be automated easily, because the style it enforces is significantly different. I've done quite a bit of jumping back and forth between C++ and Rust these days (in the same codebase, with FFI), and the fact that the structure and style of programs is different is very apparent.

There is work on translating C to Rust (which might extend to C++ some day?), but IIRC you still need significant human intervention. For C at least there is no existing safety system to replace, so it's still easier, but translating from C++'s (largely incompatible) existing safety system will be tough.

Translating code will need the translator to figure out what the code is trying to do, basically. This isn't like Python2->Python3. Like I said, the style enforced is different. I don't mean syntax style, I mean how code is structured at a higher level.

> I mean you're supposed to try to avoid pointers in favor of standard containers and iterators

If you want to be 100% safe you need to solve iterator invalidation, and Rust's solution is something that is very hard to make work with C++'s usual style of coding. If you want to avoid all unnecessary allocations and refcounting you need a lifetime system. To use Rust's model, the mechanism of moving would have to be tweaked considerably.
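As a small illustration of the iterator invalidation problem, the sketch below uses two hypothetical helper functions. It avoids actually touching a dangling iterator (that would be undefined behaviour) and instead detects reallocation through `capacity()`, since a `push_back` that grows the capacity invalidates every iterator, pointer and reference into the vector. The second function shows the usual C++ workaround of holding an index instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if this push_back reallocated the vector's storage, meaning
// any iterator or pointer obtained before the call would now dangle.
bool push_back_relocates(std::vector<int>& v) {
    const std::size_t capacity_before = v.capacity();
    v.push_back(0);
    return v.capacity() != capacity_before;
}

// Index-based alternative: unlike an iterator, an index stays meaningful
// across a reallocation, as long as it remains below v.size().
int read_after_growth(std::vector<int>& v, std::size_t index) {
    v.push_back(0);   // may reallocate
    return v[index];  // still refers to the same logical element
}
```

Nothing in the C++ type system stops code from keeping an iterator across such a `push_back`; it is up to the programmer (or a checker) to notice.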

Again, these problems can probably be solved organically from C++ itself (which I guess is what SaferCPlusPlus is doing?), building a static analyser that tries to solve them building on the existing mechanisms in C++. But importing Rust's analysis will just get you a completely new language which has almost no use.

duneroadrunner · on Nov 18, 2016

> It's clear you don't know Rust.

Oh yeah, didn't mean to give the impression otherwise. But I think I've gained some understanding since yesterday. I'm just learning, but tell me if I'm getting this at all:

- Rust only considers scope lifetimes (and "static" lifetime which is basically like the uber scope)?

- References can only target objects with a superset (scope) lifetime.

- You can only use one non-const reference to an object per scope. This solves the aliasing issue?

> This example seems to be a SaferCPlusPlus example? I'm talking specifically about your proposal to take Rust's static analysis and use it on C++.

Sorry, I misunderstood. I thought you'd switched context. Let me try again:

There are a couple of reasons for pursuing "Rustesque" programming in C++ as opposed to in Rust itself. First let me point out that there would have to be a mechanism for distinguishing between "statically enforced" safe blocks of C++ code and the rest of the code (just like Rust's "unsafe" blocks I guess).

So then the obvious advantage is a better interface to C++ code and libraries. Rust only supports plain C (FFI) interfaces? Is that right?

But another argument is that there are multiple strategies to achieve memory safety (and code safety in general). The two popular ones are the Rust strategy and the GC strategy. One is not uniformly superior to the other. Superior maybe, but not uniformly so. Presumably the Rust strategy will be more memory efficient, and maybe theoretically faster, whereas the GC strategy might facilitate higher productivity.

If you choose Rust, you're committed to one strategy. Now, I don't know if it'll turn out to be realistic, but I'm wondering if it's possible that C++ can support both strategies. (And maybe some other ones too.) Not just different strategies in different applications, but even in the same application. The Rust static analyzer would of course only work on indicated blocks of code.

Of course writing code in one strategy or another would be more clunky in C++ than a language specifically designed for it, but everything's a trade-off. The question is, is it worth it?

It's easy to say the clunkiness isn't worth it, but Rust probably has the weakest argument in that respect. Right? (I mean doesn't Rust have a reputation of being clunky anyway?)

Again, I barely know any Rust, but it seems to me that the main safety functionality that Rust provides over, say, SaferCPlusPlus, is the static enforcement of "one non-const reference to an object per scope" as an efficient, but restrictive, solution to the aliasing issue.

Hmm, obviously I have to find some time to learn Rust better, but intuitively, it seems like the simple Rust examples I've seen so far would have a corresponding C++ implementation, and it's not immediately obvious to me why a static analyzer couldn't work on the corresponding C++ code. Is there a simple example that demonstrates the problem? Am I just underestimating the difficulty of static analysis?

Manishearth · on Nov 18, 2016

> You can only use one non-const reference to an object per scope. This solves the aliasing issue?

More accurately, if you have a mutable reference you cannot have any other references.

> Rust only supports plain C (FFI) interfaces? Is that right?

Yes, but with bindgen you have a decent C++ interface.

My contention is that the "better interface" is only slightly better, and probably not enough to justify basically creating a whole new language. Note that for your safe RustyCPP code, the regular-C++ code will be completely unsafe to use and you'll have to write some safety wrappers that encode the guarantees you need. I've been doing this in the Rust integration in Firefox, and I'm sure that a dialect of C++ that uses Rust's rules will need to do something similar. That's where the bulk of the integration cost comes from.

> If you choose Rust, you're committed to one strategy

I mean, you can just blindly use Rc<T> or Gc<T> in Rust (Gc<T> only exists as a POC right now but we plan to get a good one up some day).

But yeah, magical pervasive GC would be hard to do in Rust.

> The question is, is it worth it?

You're arguing between choosing Rust vs CPP-with-static-analysis. I'm arguing between choosing Rust vs CPP-with-Rust-esque-static-analysis. I think the latter strongly points towards Rust, but the former has interesting tradeoffs.

> I mean doesn't Rust have a reputation of being clunky anyway?

Not ... really? It has a reputation for having a steep initial learning curve.

> it seems like the simple Rust examples I've seen so far would have a corresponding C++ implementation

Oh, this would work. But the reverse -- taking C++ code and making it work under the Rust rules -- is very hard. Not because of the aliasing rules, but because of how copy/move constructors are used in C++ (Rust's model strongly depends on initialization being necessary), the whole duck-typed-templates thing in C++, and similar things with respect to coding patterns that don't translate well.
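One concrete instance of the move difference, as a small sketch (the function name is made up for illustration): in C++ a moved-from object is still a live, nameable object that the program may go on using, whereas Rust rejects any later use of a moved-from variable at compile time. A checker applying Rust's rules to C++ would have to decide what to do with code like this.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// After the move, `a` is still in scope and can be inspected. For
// std::unique_ptr the standard guarantees the moved-from pointer is null;
// for many other types the moved-from state is valid but unspecified.
bool moved_from_is_still_usable() {
    std::unique_ptr<int> a(new int(7));
    std::unique_ptr<int> b = std::move(a);
    return a == nullptr && b != nullptr && *b == 7;
}
```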

Again, you could build a safety system on C++ that respects these patterns, but it would not be the same as taking Rust's rules and enforcing them on C++.

pcwalton · on Nov 17, 2016

> Well, like I said in the other comment, you guys could fix that by unbundling the static checker in the Rust compiler and making it applicable to (a subset of) C++ code as well :)

No, we can't do that. It is incompatible with C++.

> Rather than disallow code that can't be verified to be (memory) safe, the compiler could instead inject runtime checks that would be optimized out using the same analysis that the static checker uses.

That is not possible. It would require massive bookkeeping, much like your library does. That would eliminate most of the benefits of Rust.