It might be the language he is looking for, but it might not, and more likely than not, it is not. D is one of those odd languages that most likely ought to have become a lot more popular than it did, but for one reason or another never quite caught on. Perhaps one reason is that it lacks the sense of eccentricity and novelty that other languages in its weight class have. Or perhaps it's just too unfamiliar in all the wrong ways. Whatever the case may be, popularity is in fact one of the most useful metrics for ruling out a potential language for a new project. And if D does not meet GP's requirements in terms of longevity or commercial support, I would certainly not suggest GP adopt it too eagerly simply because it happens to check off most or all of their technological requirements.
Some of these are definitely nice-to-haves*, but when you're evaluating a C++ alternative, there are higher-priority features to research first.
How are the build times? What do its package systems look like, and how populated are they? What are all its memory management options? How does it do error handling, and what does that look like in real-world code? Does it have any memory safety features, and what are their devtime/comptime/runtime costs? Does it let me participate in compile-time optimizations or computations?
Don't get me wrong, we're on the same page about wanting to find a language that fills the C++ niche, even if it will never be as ideal as C++ in some areas (since C++ is significantly worse in other areas, so it's a fair trade off). But just like dating, I'm imagining the fights I'll have with the compiler 3 months into a full time project, not the benefits I'll get in the first 3 days.
* (a) I've been using structs without typedef without issue lately, which has its own benefits, such as clarifying whether the type is simple or aggregate in param lists, while auto removes the noise in function bodies. (b) Not needing forward declarations is convenient, but afaik it has to increase compile times at least somewhat. (c) I like the consistency here, but that's merely a principle; I don't see any practical benefit.
You can use exceptions or returns for error handling.
The biggest memory safety feature it has is length-delimited arrays. No more array overflows! The cost of it is the same as in std::vector when you use the bounds-checked option. D also uses refs, relegating pointers to unusual uses. I don't know what you mean by "participating in optimizations".
(a) C doesn't have the hack that C++ has regarding the tag names. D has auto.
(b) D has much faster compile times than C++.
(c) The practical benefit is the language is much easier to master.
That doesn't make sense, because why would you make something opaque and then expose it again immediately on the same line?
The others are ... different. I can't tell whether they are really better. The second one, maybe, although I like it that the compiler forces me to forward type stuff, it makes the code much more readable. But then again, I don't really see the benefit of
import foo;
vs
#include <foo>
.
include vs import is no difference. # vs nothing makes it clear that it is a separate feature instead of just a language keyword. < vs " makes it clear whether you use your own stuff or stuff from the system. What do you do when your file contains spaces? Does import foo bar; work for including a single file named "foo bar"?
It's inelegant because, without the typedef, you always need to prefix it with `struct`. This is inelegant because all other types do not need a prefix. It also makes it clumsier to refactor the code (adding or subtracting the leading `struct`). The typedef workaround is extremely commonplace.
> I like it that the compiler forces me to forward type stuff, it makes the code much more readable
That means when you open a file, you see the first part of the file first. In C, what you then see is a list of forward references. This isn't what you want to see - you want to see first the public interface, not the implementation details. (This is called "above the fold", coming from what you see in a folded stack of newspapers for sale. The headlines are not hidden below the fold or in the back pages.) In C, the effect of the forward reference problem is that people tend to organize the code backwards, with the private leaf functions first and the public functions last.
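To make the tradeoff concrete, here is a minimal C sketch (hypothetical names): putting the public function first means forward-declaring every private helper it calls.

    #include <stdio.h>

    static int helper(int x);        /* forward declaration, needed to put the public function first */

    int public_api(int x)            /* public interface "above the fold" */
    {
        return helper(x) + 1;
    }

    static int helper(int x)         /* private leaf function */
    {
        return x * 2;
    }

    int main(void)
    {
        printf("%d\n", public_api(20));   /* prints 41 */
        return 0;
    }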
> include vs import is no difference
Oh, there is a looong list of kludgy problems stemming from a separate macro processor that is a completely distinct language from C. Even the expressions in a macro follow different rules than in C. If you've ever used a language with modules, you'll never want to go back to #include!
> What do you do when your file contains spaces?
A very good question! The module names must match the filename, and so D filenames must conform to D's idea of what an identifier is. It sounds like a limitation, but in practice, why would one want a module name different from its filename? I can't recall anyone having a problem with it. BTW, you can write:
import core.stdc.stdio;
and it will look up `core/stdc/stdio.d` (Linux, etc.) or `core\stdc\stdio.d` on Windows.
We obviously disagree about the code organization we prefer; I find that rather elegant, but this doesn't sound like a substantial discussion. You, as the language author, are obviously quite content with the choices D made.
> This is inelegant because all other types do not need a prefix.
I don't find that. It makes it possible to clearly distinguish between transparent and opaque types. That struct tags live in a separate namespace also makes it possible to use the same identifier for the type and the object, which is not always a good choice, but sometimes, when there really is no point in inventing a pointless name for one of the two, it really is. (So I can write struct message message; .) It also makes it really easy to create ad-hoc types, which honestly is the killer feature that convinced me to switch to C. I think this is the most elegant way to create new types for single use, short of getting rid of explicit types altogether.
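A minimal sketch of what I mean (the names are made up):

    #include <stdio.h>

    int main(void)
    {
        /* Ad-hoc aggregate type, defined right where it is needed. Because
           struct tags live in their own namespace, the type and the object
           can share the same identifier. */
        struct message { int id; const char *text; };
        struct message message = { 1, "hello" };
        printf("%d: %s\n", message.id, message.text);
        return 0;
    }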
> It also makes it clumsier to refactor the code (adding or subtracting the leading `struct`).
I never had that problem, and don't know when it occurs and why.
> The typedef workaround is extremely commonplace.
In my opinion that is not a workaround, but a feature. I also use typedefs when I want to declare an opaque type. This means that in the header file all function declarations refer to the opaque type, and in the implementation the type is only used with "struct". This also makes it obvious which types' internals you are supposed to touch and which not. (This is also what e.g. the Linux style guide recommends.)
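A minimal sketch of that pattern (file and identifier names are made up; both files are shown in one listing):

    /* counter.h: callers only ever see the opaque typedef */
    typedef struct counter counter;
    counter *counter_new(void);
    void     counter_increment(counter *c);
    int      counter_value(const counter *c);

    /* counter.c: only here is `struct counter` spelled out, so only this
       file is supposed to touch its internals */
    #include <stdlib.h>

    struct counter { int value; };

    counter *counter_new(void)               { return calloc(1, sizeof(struct counter)); }
    void     counter_increment(counter *c)   { c->value++; }
    int      counter_value(const counter *c) { return c->value; }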
> This isn't what you want to see - you want to see first the public interface, not the implementation details.
Maybe you do, but I don't. Since in C the public interface and the implementation are split into different files, this problem doesn't occur. When I want to see the interface, I'm going to read the interface definition. When I look into the implementation file, I definitely don't expect to read the interface. What I'd rather see first is the dependencies (includes) and then the internal types. This fits "Show me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowchart; it'll be obvious." Then I typically see default values and configuration. Afterwards, yes, I see the lowest-level methods.
> people tend to organize the code backwards, with the private leaf functions first and the public functions last.
Which results in a consistent organization. It also fits how you would write it in math or in an academic context, where you only use what is already defined. It makes the file readable from top to bottom. When you are just looking for a specific thing, instead of trying to read it in full, you are searching and jumping around anyway.
> Oh, there is a looong list of kludgy problems stemming from a separate macro processor that is a completely distinct language from C. Even the expressions in a macro follow different rules than in C. If you've ever used a language with modules, you'll never want to go back to #include!
A macro language is surprising for the newcomer, but you get used to it, and I don't think there is a problem with include. Textual inclusion is about the easiest mental model you can have, and it is easy to control and verify. Coming from a language with modules before learning C, I never found that to be an issue, and I find the emphasis on the bare filesystem rather refreshing.
> but in practice, why would one want a module name different from its filename?
True, I actually never wanted to include a file with spaces, but it is something where your concept breaks. Also you can write #include "foo/bar/../baz" just fine, and can even use absolute paths, if you feel like it.
> macro language is surprising for the newcomer, but you get used to it
This was one of the biggest paradigm shifts for me in mastering C. Once I learned to stop treating the preprocessor as a hacky afterthought, and realized that it's actually a first-class citizen in C and has been since its conception, I realized how beautiful and useful it really is when used the way the designers intended. You can do anything with it, literally anything, from reflection to JSON and YAML de/serialization to ad hoc generics. It's so harmonious, if unsightly, like the fat lady with far too much makeup singing the final opus.
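As one small illustration of the flavour (an X-macro sketch with made-up names; real JSON/YAML serializers are just bigger versions of the same idea):

    #include <stdio.h>

    /* Declare the fields once, then reuse the list to generate both the
       struct definition and a crude serializer. */
    #define CONFIG_FIELDS(X) \
        X(int, port)         \
        X(int, timeout)

    #define DECLARE_FIELD(type, name) type name;
    struct config { CONFIG_FIELDS(DECLARE_FIELD) };

    #define PRINT_FIELD(type, name) printf("  \"%s\": %d\n", #name, (int)c->name);
    static void config_to_json(const struct config *c)
    {
        printf("{\n");
        CONFIG_FIELDS(PRINT_FIELD)
        printf("}\n");
    }

    int main(void)
    {
        struct config c = { 8080, 30 };
        config_to_json(&c);   /* prints the fields by name: "reflection" via the preprocessor */
        return 0;
    }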
D accomplishes this by using Compile Time Function Execution to build D source code from strings, and then inline compiling the D code. Learning a macro language is unnecessary, as it's just more D code.
> You can do anything with it, literally anything, from reflection to JSON and YAML de/serialization to ad hoc generics.
Wow. Do you have any pointers? I always thought arbitrary computation with it is hard, because it doesn't really want to do recursion by design. Or are you talking about using another program as the preprocessor?
Yes, because I also don't know what this is supposed to mean. The product of two addresses? Dereferencing one pointer and then combining them without an operator? And what's the type going to be, "pointer squared"?
Also what has this to do with the current discussion?
> Also what has this to do with the current discussion?
The point is C does not allow doing anything you want. The C type system, for example, places all kinds of restrictions on what code can be written. The underlying CPU does not have a type system - it will multiply two pointers just fine without complaint. The CPU does not even have a concept of a pointer. (The C preprocessor doesn't have a notion of types, either.)
The point of a type system is to make the code more readable and reduce user errors.
We have a difference of opinion on C. Mine is that C should have better rules to make code more readable and reduce user errors. Instead it remains stuck in a design from the 1970s, and has compromised semantics that result from the severe memory constraints of those days. You've defended a number of these shortcomings as being advantages.
Just for fun, I'll throw out another one. The C cast syntax is ambiguous:
(T)(3)
Is that a function call or a cast of 3 to type T? The only way to disambiguate is to keep a symbol table of typedef's so one can determine if T is a type or not a type. This adds significant complexity to the parser, and is completely unnecessary.
The fix D has for this is:
cast(T)(3)
where `cast` is a keyword. This has another advantage in that casts are a blunt tool and are associated with hiding buggy code. Having `cast` be easily searchable makes for better code reviews.
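A tiny illustration (T is just a placeholder name):

    typedef int T;
    int x = (T)(3);     /* with the typedef above, this is a cast of 3 to int */

    /* If T had instead been declared as a function, e.g. `int T(int);`,
       the very same tokens (T)(3) would parse as the call T(3). The parser
       cannot tell which without consulting the symbol table. */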
> The point is C does not allow doing anything you want.
I thought we were discussing specific issues; I did not claim that C doesn't have things that could be different. For example, the interaction of integer promotion and fixed-size types is completely broken (as in, you can't write correct portable code) in my opinion.
> The C type system, for example, places all kinds of restrictions on what code can be written. The underlying CPU does not have a type system - it will multiply two pointers just fine without complaint. The CPU does not even have a concept of a pointer.
As you wrote, a pointer is not an address. The CPU lets you multiply addresses, but C also lets you multiply addresses just fine. The type for that is uintptr_t. Pointers are not addresses, e.g. ptr++ does not in general increment the address by one.
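A small sketch of both points (illustrative only):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        int a[2] = { 0, 0 };

        /* To multiply addresses, convert the pointers to integers first. */
        uintptr_t product = (uintptr_t)&a[0] * (uintptr_t)&a[1];
        printf("%ju\n", (uintmax_t)product);

        /* ptr++ does not add 1 to the address; it advances by sizeof(int). */
        int *p = &a[0];
        int *q = p;
        q++;
        printf("%td\n", (char *)q - (char *)p);   /* typically prints 4 */
        return 0;
    }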
> The C preprocessor doesn't have a notion of types, either.
It doesn't even have a concept of symbols and identifiers, which makes it possible for you to construct these.
> You've defended a number of these shortcomings as being advantages.
Because I think they are. It's not necessarily the reason why they are there, but they can be repurposed for useful stuff and often are. Also resource constraints often result in a better product.
I still only declare variables at the beginning of a new block, not because I won't write C99+ (I do), but because it makes the code easier to read when you can reason about the participating variables up front. I can still introduce a variable when I feel like it, just by starting a new block. This also lets me decide when the variables go out of scope again, so my variables only exist for the time I really want them to, even if that is only for 3 lines.
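For example (a minimal sketch of that style):

    #include <stdio.h>

    int main(void)
    {
        int total = 0;

        /* The loop variables exist only inside this block, for exactly the
           few lines where they are needed. */
        {
            int i, squared;
            for (i = 1; i <= 3; i++) {
                squared = i * i;
                total += squared;
            }
        }
        /* i and squared are out of scope here; only total is still visible. */

        printf("%d\n", total);   /* prints 14 */
        return 0;
    }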
> Just for fun, I'll throw out another one.
That's just a minor problem in compiler implementation, and doesn't result in problems for the user. Using the same symbol for pointer dereference and multiplication is similar:
(a) *b
Is that a cast or a multiplication? These make for funny language quizzes, but are of rare practical relevance. Real-world compilers don't completely split syntactic and semantic parsing anyway, so they can emit better diagnostics and keep parsing after an error.
> You've defended a number of these shortcomings as being advantages.
My initial comment was about a shortcoming, which doesn't actually exist.
C++ does not allow forward references outside of structs. The point-of-instantiation and point-of-declaration rules for templates produce all kinds of subtle problems. D does not have that issue.
Yes, you absolutely can get the job done with C and C++. But neither is an elegant language, and that puts a cognitive drag on writing and understanding code.
I'm sorry, is this an in-joke or satire or something? I can't really tell. Maybe a whoosh moment, and as others have said, the GP/person you are speaking about, Walter Bright, is the creator of the D language. Maybe you didn't read your parent's post? Not saying it's intentional, but it almost seems rude to keep speaking that way about someone present in the conversation.
Bonjour monsieur, je voudrais une eclair chocolate s'il vous plait is my number one French phrase. I can usually sell it. Maybe you needed the prefixes?
The good ole' Z80 assembly code is right there unaltered on the right, but it executes using C macros. On my humble consumer laptop I get a 40,000x performance boost relative to a colleague's physical Z80 running the same code. I love the combination of nostalgia AND modern hardware performance.
I know I'm getting old when I read comments like this. It wouldn't have occurred to me in a million years that it might pair me with passengers on another flight. I'm conditioned by having first experienced this feature probably 30 years or so ago, when pairing with passengers on other flights would have been science fiction.
I don't understand. On average, for every 4 input bits we will get it right 3 times writing 0.5 bits each time and get it wrong once writing 2.4 bits once. So we write a total of 3 * 0.5 + 2.4 bits = 3.9 bits. The compressed output is 3.9/4 = 97.5% as big as the input. Not very compelling. What am I misunderstanding?
It's -log2(0.75) for getting a 75% chance right and -log2(0.25) for getting it wrong. I should have stated 0.4 bits and 2 bits respectively, not 0.5 and 2.4. Sorry! Good catch.
It's 3.2 bits vs 4 bits. That may not seem huge, but the probabilities tend to be at the more extreme ends if the predictor is any good. Once you start going towards the 99% range, you get extreme efficiency.
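Spelling that out (my arithmetic, using the corrected per-symbol costs; the 99% case is just the same formula at a higher hit rate):

    \[
    0.75\,(-\log_2 0.75) + 0.25\,(-\log_2 0.25) \approx 0.75 \cdot 0.415 + 0.25 \cdot 2 \approx 0.81 \text{ bits per input bit},
    \]
    so 4 input bits compress to roughly $4 \cdot 0.81 \approx 3.2$ bits. At a 99% hit rate,
    \[
    0.99\,(-\log_2 0.99) + 0.01\,(-\log_2 0.01) \approx 0.08 \text{ bits per input bit}.
    \]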
No one serious ever talks about "upgrading" to Tahoe without the quotes. I hope Apple are seriously embarrassed about this and determined to mend their ways.
The only reason people get confused about the Monty Hall problem is that the problem description rarely if ever makes it clear that the host knows where the car is and deliberately chooses a different door.
It's inconceivable (for example) that Paul Erdos, a world class mathematician, would fail to solve this problem if it were actually communicated clearly.
It is incredibly annoying that in the case where the host doesn't know where the car is but opens a goat door anyway, the probability goes back to 50-50.
Original rules (host knows where car is and always opens a door with a goat):
- 1/3 of the time your original choice is the car, and you should stick
- 2/3 of the time your original choice is a goat, and you should switch
Alternative rules (host doesn't know where the car is, and may open either the door with the car or a door with a goat):
- 1/3 of the time your original choice is the car, the host opens a door with a goat, and you should stick
- 1/3 of the time your original choice is a goat, the host opens a door with a goat, and you should switch
- 1/3 of the time your original choice is a goat, the host opens the door with the car, and you're going to lose whether you stick or switch
So even under the new rules, you still only win 1/3 of the time by consistently sticking. You're just no longer guaranteed that you can win in any given game.
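A quick simulation of both rule sets bears this out (a sketch; the names are illustrative, and wins are counted over all games, including the ones the host spoils):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        enum { GAMES = 1000000 };
        int stick_wins_original = 0, stick_wins_alt = 0, host_reveals_car = 0;
        srand(42);

        for (int i = 0; i < GAMES; i++) {
            int car = rand() % 3;
            int pick = rand() % 3;

            /* Original rules: the host knowingly opens a goat door, so
               sticking wins exactly when the first pick was the car. */
            if (pick == car)
                stick_wins_original++;

            /* Alternative rules: the host opens a random other door,
               which may reveal the car. */
            int opened;
            do { opened = rand() % 3; } while (opened == pick);
            if (opened == car)
                host_reveals_car++;        /* neither sticking nor switching can win */
            else if (pick == car)
                stick_wins_alt++;
        }

        printf("original rules, stick wins:              %.3f\n", (double)stick_wins_original / GAMES);
        printf("alternative rules, stick wins:           %.3f\n", (double)stick_wins_alt / GAMES);
        printf("alternative rules, host reveals car:     %.3f\n", (double)host_reveals_car / GAMES);
        printf("alt rules, stick wins given goat shown:  %.3f\n",
               (double)stick_wins_alt / (GAMES - host_reveals_car));   /* about 0.5 */
        return 0;
    }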
Well yes, if you throw out half of the instances where your original choice was wrong, then the chance your original choice was correct will inevitably go up.
That would indeed be annoying, but I doubt it is the case. If you only consider this scenario, it cannot be distinguished by conditional probability from the case that the host knows, and so the math should stay the same.
As usual, the problem is not an incredibly difficult problem, but just a failure to state the problem clearly and correctly.
Try to write a computer program that approximates the probability, and you'll see what I mean.
Your program shows exactly what I mean: "Impossible" cannot be non-zero; your modified question is not well-defined.
Yes, of course it depends on the host knowing where the goat is, because if he doesn't, the scenario is not well-defined anymore. This is not annoying, this is to be expected (pun intended).
The scenario is well-defined. There's nothing logically impossible about the host not knowing which door has the car, and still opening the goat door.
"Impossible" in the program just refers to cases where the host picks the car door, i.e. the path that we are not on, by the nature of the statement. Feel free to replace the word "impossible" with "ignored" or "conditioned out". The math remains the same.
No, sorry, it is not well-defined. But I should have been clearer. What is not well-defined? Well, the game you are playing. And, without a game, what mathematical question are you even asking?
You cannot just "ignore" or "condition out" the case that there is a car behind the opened door, the game doesn't make any sense anymore then, and what you are measuring then makes no sense anymore with respect to the game. In order to make it well-defined, you need to answer the question what happens in the game when the door with the car is opened.
You can, for example, play the following game: the contestant picks a door, the host opens one of the other doors, and now the contestant can again pick one of the three doors. If there is a car behind the door the contestant picks, the contestant wins. Note that in this game, the contestant may very well pick the open door. The strategy now is obviously to pick the open door if there is a car behind it, and to switch doors if there is not. I am pretty sure that when you simulate this game, you will see that it doesn't matter whether the host knows where the car is (and uses this knowledge in an adversarial manner) or not.
The game you seem to want to play instead goes as follows: If the door with the car is opened, the game stops, and nobody wins or loses. Let's call this outcome a draw, and forget about how many times we had a draw in our stats. But you can see now that this is an entirely different game, and it is not strange that the resulting stats are different than for the original game.
I know of two distinct methods of encoding any legal chess position into 24 bytes worst case. In both cases, you get the full position, plus who is to move, plus full information on future castling and en-passant possibilities. This is the FEN state of the board, minus the two counts. It's more than the information you get from a published chess diagram in a book or magazine: although in a book or magazine "who is to move" is inevitably represented somehow, castling and en-passant possibilities usually are not.
Method 1: the Lichess method; a 64-bit header with 1 bit per square indicating occupied squares, then (up to) 32 4-bit codes for the 32 occupied squares. So 24 bytes worst case!
Method 2: my own method; a list of 2-bit codes, where one of the 4 codes indicates an empty square and the other three codes are prefixes for a second 2-bit code. Three prefixes applied to one of 4 code values gives a total of 12 possibilities, corresponding to any possible chess piece. Worst case 32x2 bits plus 32x4 bits = 24 bytes.
In each case there is sufficient entropy to create tricks to add the supplementary information (who is to move, etc.); similar tricks are in the original article.
I mention my own method from my Tarrasch Chess GUI https://github.com/billforsternz/tarrasch-chess-gui only for completeness. If I had known about method 1 I would have used that, it is simpler and better and there is much more entropy available making the tricks easier.
I would urge commentators to keep clear the difference between compressing a chess position (this challenge), a chess move, and a chess game. A chess move needs far fewer bits, of course. A complete chess game is always encoded as a list of moves, not positions, for this reason.
Edit: I should have mentioned that the chief advantage of method 1 over method 2 is average bits required. An empty board is 64x1 bits = 8 bytes for method 1 and 64x2 bits = 16 bytes for method 2.
Edit 2: I am going to describe my tricks, just because they are fun. Two kings of the same colour means Black to move. Two White kings means the first king is white; two Black kings means the first king is black. Otherwise White to move. A friendly pawn on the first rank means an en-passant-vulnerable pawn. Swap it with the 4th rank square to get the actual contents of both squares. A hostile pawn on the first rank is an unmoved rook or king, establishing all castling rights. The castling and en-passant tricks can be combined in a pleasant and harmonious way.
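To make Method 1's layout concrete, here is a rough sketch of an encoder (hypothetical code, not the Lichess or Tarrasch implementation; the nibble assignments and input format are made up):

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* 64-bit occupancy header followed by one 4-bit code per occupied square.
       Input: 64 chars, '.' for empty, otherwise a piece letter (assumed valid). */
    static size_t encode(const char board[64], uint8_t out[24])
    {
        const char *pieces = "PNBRQKpnbrqk";     /* 12 piece kinds -> codes 0..11 */
        uint64_t occupied = 0;
        size_t nibbles = 0;

        memset(out, 0, 24);
        for (int sq = 0; sq < 64; sq++) {
            if (board[sq] == '.')
                continue;
            occupied |= (uint64_t)1 << sq;
            uint8_t code = (uint8_t)(strchr(pieces, board[sq]) - pieces);
            out[8 + nibbles / 2] |= (nibbles % 2) ? code : (uint8_t)(code << 4);
            nibbles++;
        }
        memcpy(out, &occupied, 8);               /* the 8-byte header */
        return 8 + (nibbles + 1) / 2;            /* bytes used, at most 24 */
    }

    int main(void)
    {
        const char start[65] =
            "RNBQKBNR" "PPPPPPPP" "........" "........"
            "........" "........" "pppppppp" "rnbqkbnr";
        uint8_t buf[24];
        printf("encoded bytes: %zu\n", encode(start, buf));   /* 24 for the start position */
        return 0;
    }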
I recall idly looking through the manual of our Bosch dishwasher when it was delivered and seeing that they offered to share GPL'ed source code from the machine's embedded guts. I thought to myself, "that's kind of interesting, I'll take them up on that". So I emailed the address they provided for this purpose. I got an auto email back saying, effectively, "No. You're not an authorised person, we don't recognise your email address, we don't know who you are, we're not going to talk to you."
Oh well. Big Corp doing what Big Corps do. Paying lip service to legal requirements, but reluctantly and with barriers that would no doubt take a lot of time and money to even try and break down.
I was troubled by my own comment. How exactly did Bosch handle this? I went back and checked, and in fact the rejection email came from their email server. It was an "access denied" type bounce that I originally interpreted as a "you don't have access" message, which left me annoyed, and I took away and remembered a wrong impression. Looking more carefully, the message doesn't mean anything subtle; it just means the email address (oss-request@bshg.com, for the record) doesn't exist. Which is bad, but not nearly as bad as I portrayed it above. Apologies (for the record) to Bosch.