OCaml was single threaded until this or what is this exactly? There is no descri...

gmfawcett · on Jan 10, 2022

It was very similar to the Python story: native threads, but there is a global lock on the OCaml runtime. You can fire off a thread and have it stay busy in external code (e.g. in a C library), but only one thread can be running OCaml code at a time.

For this reason (and others), forking child processes has been a common alternative on many OCaml projects.

See here for an update re: the whole multicore initiative, and links to more information:

https://discuss.ocaml.org/t/multicore-ocaml-december-2021-an...

Blikkentrekker · on Jan 10, 2022

I would still fork child processes and use IPC, probably until there is something similar to channels, and often even then, because signal handling and forking in a multithreaded program are quite a hurdle.
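As a rough illustration of the fork-plus-IPC approach, here is a minimal C sketch that does some work in a child process and sends the result back over a pipe. The helper name `sum_in_child` is hypothetical, and the sketch assumes a single-threaded parent and ignores `EINTR` and short reads for brevity.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sum 1..n in a child process and send the result
   back to the parent over a pipe. Returns 0 on success, -1 on error. */
int sum_in_child(uint32_t n, uint64_t *out)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: do the work, write the result, leave via _exit(). */
        close(fds[0]);
        uint64_t sum = 0;
        for (uint64_t i = 1; i <= n; i++)
            sum += i;
        ssize_t w = write(fds[1], &sum, sizeof sum);
        _exit(w == (ssize_t)sizeof sum ? 0 : 1);
    }

    /* Parent: read the result, then reap the child. */
    close(fds[1]);
    uint64_t result = 0;
    ssize_t r = read(fds[0], &result, sizeof result);
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (r != (ssize_t)sizeof result ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    *out = result;
    return 0;
}
```

Calling `sum_in_child(100, &out)` should leave 5050 in `out`. Parent and child share nothing after the fork except the pipe, which is the point of the design.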

Even in Rust, no one really knows at this point what is safe to do in a fork from a multithreaded program or in a signal handler in one. Signal handlers and forks are thus simply “unsafe” in Rust with “care must be taken”, but there is no real explanation either of what care, and Rust does not document or stabilize which of its functions are async safe, as C does.

gmfawcett · on Jan 10, 2022

If I were a Rust maintainer, I wouldn't say what was safe either. :) This is an OS issue, not a language/runtime one. For example, POSIX has a lot to say about what is safe to do, and what isn't, after a fork() without exec() [1] and your OS of choice may have more to add. As you've said, it's perilous and messy.

[1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/fo...

Blikkentrekker · on Jan 10, 2022

I certainly know the underlying reasons, but Rust aspires to be a language with clear guarantees for what is safe, and what is not.

POSIX C documents which functions are async safe, and specifies that only those functions may be used in certain contexts lest there be undefined behavior. Rust at this moment does not document which of its functions are safe to use in such contexts, and I'm not sure what the situation with OCaml is.

Sharing state between threads in a language such as C is complex enough, but in a managed language there seem to be few guarantees. Go and Rust were designed with this in mind and fundamentally only allow either sharing data by way of channels, or by way of data structures whose reading and writing is race-free.

gmfawcett · on Jan 10, 2022

The received wisdom is that the only safe fork is the one followed immediately by an exec(e,l,p,v). If you can't exec(), then keep your design as simple as possible, consult your OS manual, and address your failure modes.
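A minimal C sketch of that "fork, then exec immediately" pattern might look like the following. The helper name `run_and_wait` is hypothetical, and the exit code 127 for a failed exec is a shell convention adopted here as an assumption.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at `path` with `argv` and return its
   exit status, or -1 if fork/wait failed or the child did not exit normally. */
int run_and_wait(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: nothing but exec. If exec fails, leave with _exit(), which
           skips atexit handlers and stdio flushing inherited from the parent. */
        execv(path, argv);
        _exit(127);
    }

    /* Parent: wait for the child and report how it ended. */
    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, running `/bin/sh` with the arguments `sh -c "exit 7"` should return 7. The child touches no locks, allocator state, or buffered I/O before the exec, so whatever other threads held at the moment of the fork does not matter.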

Asking any compiler to prove that your program's fork() is safe is asking an awful lot. :) If you're referring to Rust's notion of safe/unsafe code, "unsafe" explicitly refers to unsafe memory operations. It won't cover operational issues such as "threads won't be cloned" or "signal handlers will get dropped" during a fork. (Even if they were tracked: are those behaviours safe, or unsafe? That depends on the semantics of the program.)

Ada was another language designed with safety in mind. They chose to shove fork() into an "Unsafe POSIX Primitives" package -- in other words, use these at your own risk. I think this was a good call.

Blikkentrekker · on Jan 10, 2022

> The received wisdom is that the only safe fork is the one followed immediately by an exec(e,l,p,v). If you can't exec(), then keep your design as simple as possible, consult your OS manual, and address your failure modes.

But this is not true at all: calling any async safe function is safe, including exec, which is simply an async safe function.

Furthermore, Rust's standard library does not even expose exec directly, but does expose a function to call arbitrary code before exec, but this function is marked as unsafe and the only documentation is that “care must be taken”.

For instance, socket activation in the launchd style that systemd and stocketd use must execute various socket-related operations after forking and before exec. These operations are async safe, and thus it is possible to do this from a multithreaded program, but in Rust there are no guarantees whether various functions dealing with sockets are async safe. They might call malloc or use mutexes internally; one has no idea.

> Asking any compiler to prove that your program's fork() is safe is asking an awful lot.

I'm not asking that at all, but it's actually not difficult at all. All it needs is a trait for async safety on functions, and any function that is marked as async safe can only call other functions internally that are. That is all it takes to do it safely.

But I'm not asking for a compiler proof; I'm simply asking that Rust document which functions are async safe and which are not, so that programmers can do it in unsafe code.

> Ada was another language designed with safety in mind. They chose to shove fork() into an "Unsafe POSIX Primitives" package -- in other words, use these at your own risk. I think this was a good call.

And they document which functions are async safe; Rust has neither documentation nor guarantees about this.

gmfawcett · on Jan 10, 2022

I'm sorry if I misunderstood you. I was talking about POSIX semantics of fork(), but you're talking about a Rust documentation issue re: async. I didn't catch that earlier! I hope the issue gets sorted out for you.

yawaramin · on Jan 10, 2022

> There is no description of what the PR does which is pretty bad form.

The description is in the first paragraph of the PR body text:

> This PR adds support for shared-memory parallelism through domains and direct-style concurrency through effect handlers (without syntactic support). It intends to have backwards compatibility in terms of language features, C API, and also the performance of single-threaded code.

vrotaru · on Jan 10, 2022

OCaml had threads. There is even a Thread module in the standard library. The limitation was that only one thread can run at any given time.

P.S. There were ways around it, like calling an external C function which will spawn a new thread and pass control back, but those were almost never used.

Mikeb85 · on Jan 10, 2022

Ish. OCaml didn't have any constructs in the language itself for parallelism. You could fork processes and such, and libraries did/do exist to write parallel code in OCaml.