In principle, saving the game state from a thread is a much harder problem, because the game state keeps mutating while you serialize it. You need to be extremely careful to ensure that the state you serialize is consistent and is at least a plausible game state (even if there is a bit of wiggle room to allow for game states that technically never existed). This complexity invades every aspect of game logic, and bugs here tend to show up as subtle corruption of your save data that takes a while to notice, which obscures the connection between the corruption and the save/reload cycle.
In contrast, forking with its COW semantics is conceptually easy. You just fork. The main process continues running, and the child process gets a frozen snapshot. There is some overhead from the copy part of copy-on-write. However, most of that overhead will likely be spent in the first frame, which is still a significant improvement over the pause time associated with stop-the-world saving. In practice, coding for the child process is tricky. However, it is self-contained and responsible only for a relatively simple problem: no complex problems to solve, just a relatively small amount of code that needs to be written carefully.
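To make the "you just fork" idea concrete, here is a minimal sketch in Python on a POSIX system. The function name `save_snapshot_via_fork`, the JSON format, and the toy `state` dictionary are all assumptions for illustration, not anything a real engine prescribes. The child serializes the memory image it inherited at the moment of the fork, while the parent keeps mutating its own copy.

```python
import json
import os
import tempfile


def save_snapshot_via_fork(state, path):
    """Fork; the child writes its frozen view of `state` to `path` and exits.

    Hypothetical helper for illustration. Returns the child's pid.
    """
    pid = os.fork()
    if pid == 0:
        # Child: its memory is a copy-on-write snapshot taken at fork time.
        status = 1
        try:
            tmp = path + ".tmp"
            with open(tmp, "w") as f:
                json.dump(state, f)
                f.flush()
                os.fsync(f.fileno())
            # Rename into place so a crash never leaves a half-written save.
            os.replace(tmp, path)
            status = 0
        finally:
            # Exit immediately, without running the parent's cleanup handlers.
            os._exit(status)
    return pid


state = {"frame": 100, "player": {"hp": 42}}
save_path = os.path.join(tempfile.mkdtemp(), "save.json")

child = save_snapshot_via_fork(state, save_path)

# The parent keeps "playing"; these writes do not reach the child's snapshot.
state["frame"] = 101
state["player"]["hp"] = 0

_, wait_status = os.waitpid(child, 0)
with open(save_path) as f:
    saved = json.load(f)
```

The only part that needs care is the child branch: it should touch as little as possible and leave through `os._exit` so that it never runs code meant for the game process.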
The RAM usage is a real trade-off inherent in the approach.
> You quit the game, the parent process exits, and the serialisation process gets reparented to init, invisibly using up your RAM until you reboot.
Or until the short-lived child process finishes its work and exits on its own.
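While the game is still running, the parent can also collect the finished child without ever blocking a frame. The following is a rough sketch under assumptions: `start_child` and `poll_child` are hypothetical names, and the `time.sleep` in the child merely stands in for the serialization work.

```python
import os
import time


def start_child(work_seconds):
    """Fork a short-lived child; sleeping stands in for writing the save."""
    pid = os.fork()
    if pid == 0:
        time.sleep(work_seconds)
        os._exit(0)
    return pid


def poll_child(pid):
    """Return the child's exit code if it has finished, else None. Never blocks."""
    done_pid, status = os.waitpid(pid, os.WNOHANG)
    if done_pid == 0:
        return None  # child is still working
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


pid = start_child(0.05)
frames = 0
exit_code = None
while exit_code is None:
    frames += 1          # one iteration of a pretend game loop
    time.sleep(0.005)
    exit_code = poll_child(pid)
```

If the parent quits first, the child is reparented as described in the quote, finishes its write, and exits on its own.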
If they have the ability to pause the entire game in a consistent state while it is being saved (the foreground save case), then they certainly have the ability to pause it in a consistent state long enough to fork. The latter pause will just be much, much shorter.
Ah, I see. I was assuming that the consistent state had to be achieved somehow in the child.
If the parent can reach a consistent state (e.g. by doing the equivalent of pressing "pause"), why not do the following instead:
While paused, memcpy the game state's memory to a buffer, then do two things at once: resume the game, and spawn a thread to write the buffer to disk. In C++, a copy constructor might be even more convenient than memcpy.
This will introduce a short delay for the copy, at the speed of RAM bandwidth.
But that copy will need to be done anyway, straight away, as the parent poster says:
> that [copy] overhead will likely be spent in the first frame
With fork() the copy just happens in the kernel instead of in userspace, and is thus likely slower (thousands of individual page faults taken one after another, instead of a single contiguous allocation and copy).
So if the fork() approach can somehow do it faster, I'd be curious by what mechanism.
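For comparison, the pause, copy, resume, and write-in-a-thread alternative described above could look roughly like this. It is a sketch under assumptions: `save_snapshot_via_copy` is a hypothetical name, `copy.deepcopy` plays the role of the memcpy or copy constructor, and JSON is an arbitrary choice of format.

```python
import copy
import json
import os
import tempfile
import threading


def save_snapshot_via_copy(state, path):
    """Copy `state` eagerly (call this while the game is paused), then write
    the private copy to disk on a background thread. Returns the thread."""
    snapshot = copy.deepcopy(state)  # the whole copy cost is paid here, up front

    def write():
        tmp = path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(snapshot, f)
        os.replace(tmp, path)  # rename into place once fully written

    writer = threading.Thread(target=write)
    writer.start()
    return writer


state = {"frame": 100, "player": {"hp": 42}}
save_path = os.path.join(tempfile.mkdtemp(), "save.json")

writer = save_snapshot_via_copy(state, save_path)

# The game resumes and mutates the live state; the writer only sees its copy.
state["frame"] = 101
state["player"]["hp"] = 0

writer.join()
with open(save_path) as f:
    saved = json.load(f)
```

The difference from the fork version is where the copy is paid for: here it is all paid inside the pause, whereas copy-on-write spreads it over whichever pages the game touches afterwards.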