Say function A calls into generator B, which has entry points X, Y and Z. Maybe you could generate code for functions A<X>, A<Y> and A<Z>, each specialised on the entry point into B. When B yields, it actually "calls" the appropriate A, but does something funny with the stack pointer. Maybe?
The generated code size would explode if you had too many coroutines going at once, you'd need a lot of things to be known at compile time, and maybe the specific things you'd need to do to the stack would slow you down. I don't think you'd have to mess with the return address, though, so maybe it wouldn't be so bad.
Yes of course. At that point any standard optimisation that can be applied to normal code is applicable.
In this case you wouldn't expect any call at all: the switch statement is inlined into the caller, and standard constant propagation can remove the switch itself.
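To make that concrete, here is a rough sketch of a generator lowered to a switch-based state machine. The names `GenB`, `resume` and `sum_three` are made up for illustration, and whether a given compiler actually folds everything away depends on its optimiser and flags.

```cpp
// Hypothetical generator B with three entry points (X, Y, Z),
// hand-lowered to a state machine that dispatches with a switch.
struct GenB {
    int state = 0;  // entry point to resume at: 0 = X, 1 = Y, 2 = Z
    int base = 0;

    // Defined inside the struct, so it is implicitly inline and
    // visible to the optimiser at every call site.
    int resume() {
        switch (state) {
            case 0: state = 1; return base + 1;  // entry X
            case 1: state = 2; return base + 2;  // entry Y
            case 2: state = 3; return base + 3;  // entry Z
            default: return 0;                   // generator finished
        }
    }
};

// Hypothetical caller A. After inlining, `state` is a known constant
// before each resume(), so constant propagation can pick the single
// live case and drop the switch.
int sum_three(int base) {
    GenB g;
    g.base = base;
    int total = 0;
    total += g.resume();  // state known to be 0
    total += g.resume();  // state known to be 1
    total += g.resume();  // state known to be 2
    return total;
}
```

With optimisation on, I'd expect `sum_three` to reduce to something like `3 * base + 6` with no call and no switch left, but check the generated assembly for your compiler rather than taking my word for it.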