But that's not really beating the sampling theorem.
Sampling is a self-contained symbol-transmission technique that is absolutely signal-agnostic. You can throw anything at it, including band-limited noise with no redundancy. As long as there's nothing in the signal above N/2, where N is the sampling rate, it works.
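To make the signal-agnostic point concrete, here is a rough sketch rather than anything authoritative. It assumes the signal is periodic on the unit interval (so N is samples per period and the interpolation sum is finite), and the names `make_signal`, `reconstruct` and `max_error` are made up for illustration. It builds random band-limited "noise", samples it, and rebuilds it at off-grid points using nothing but the samples.

```python
import cmath
import math
import random

def make_signal(max_harmonic, seed=0):
    """Random periodic 'noise' with harmonics 1..max_harmonic and no structure."""
    rng = random.Random(seed)
    comps = [(k, rng.gauss(0, 1), rng.uniform(0, 2 * math.pi))
             for k in range(1, max_harmonic + 1)]
    return lambda t: sum(a * math.cos(2 * math.pi * k * t + p)
                         for k, a, p in comps)

def reconstruct(samples, t):
    """Trigonometric interpolation: uses only the samples, no signal model."""
    n = len(samples)
    half = (n - 1) // 2  # highest harmonic strictly below N/2
    total = 0j
    for k in range(-half, half + 1):
        # DFT coefficient for harmonic k, computed from the samples alone
        c_k = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples)) / n
        total += c_k * cmath.exp(2j * math.pi * k * t)
    return total.real

def max_error(max_harmonic, n_samples):
    """Worst reconstruction error over points that are mostly off the sample grid."""
    x = make_signal(max_harmonic)
    samples = [x(i / n_samples) for i in range(n_samples)]
    return max(abs(x(j / 97) - reconstruct(samples, j / 97))
               for j in range(97))

if __name__ == "__main__":
    print("content below N/2:", max_error(10, 21))  # harmonics up to 10, N = 21
    print("content above N/2:", max_error(15, 21))  # harmonics up to 15 alias
```

The first case should come back at floating-point noise level; the second should not, because harmonics 11 to 15 fold back onto lower ones.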
As soon as you include redundancy and priors, you're solving a different problem. Your channel is no longer signal-agnostic, and you can use known information external to the signal to reconstruct it. The advantage is you can use less data, but the disadvantage is that your reconstruction system has to make strong assumptions about the nature of the signal.
If those assumptions are incorrect, reconstruction fails.
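A toy version of that trade-off, as a sketch under stated assumptions: suppose the prior is "the signal is a single tone at a known frequency", so only amplitude and phase are unknown. The function names, the tone frequency of 40, and the two sample times are all hypothetical choices for illustration. Two samples then pin the signal down, where plain sampling would need a rate above 80.

```python
import math

def tone(freq, a, b):
    """x(t) = a*cos(2*pi*freq*t) + b*sin(2*pi*freq*t)."""
    return lambda t: (a * math.cos(2 * math.pi * freq * t)
                      + b * math.sin(2 * math.pi * freq * t))

def fit_known_tone(freq, t1, x1, t2, x2):
    """Solve the 2x2 system for (a, b), assuming the frequency is known."""
    c1, s1 = math.cos(2 * math.pi * freq * t1), math.sin(2 * math.pi * freq * t1)
    c2, s2 = math.cos(2 * math.pi * freq * t2), math.sin(2 * math.pi * freq * t2)
    det = c1 * s2 - s1 * c2
    if abs(det) < 1e-12:
        raise ValueError("sample times do not determine amplitude and phase")
    # Cramer's rule
    return (x1 * s2 - x2 * s1) / det, (c1 * x2 - c2 * x1) / det

def worst_error(true_freq, assumed_freq):
    """Reconstruct from two samples under the prior; report the worst error."""
    truth = tone(true_freq, 1.0, 0.5)
    t1, t2 = 0.1, 0.37  # arbitrary sample times, chosen only for the demo
    a, b = fit_known_tone(assumed_freq, t1, truth(t1), t2, truth(t2))
    guess = tone(assumed_freq, a, b)
    return max(abs(truth(j / 200) - guess(j / 200)) for j in range(200))

if __name__ == "__main__":
    print("prior correct:", worst_error(40, 40))
    print("prior wrong:  ", worst_error(41, 40))
```

When the assumed frequency matches, the error is at rounding level. When the real tone is at 41 instead of 40, the fit still passes through both samples but is wrong almost everywhere else, and nothing in the two samples tells you so.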
Technically you could argue that the N/2 limit is a form of prior, and sampling is a special instance of a more general theory of channel transmission systems where assumptions are made.
The practical difference is that N/2 filtering followed by sampling at N is relatively trivial with a usefully general result. More complex systems can be more powerful for specific applications, but are more brittle and can be harder to construct.
It depends on what you mean by 'beating' the theorem, doesn't it?
If you mean showing that the theorem is actually wrong, then I agree: the proof works, and you can't avoid its conclusions given its assumptions.
But I think we can agree that for many (most?) practical symbol-systems in their particular contexts, the signals are actually highly redundant as viewed against a particular basis, so slavishly applying the conclusions of the sampling theorem will cause you to miss the possibility of side-stepping its assumptions entirely.
Whether you're really in a position to take advantage of the latent priors in the signal depends very much on the tools you have. Obviously you will not be extracting speech with smart priors if all you have is analog filters.