Better representations are already used for downstream data analysis and continue to be improved; see e.g. http://arxiv.org/abs/1506.08452 .
However, algorithms for producing variant calls from raw sequencer output are also improving over time, and the way to get the most benefit from them is to save the old raw data so newer algorithms can be applied to them. That's where the storage challenges come into play.
However, algorithms for producing variant calls from raw sequencer output are also improving over time, and the way to get the most benefit from them is to save the old raw data so newer algorithms can be applied to them. That's where the storage challenges come into play.