If you have a specific use case in mind, you might as well just test that use case directly. E.g. PTS (Phoronix Test Suite) is not a traditional benchmark so much as a workload tester, and it's right up that alley. You do then run into the problem of "the workload I care to compare doesn't run on an iPhone at all", at which point you either go back to comparing generalities or stop caring about the comparison.