The real issue is that studying programmers is expensive. In class assignments or contests you can find a huge range of times to completion. But paying a statistically significant number of programmers to solve the same task is prohibitively expensive, let alone a wide range of identical tasks.
PS: I have completed the same task as other programmers in less than 10% of the time. However, assessing overall productivity requires that someone be consistently faster on a wide range of tasks, not just that they happen to solve something in 10% or even 1% of the time.
I dunno, introductory level programming classes create an opportunity to observe anywhere from 5 to 200 people solving the same problem.
I've never taught a CS course, but back in my college days I helped a friend with his CS homework by taking advantage of weak file permissions to copy other people's homework. The shocking thing was that, looking at a small sample, most of the programs didn't work correctly. To complete the assignment our way, we had to fix bugs in programs that other people wrote, which is fantastic preparation for a professional programmer ;-)
But using introductory programming courses as an example of writing real-world code would be a huge mistake. Most of those assignments in my day were almost trivially easy for someone who actually understands the material (and almost impossible for those who don't). It would not be a fair assessment of software engineering productivity.
Worse, the homework assignments are generally completed outside of class under widely ranging conditions. (One student finds a quiet spot in a library, another is in a cramped dorm room with a roommate practicing trumpet, and another goes out and parties until dawn and tries to crank out a project three hours before it's due.)
In your attempt to weed out confounding variables, you introduce a similar variable: having all programmers work in the same conditions definitely reduces external stimuli that could impact performance, but it introduces a different variable, in that some (or many!) of the programmers are no longer working under their preferred conditions, which drops their performance.
How much of your speed in completing a task is based on prior experience with similar or identical problems, assuming your experience with the programming environment is similar to your peers'?