I think C++ actually has a very similar problem here. The complexity is worse, but I'd bet the runtime isn't an issue because, as you said, the compiler should handle it. The specific problem here is that Python has multiple ways to do things that seem pretty much equivalent but have different run times. Another example is for loops, list comprehensions, and map. All of these things iterate over a collection but all have different run times. That doesn't make much sense, imo.
> All of these things iterate over a collection but all have different run times.
List comprehensions return a list, and map returns a lazy iterator in Python 3 (it returned a list in Python 2), while a for loop doesn't necessarily return anything. I'd expect the overhead of constructing the returned object to have an impact on performance.
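If you want to check this yourself, here's a rough sketch using `timeit`. The function names, the `square` helper, and the input size are all made up for illustration, and the numbers will vary by machine and Python version, so treat it as a way to poke at the question rather than a proper benchmark.

```python
import timeit

# Hypothetical workload: square each element of a list.
data = list(range(10_000))


def square(x):
    return x * x


def with_for_loop(items):
    # Builds the result list one append at a time.
    out = []
    for x in items:
        out.append(square(x))
    return out


def with_comprehension(items):
    # Builds the same list in a single expression.
    return [square(x) for x in items]


def with_map(items):
    # map() is lazy in Python 3, so list() is needed to force the work.
    return list(map(square, items))


def with_lazy_map(items):
    # Returns the iterator without consuming it: no squaring happens here.
    return map(square, items)


if __name__ == "__main__":
    for fn in (with_for_loop, with_comprehension, with_map, with_lazy_map):
        elapsed = timeit.timeit(lambda: fn(data), number=100)
        print(f"{fn.__name__:20s} {elapsed:.4f} s")
```

Note that `with_lazy_map` isn't a fair comparison with the other three: it only creates the iterator and never calls `square`, so it mostly shows why "what gets returned" matters when comparing these constructs.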