it's a speculation, but I think it's similar with processors = nobody guarantees the code will run the way you set it up. You may want to use some specific register but if the processor will think it has another register that can fulfill the task, it'll use that but tell you that your code is executed as expected. Maybe the internal gpu processor of amd can sufficiently simulate the behavior of nvidia hardware so that higher abstractions will be unaware that something different is happening under the hood
Prettymuch. Compilers can do a lot more than people give them credit for. At least AMD document their hardware so it is actually possible to know low-level details. PTX can obfuscate that surprisingly badly for nvidia targets.