heavymemory 29 days ago \| on: Nested Learning: A new ML paradigm for continual l...

The idea is interesting, but I still don’t understand how this is supposed to solve continual learning in practice. You’ve got a frozen transformer and a second module still trained with SGD, so how exactly does that solve forgetting instead of just relocating it?
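To make the "relocating it" worry concrete, here is a toy sketch in plain Python. It is not the Nested Learning architecture: `frozen_features`, `sgd`, and the two regression tasks are all hypothetical stand-ins chosen for illustration. The only point is that freezing the base does not, by itself, stop a small SGD-trained module on top of it from overwriting what it learned earlier.

```python
# Toy illustration only: a frozen feature map plus a small trainable
# module updated with plain SGD. All names and tasks are made up.

def frozen_features(x):
    # Stand-in for the frozen backbone: a fixed map that is never updated.
    return [x, 1.0]

def predict(w, x):
    # The trainable module is just a linear layer over the frozen features.
    return sum(wi * fi for wi, fi in zip(w, frozen_features(x)))

def mse(w, data):
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

def sgd(w, data, lr=0.1, epochs=200):
    # Per-sample SGD on squared error; only w changes, never the features.
    w = list(w)
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, x) - y
            feats = frozen_features(x)
            w = [wi - lr * 2.0 * err * fi for wi, fi in zip(w, feats)]
    return w

xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
task_a = [(x, 2.0 * x) for x in xs]   # first task: y = 2x
task_b = [(x, -2.0 * x) for x in xs]  # later, conflicting task: y = -2x

w = sgd([0.0, 0.0], task_a)
loss_a_before = mse(w, task_a)  # near zero: task A is learned

w = sgd(w, task_b)              # keep training the same module on task B
loss_a_after = mse(w, task_a)   # large again: task A has been overwritten

print(f"task A loss after training on A: {loss_a_before:.6f}")
print(f"task A loss after training on B: {loss_a_after:.6f}")
```

The frozen part is untouched throughout, yet the task A loss goes from roughly zero to a large value once the same weights are trained on task B. Whether the actual method avoids this depends on mechanisms this sketch deliberately leaves out.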