Congratulations on the paper. That's some very interesting work!
But you would want to include sLSTM as well to get the best performance, right? How does the speed compare in that case, specifically when scaling up?
Thank you! I can say that speed is not really a limiting factor at the scales reported in the paper: xLSTM[7:1] is pretty much on par with xLSTM[1:0].
As for the first part: yes, we show that sLSTM is helpful on the toy tasks, and it also gives better sequence extrapolation performance, so including it is worthwhile.
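For readers unfamiliar with the notation, xLSTM[a:b] denotes the ratio of mLSTM to sLSTM blocks in the stack, so xLSTM[7:1] places one sLSTM block per seven mLSTM blocks, while xLSTM[1:0] is a pure mLSTM stack. Below is a minimal, purely illustrative sketch of what such a block layout looks like; the helper function and names are hypothetical and not taken from the official codebase:

```python
def block_pattern(num_blocks: int, mlstm_ratio: int, slstm_ratio: int) -> list[str]:
    """Lay out block types for an xLSTM[a:b] stack: a mLSTM blocks per b sLSTM blocks.

    Illustrative only; not the official implementation.
    """
    period = mlstm_ratio + slstm_ratio
    pattern = []
    for i in range(num_blocks):
        # Place the sLSTM block(s) at the end of each period of (a + b) blocks.
        pattern.append("sLSTM" if (i % period) >= mlstm_ratio else "mLSTM")
    return pattern

print(block_pattern(8, 7, 1))  # xLSTM[7:1]: seven mLSTM blocks followed by one sLSTM block
print(block_pattern(8, 1, 0))  # xLSTM[1:0]: pure mLSTM stack
```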