
I think reverse-mode automatic differentiation is significantly harder to implement than a string-indexed hash table and delimiter splitting, but maybe that's not important when those are such a small part of an interpreter for Perl or even Awk?

How big is a Transformer in TensorFlow in Python?



(I guess I should have pointed out that forward-mode automatic differentiation was only about 100–150 lines of code when I implemented it in http://canonical.org/~kragen/sw/81hacks/autodiffgraph/, depending on where you draw the lines. But gradient descent isn't practical with forward-mode autodiff unless there are very few independent variables, because each forward pass gives you the derivative with respect to only one of them. Also, even 100–150 lines of code is significantly bigger than a hash table and delimiter splitting.)
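To make the cost argument concrete, here is a minimal sketch of forward-mode autodiff with dual numbers. It is not the code at the link above; the names `Dual`, `_lift`, `gradient`, and the example function `f` are made up for illustration, and only `+` and `*` are supported.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one chosen input."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def _lift(x):
    """Treat a plain number as a constant, i.e. a Dual with derivative 0."""
    return x if isinstance(x, Dual) else Dual(float(x))


def gradient(f, xs):
    """Gradient of f at xs, using one forward pass per independent variable."""
    grad = []
    for i in range(len(xs)):
        # Seed only input i with derivative 1; every other input gets 0.
        args = [Dual(x, 1.0 if j == i else 0.0) for j, x in enumerate(xs)]
        grad.append(f(*args).dot)
    return grad


def f(x, y):
    # Hypothetical example function: x^2 * y + 3y
    return x * x * y + 3 * y


print(gradient(f, [2.0, 5.0]))  # [20.0, 7.0]
```

The loop in `gradient` is the problem: it calls `f` once per input, so a model with a million parameters would need a million forward passes for a single gradient step, where reverse mode would need roughly one forward and one backward pass.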



