In 2018, I co-wrote a blog post with the inflammatory title “Don’t use TensorFlow, try PyTorch instead” (https://news.ycombinator.com/item?id=17415321).
As it gained traction here, it was changed to “Keras vs PyTorch” (some edgy things that work for a private blog are not good for a corporate one). Yet the initial title stuck, and you can see it resonated well with the crowd.
TensorFlow (while a huge step on top of Theano) had issues with a strange API, mixing needlessly complex parts (even for the simplest layers) with magic-box-like optimization.
There was Keras, which I liked and used before it was cool (when it still supported the Theano backend), and it was the right decision for TF to incorporate it as the default API. But it was 1–2 years too late.
At the same time, I initially looked at PyTorch as some intern’s summer project porting from Lua to Python. I expected an imitation of the original Torch.
Yet the more it developed, the better it was, with (at least to my mind) the perfect level of abstraction. On the one hand, you can easily add two tensors, as if it were NumPy (and print its values in Python, which was impossible with TF at that time). On the other hand, you can wrap anything (from just a simple operation to a huge network) in an nn.Module. So it offered this natural hierarchical approach to deep learning. It offered building blocks that can be easily created, composed, debugged, and reused. It offered a natural way of picking the abstraction level you want to work with, so it worked well for industry and experimentation with novel architectures.
Actually, I opened “Teaching deep learning” and smiled as I saw how it evolved:
> There is a handful of popular deep learning libraries, including TensorFlow, Theano, Torch and Caffe. Each of them has Python interface (now also for Torch: PyTorch)
> [...]
> EDIT (July 2017): If you want a low-level framework, PyTorch may be the best way to start. It combines relatively brief and readable code (almost like Keras) but at the same time gives low-level access to all features (actually, more than TensorFlow).
> EDIT (June 2018): In Keras or PyTorch as your first deep learning framework I discuss pros and cons of starting learning deep learning with each of them.
TensorFlow (while a huge step on top of Theano) had issues with a strange API, mixing needlessly complex parts (even for the simplest layers) with magic-box-like optimization.
There was Keras, which I liked and used before it was cool (when it still supported the Theano backend), and it was the right decision for TF to incorporate it as the default API. But it was 1–2 years too late.
At the same time, I initially looked at PyTorch as some intern’s summer project porting from Lua to Python. I expected an imitation of the original Torch. Yet the more it developed, the better it was, with (at least to my mind) the perfect level of abstraction. On the one hand, you can easily add two tensors, as if it were NumPy (and print its values in Python, which was impossible with TF at that time). On the other hand, you can wrap anything (from just a simple operation to a huge network) in an nn.Module. So it offered this natural hierarchical approach to deep learning. It offered building blocks that can be easily created, composed, debugged, and reused. It offered a natural way of picking the abstraction level you want to work with, so it worked well for industry and experimentation with novel architectures.
So, while in 2016–2017 I was using Keras as the go-to for deep learning (https://p.migdal.pl/blog/2017/04/teaching-deep-learning/), in 2018 I saw the light of PyTorch and didn’t feel a need to look back. In 2019, even for the intro, I used PyTorch (https://github.com/stared/thinking-in-tensors-writing-in-pyt...).