These findings seem to be at odds. The former says that deep linear nets are useful, non-linear in their learning dynamics, and trainable with gradient descent. The latter says that the non-linearity exists only because of quirks in floating point, and that evolution strategies must be used to find extremely small activations that can exploit those floating-point non-linearities.
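The evolution strategies referenced there are gradient-free: sample random perturbations of the parameters, score each one, and move along a reward-weighted average of the noise. A minimal sketch of that idea on a toy objective (the function name and all hyperparameters here are illustrative, not from either source):

```python
import numpy as np

def es_optimize(f, theta, steps=300, sigma=0.1, lr=0.02, pop=50, seed=0):
    """Black-box maximization of f: sample Gaussian perturbations of theta,
    then step along a reward-weighted average of the noise (no gradients)."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        eps = rng.standard_normal((pop, theta.size))            # perturbations
        rewards = np.array([f(theta + sigma * e) for e in eps])
        rewards -= rewards.mean()                               # baseline to cut variance
        theta = theta + (lr / (pop * sigma)) * eps.T @ rewards  # ES "gradient" step
    return theta

# Toy check: climb -(x - 3)^2 toward its maximum at x = 3
print(es_optimize(lambda th: -(th[0] - 3.0) ** 2, np.array([0.0])))
```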
Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
"We attempt to bridge the gap between the theory and practice of deep learning by systematically analyzing learning dynamics for the restricted case of deep linear neural networks. Despite the linearity of their input-output map, such networks have nonlinear gradient descent dynamics on weights that change with the addition of each new hidden layer. We show that deep linear networks exhibit nonlinear learning phenomena similar to those seen in simulations of nonlinear networks, including long plateaus followed by rapid transitions to lower error solutions, and faster convergence from greedy unsupervised pretraining initial conditions than from random initial conditions."
"Neural networks consist of stacks of a linear layer followed by a nonlinearity like tanh or rectified linear unit. Without the nonlinearity, consecutive linear layers would be in theory mathematically equivalent to a single linear layer. So it’s a surprise that floating point arithmetic is nonlinear enough to yield trainable deep networks."
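Both claims in that quote are easy to check directly (a sketch; the matrix shapes and the specific subnormal constant are my choices, not from the post):

```python
import numpy as np

# Claim 1: two linear layers collapse to one, since W2 @ (W1 @ x) == (W2 @ W1) @ x.
rng = np.random.default_rng(0)
W1, W2 = rng.standard_normal((4, 3)), rng.standard_normal((2, 4))
x = rng.standard_normal(3)
assert np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x)

# Claim 2: float arithmetic itself is not exactly linear. The map
# f(x) = (x * 0.5) * 2.0 is the identity for normal doubles, but at the
# subnormal limit the first multiply underflows to zero.
f = lambda v: (v * 0.5) * 2.0
tiny = 5e-324            # smallest positive subnormal double (2**-1074)
print(f(1.0))            # 1.0
print(f(tiny))           # 0.0 -- the "linear" map is nonlinear down here
```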
The arxiv paper here is analyzing the nonlinearities in a network's learning dynamics, exploring why training time and error rates do not vary linearly over the course of training.
They note:
"Here we provide an exact analytical theory of learning in deep linear neural networks that quantitatively answers these questions for this restricted setting. Because of its linearity, the input-output map of a deep linear network can always be rewritten as a shallow network."
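The plateau-then-rapid-transition behavior the abstract describes shows up even in a toy three-layer linear "network" with scalar weights (a sketch; the initialization, learning rate, and target are illustrative):

```python
import numpy as np

# Gradient descent on y = w3 * w2 * w1 * x fitting scalar target t = 1.
# The input-output map is linear, but the loss is a degree-6 polynomial
# in the weights, so training sits on a long plateau from a small
# initialization and then drops rapidly to a low-error solution.
def train(init=0.1, t=1.0, lr=0.05, steps=2000):
    w = np.full(3, init)
    losses = []
    for _ in range(steps):
        prod = w.prod()
        err = prod - t
        losses.append(0.5 * err ** 2)
        grads = err * prod / w   # d(loss)/dw_i = err * product of the other weights
        w = w - lr * grads
    return np.array(losses)

losses = train()
print(losses[0], losses[100], losses[-1])  # near 0.5, still near 0.5, then ~0
```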
Latently (SUS17) also provides a more self-directed path to learning deep learning focused exclusively on implementing research papers and conducting original research: https://github.com/Latently/DeepLearningCertificate
The wording in the linked GitHub page makes it sound like you are looking for people that are already ML practitioners. Are you able to support developers with no ML background that are interested in making a career change?
Latently (SUS17) also provides a more self-directed path to learning deep learning focused exclusively on implementing research papers and conducting original research: https://github.com/Latently/DeepLearningCertificate
That seems interesting, but there are so many papers and little indication of which ones are more important (or which ones to implement first). I realize this is for advanced learners, but some guidelines, or at least a section pointing to survey papers would be really helpful as a starting point.
This is from 1999. Meanwhile, IARPA now conducts studies where intelligence analysts look at various *INT layers of data stacked on top of each other, and cognitive modelers try to recreate the biases they demonstrate in the lab. See:
The Neural Basis of Decision-Making During Sensemaking: Implications for Human-System Interaction
Someday perhaps YC's efforts at UBI will trickle down to HN, resulting in a fair platform that doesn't extremize inequality. In the meantime, us peons are left to hijack the threads of YC Partners.
Free access to ridiculous amounts of hardware and the opportunity to implement important and hot scientific papers and conduct original research in deep learning.