Hacker News | _cmfi's comments

10% is a pittance!


These findings seem to be at odds. The former says that deep linear nets are useful, non-linear, and trainable with gradient descent. The latter says that the non-linearity only exists due to quirks in floating point, and that evolutionary strategies must be used to find the extremely small activations that can exploit those floating-point non-linearities.

Exact solutions to the nonlinear dynamics of learning in deep linear neural networks

https://arxiv.org/abs/1312.6120

"We attempt to bridge the gap between the theory and practice of deep learning by systematically analyzing learning dynamics for the restricted case of deep linear neural networks. Despite the linearity of their input-output map, such networks have nonlinear gradient descent dynamics on weights that change with the addition of each new hidden layer. We show that deep linear networks exhibit nonlinear learning phenomena similar to those seen in simulations of nonlinear networks, including long plateaus followed by rapid transitions to lower error solutions, and faster convergence from greedy unsupervised pretraining initial conditions than from random initial conditions."

Nonlinear Computation in Deep Linear Networks

https://blog.openai.com/nonlinear-computation-in-linear-netw...

"Neural networks consist of stacks of a linear layer followed by a nonlinearity like tanh or rectified linear unit. Without the nonlinearity, consecutive linear layers would be in theory mathematically equivalent to a single linear layer. So it’s a surprise that floating point arithmetic is nonlinear enough to yield trainable deep networks."


The arXiv paper here is analyzing the nonlinearities in a network's learning dynamics, exploring why training time and error rates do not vary linearly throughout the training process.

They note: "Here we provide an exact analytical theory of learning in deep linear neural networks that quantitatively answers these questions for this restricted setting. Because of its linearity, the input-output map of a deep linear network can always be rewritten as a shallow network."
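
The "can always be rewritten as a shallow network" point is just associativity of matrix multiplication; here is a toy check (made-up shapes) showing that the layer-by-layer map and the collapsed single matrix agree up to rounding:

    import numpy as np

    # Sketch of the collapse argument: a deep linear net is a product of
    # weight matrices, so its input-output map equals one shallow layer.
    rng = np.random.default_rng(1)
    W1 = rng.standard_normal((8, 10))
    W2 = rng.standard_normal((6, 8))
    W3 = rng.standard_normal((4, 6))
    x = rng.standard_normal(10)

    deep = W3 @ (W2 @ (W1 @ x))       # apply layer by layer
    shallow = (W3 @ W2 @ W1) @ x      # pre-multiply into one matrix
    print(np.max(np.abs(deep - shallow)))   # ~1e-15: equal up to rounding

What does not collapse is the training process: gradient descent on the separate factors behaves very differently from gradient descent on the single collapsed matrix, and those dynamics are what the paper analyzes.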


Latently (SUS17) also provides a more self-directed path to learning deep learning focused exclusively on implementing research papers and conducting original research: https://github.com/Latently/DeepLearningCertificate


The wording on the linked GitHub page makes it sound like you are looking for people who are already ML practitioners. Are you able to support developers with no ML background who are interested in making a career change?


Definitely, we help folks pick papers that are appropriate for their skill level.


This seems very interesting, but I'm getting a 404 at this link.


Sounds interesting. What are the costs for the participants?


There are no costs.


It is used for that purpose as well.


Author here. The list is outdated, but that timestamp doesn't reflect the last edit, as the list is dynamically generated using Semantic MediaWiki.

Also - a lot of work went into this!


This is an excellent list! Thanks to all who contributed.

When was the last edit? Most of the tools are marked as last released on or before 2015. I am sure many have gotten updated.


Latently (SUS17) also provides a more self-directed path to learning deep learning focused exclusively on implementing research papers and conducting original research: https://github.com/Latently/DeepLearningCertificate


That seems interesting, but there are so many papers and little indication of which ones are more important (or which ones to implement first). I realize this is for advanced learners, but some guidelines, or at least a section pointing to survey papers would be really helpful as a starting point.


We have a bibliography - it is not yet organized very well but we are working on that: https://paperpile.com/shared/UhfbVO


This is from 1999. Meanwhile, IARPA now conducts studies where intelligence analysts look at various *INT layers of data stacked on top of each other, and cognitive modelers try to recreate the biases they demonstrate in the lab. See:

The Neural Basis of Decision-Making During Sensemaking: Implications for Human-System Interaction

https://www.researchgate.net/publication/278679336_The_Neura...


For anyone else stuck in a captcha redirect loop:

https://doi.org/10.1109/AERO.2015.7118968


See the Latently Deep Learning Certificate below. We onboard anyone who wants to try.


It's fine to tell people about your related project, but it's not fine to hijack the thread with comments about it, so please don't.

We detached this comment from https://news.ycombinator.com/item?id=14852088 and marked it off-topic.


Someday perhaps YC's efforts at UBI will trickle down to HN, resulting in a fair platform that doesn't extremize inequality. In the meantime, we peons are left to hijack the threads of YC Partners.


Every community has a set of rules you need to abide by; it's really not that complicated.


See also: Latently Deep Learning Certificate

https://github.com/Latently/DeepLearningCertificate

Free access to ridiculous amounts of hardware and the opportunity to implement important and hot scientific papers and conduct original research in deep learning.


There's a "submit" link - perhaps you should try that?


I have submitted it numerous times to no effect. This drove a bunch of traffic, so this seems to be the way to do it.


More info:

Latently, Bitfusion, and IBM Cloud enable democratized access to deep learning

https://www.ibm.com/blogs/bluemix/2017/07/latently-bitfusion...

