Failures in the Deep Learning Approach: Architectures and Meta-Parameterization

![](http://flavioclesio.com/wp-content/uploads/2017/03/arc.jpg)

The biggest challenge the industry currently faces with Deep Learning is undoubtedly computational: the entire market is absorbing cloud services for increasingly complex calculations and investing in GPU computing power. However, even with hardware now being a commodity, academia is tackling a problem whose solution could revolutionize how Deep Learning is done: the architectural/parameterization aspect. A comment from a discussion thread sums the problem up well, where the user states:

“The main problem I see with Deep Learning: too many parameters.

When you have to find the best value for the parameters, that’s a gradient search by itself. The curse of meta-dimensionality.”
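The combinatorial blow-up the commenter describes can be made concrete with a toy example. The grid below is purely hypothetical (the knobs and ranges are illustrative, not taken from the thread), but it shows how even six modest hyperparameter choices already imply thousands of full training runs:

```python
# Toy illustration of the "curse of meta-dimensionality": a modest
# hyperparameter grid for a deep network explodes combinatorially.
# All names and ranges below are hypothetical, for illustration only.
from itertools import product

grid = {
    "learning_rate": [1e-1, 1e-2, 1e-3, 1e-4],
    "layers": [2, 4, 8, 16],
    "units_per_layer": [64, 128, 256, 512],
    "activation": ["relu", "tanh", "sigmoid"],
    "batch_size": [32, 64, 128],
    "dropout": [0.0, 0.25, 0.5],
}

# Every combination is one candidate architecture/parameterization,
# i.e., one full training run in an exhaustive grid search.
configs = list(product(*grid.values()))
print(len(configs))  # 4 * 4 * 4 * 3 * 3 * 3 = 1728
```

Exhaustively searching this meta-space is itself an optimization problem, which is exactly the commenter's point.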

In other words, even with all the available hardware, the question of *what is the best architectural arrangement for a deep neural network?* remains unresolved. The paper by Shai Shalev-Shwartz, Ohad Shamir, and Shaked Shammah called "Failures of Deep Learning" exposes this problem in rich detail, including experiments (this is the Git repository). The authors identify four failure points of Deep Learning networks: a) failures of gradient-based methods for parameter optimization, b) structural problems in Deep Learning algorithms when decomposing the learning problem, c) architecture, and d) saturation of activation functions.

In other words, what may be happening in a large share of Deep Learning applications is that convergence time could be far shorter if these aspects were already solved. With them resolved, much of what we know today as the hardware industry for Deep Learning networks would either be heavily underutilized (given the gains from architectural/algorithmic optimization) or could be freed up for more complex tasks (e.g., image recognition with a low number of pixels). So, even adopting a hardware-based methodology as the industry has been doing, there is still plenty of room to optimize Deep Learning networks from an architectural and algorithmic perspective.

Below is a list of references taken directly from Stack Exchange for those who want to delve deeper into the subject:

Neuro-Evolutionary Algorithms:

Reinforcement Learning:

Miscellaneous:
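On failure point d) above, saturation of activation functions, a minimal sketch shows the mechanism: the sigmoid's derivative is σ'(x) = σ(x)·(1 − σ(x)), which collapses toward zero for large |x|, so gradient-based updates barely move saturated units:

```python
# Minimal sketch of activation saturation with the sigmoid function.
# For large |x| the derivative sigma'(x) = sigma(x) * (1 - sigma(x))
# vanishes, so backpropagated gradients through saturated units stall.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

# The derivative peaks at x = 0 and decays rapidly as the unit saturates.
for x in [0.0, 2.0, 5.0, 10.0]:
    print(f"x = {x:5.1f}  sigmoid'(x) = {sigmoid_grad(x):.6f}")
```

This is one reason non-saturating activations such as ReLU became the default choice in deep architectures.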

P.S.: WordPress removed the justified-text option, so apologies in advance for the amateur appearance of the blog in the coming days.