Básico sobre Deep Learning

Nessa entrevista do Arno Candel da H2o para o Kdnuggets, ele resume o que é o Deep Learning:

[…]Deep Learning methods use a composition of multiple non-linear transformations to model high-level abstractions in data. Multi-layer feed-forward artificial neural networks are some of the oldest and yet most useful such techniques. We are now reaping the benefits of over 60 years of evolution in Deep Learning that began in the late 1950s when the term Machine Learning was coined. Large parts of the growing success of Deep Learning in the past decade can be attributed to Moore’s law and the exponential speedup of computers, but there were also many algorithmic breakthroughs that enabled robust training of deep learners. Compared to more interpretable Machine Learning techniques such as tree-based methods, conventional Deep Learning (using stochastic gradient descentand back-propagation) is a rather “brute-force” method that optimizes lots of coefficients (it is a parametric method) starting from random noise by continuously looking at examples from the training data. It follows the basic idea of “(good) practice makes perfect” (similar to a real brain) without any strong guarantees on the quality of the model. […]

  Neste trecho ele fala de algumas aplicações de Deep Learning:

[…]Deep Learning is really effective at learning non-linear derived featuresfrom the raw input features, unlike standard Machine Learning methods such as linear or tree-based methods. For example, if age and income are the two features used to predict spending, then a linear model would greatly benefit from manually splitting age and income ranges into distinct groups; while a tree-based model would learn to automatically dissect the two-dimensional space. A Deep Learning model builds hierarchies of (hidden) derived non-linear features that get composed to approximate arbitrary functions such as sqrt((age-40)^2+0.3*log(income+1)-4) with much less effort than with other methods. Traditionally, data scientists perform many of these transformations explicitly based on domain knowledge and experience, but Deep Learning has been shown to be extremely effective at coming up with those transformations, often outperforming standard Machine Learning models by a substantial margin. Deep Learning is also very good at predicting high-cardinality class memberships, such as in image or voice recognition problems, or in predicting the best item to recommend to a user. Another strength of Deep Learning is that it can also be used for unsupervised learning where it just learns the intrinsic structure of the data without making predictions (remember the Google cat?). This is useful in cases where there are no training labels, or for various other use cases such as anomaly detection. […]