Progressive Neural Architecture Search

Abstract

We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms. Our approach uses a sequential model-based optimization (SMBO) strategy, in which we search for structures in order of increasing complexity, while simultaneously learning a surrogate model to guide the search through structure space. Direct comparison under the same search space shows that our method is up to 5 times more efficient than the RL method of Zoph et al. (2018) in terms of the number of models evaluated, and 8 times faster in terms of total compute. The structures we discover in this way achieve state-of-the-art classification accuracies on CIFAR-10 and ImageNet.
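To make the SMBO strategy concrete, below is a minimal sketch of a progressive search loop of this kind: cells grow by one operation per level, a surrogate ranks the expanded candidates, and only the top K are actually trained. Every name here (Surrogate, expand, train_and_evaluate, B, K) is an illustrative assumption, not the paper's implementation; in particular, the toy surrogate and the stand-in training routine are placeholders for the learned predictor and real training.

```python
# Sketch of progressive (SMBO-style) architecture search.
# All names (Surrogate, expand, train_and_evaluate, B, K) are
# illustrative assumptions, not the authors' actual implementation.

import random


class Surrogate:
    """Toy surrogate that predicts an architecture's accuracy.

    A real predictor would be a model over the cell encoding; here we
    memorize (architecture, accuracy) pairs and score an unseen
    architecture by the mean accuracy of its observed prefixes.
    """

    def __init__(self):
        self.history = []

    def fit(self, archs, accs):
        self.history.extend(zip(archs, accs))

    def predict(self, arch):
        # Score by agreement with previously evaluated prefixes.
        scores = [acc for seen, acc in self.history
                  if seen == arch[:len(seen)]]
        return sum(scores) / len(scores) if scores else random.random()


def expand(arch, ops=("conv3x3", "conv5x5", "maxpool")):
    """Generate all one-step extensions of a cell (one extra op)."""
    return [arch + (op,) for op in ops]


def train_and_evaluate(arch):
    """Stand-in for actually training the model; returns a fake accuracy."""
    return random.random()


def progressive_search(B=3, K=4):
    surrogate = Surrogate()
    # Level b=1: enumerate and fully evaluate all minimal cells.
    beam = expand(())
    accs = [train_and_evaluate(a) for a in beam]
    surrogate.fit(beam, accs)
    for b in range(2, B + 1):
        # Expand every survivor by one op, then let the surrogate
        # rank the candidates so we only train the top K of them.
        candidates = [c for a in beam for c in expand(a)]
        candidates.sort(key=surrogate.predict, reverse=True)
        beam = candidates[:K]
        accs = [train_and_evaluate(a) for a in beam]
        surrogate.fit(beam, accs)  # refine the predictor with new data
    return max(zip(accs, beam))


if __name__ == "__main__":
    best_acc, best_arch = progressive_search()
    print(best_acc, best_arch)
```

The efficiency gain comes from the ranking step: at each level only K of the expanded candidates are trained, while the surrogate, which is cheap to query, filters the rest.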

Conclusions

The main contribution of this work is to show how we can accelerate the search for good CNN structures by using progressive search through the space of increasingly complex graphs, combined with a learned prediction function to efficiently identify the most promising models to explore. The resulting models achieve the same level of performance as previous work, but at a fraction of the computational cost. There are many possible directions for future work, including:

- the use of better surrogate predictors, such as Gaussian processes with string kernels;
- the use of model-based early stopping, such as [3], so we can stop the training of “unpromising” models before reaching E_1 epochs;
- the use of “warm starting”, to initialize the training of a larger (b+1)-sized model from its smaller parent;
- the use of Bayesian optimization, in which an acquisition function, such as expected improvement or upper confidence bound, ranks the candidate models rather than greedily picking the top K (see, e.g., [31,30]; a sketch follows below);
- adaptively varying the number of models K evaluated at each step (e.g., reducing it over time);
- the automatic exploration of speed-accuracy tradeoffs (cf. [11]).
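As one example of the Bayesian-optimization variant mentioned above, the sketch below ranks candidates with an upper-confidence-bound acquisition instead of the greedy top-K rule. The `ensemble` argument and its `.predict(arch)` interface are hypothetical, standing in for any set of surrogate predictors whose disagreement approximates predictive uncertainty.

```python
import math


def ucb_rank(candidates, ensemble, beta=2.0, K=10):
    """Rank candidates by UCB = mean + beta * std over an ensemble of
    surrogate predictors, instead of sorting by the mean alone.

    `candidates` is a list of architectures; `ensemble` is a list of
    predictors, each with a .predict(arch) -> float method (a
    hypothetical interface, e.g. a bootstrap ensemble of predictors).
    """
    def ucb(arch):
        preds = [m.predict(arch) for m in ensemble]
        mean = sum(preds) / len(preds)
        var = sum((p - mean) ** 2 for p in preds) / len(preds)
        # The variance term adds an optimism bonus for uncertain
        # architectures, trading off exploration against exploitation.
        return mean + beta * math.sqrt(var)

    return sorted(candidates, key=ucb, reverse=True)[:K]
```

Compared with greedy top-K selection, such an acquisition function deliberately spends some of the K training slots on architectures the surrogate is unsure about, which can improve the predictor faster at the cost of occasionally training weaker models.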