Open access
Author
Date
2022
Type
Doctoral Thesis
ETH Bibliography
yes
Abstract
Computational efficiency is an essential factor that influences the applicability of computer vision algorithms. Although deep neural networks have achieved state-of-the-art performance in a variety of computer vision tasks, deep learning based solutions suffer from several efficiency-related problems. First, the overparameterization of deep neural networks results in models with millions of parameters, which lowers the parameter efficiency of the designed networks; a large device memory footprint is required to store the parameters and intermediate feature maps during computation. Second, the massive computation in deep neural networks slows down their training and inference, which limits their application to latency-demanding scenarios and low-end devices. Third, the massive computation consumes a significant amount of energy, which leaves deep learning models with a large carbon footprint.
The aim of this thesis is to improve the computational efficiency of current deep neural networks. The problem is tackled from three perspectives: neural network compression, neural architecture optimization, and computational procedure optimization.
In the first part of the thesis, we reduce the model complexity of neural networks with network compression techniques, namely filter decomposition and filter pruning. The basic assumption behind filter decomposition is that the ensemble of filters in deep neural networks constitutes an overcomplete set. Instead of using the original filters directly during computation, they can be approximated by linear combinations of a set of basis filters. The contribution of this thesis is to provide a unified analysis of previous filter decomposition methods. In addition, a differentiable filter pruning method is proposed. To achieve differentiability, the layers of the neural network are reparameterized by a meta network. Sparsity regularization is applied to the inputs of the meta network, i.e., the latent vectors. Optimizing with the introduced regularization leads to an automatic network pruning method. Additionally, a joint analysis of filter decomposition and filter pruning is presented from the perspective of compact tensor approximation. The hinge between the two techniques is the introduced sparsity-inducing matrix: by simply changing the way group sparsity regularization is enforced on this matrix, either technique can be derived. A sketch of the filter decomposition idea and the role of the sparsity-inducing matrix is given below.
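Below is a minimal, illustrative sketch of the filter decomposition idea in PyTorch (an assumption; the thesis does not prescribe this exact code). Each output filter is expressed as a linear combination of a smaller set of shared basis filters, realized as a basis convolution followed by a 1x1 combination layer; the weight of that 1x1 layer plays the role of the sparsity-inducing matrix on which group sparsity regularization can be enforced. The names BasisConv2d, group_sparsity, and num_basis are hypothetical.

```python
import torch
import torch.nn as nn

class BasisConv2d(nn.Module):
    """Approximate a convolution's filters by linear combinations of a
    smaller set of shared basis filters (illustrative sketch only)."""

    def __init__(self, in_ch, out_ch, kernel_size, num_basis):
        super().__init__()
        # Basis filters: num_basis is chosen smaller than out_ch, exploiting
        # the assumption that the original filter set is overcomplete.
        self.basis = nn.Conv2d(in_ch, num_basis, kernel_size,
                               padding=kernel_size // 2, bias=False)
        # 1x1 combination layer: each of the out_ch output filters is a
        # linear combination of the num_basis basis responses. Its weight
        # acts as the sparsity-inducing matrix; group sparsity over its
        # columns removes basis filters, over its rows removes output filters.
        self.combine = nn.Conv2d(num_basis, out_ch, kernel_size=1, bias=False)

    def forward(self, x):
        return self.combine(self.basis(x))

def group_sparsity(weight, dim=1):
    """l2,1-style regularizer: sum of l2 norms over groups (rows by default)
    of the flattened combination matrix."""
    w = weight.flatten(1)            # shape (out_ch, num_basis)
    return w.norm(dim=dim).sum()

if __name__ == "__main__":
    layer = BasisConv2d(in_ch=64, out_ch=128, kernel_size=3, num_basis=32)
    y = layer(torch.randn(1, 64, 56, 56))
    reg = group_sparsity(layer.combine.weight)   # added to the training loss
    print(y.shape, reg.item())
```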
In the second part of the thesis, we improve the performance of a baseline network with a fine-grained neural architecture optimization method. Unlike network compression methods, this method aims to improve the prediction accuracy of neural networks while reducing their model complexity at the same time; achieving both targets simultaneously makes the problem more challenging. In addition, a nearly cost-free constraint is enforced during the architecture optimization, which distinguishes it from current neural architecture search methods that require bulky computation. This can be regarded as another efficiency-improving technique.
In the third part of the thesis, we optimize the computational procedure of graph neural networks. By mathematically analyzing the operations in graph neural networks, two methods are proposed to improve computational efficiency. The first simplifies neighbor querying in graph neural networks, while the second reorders the graph feature gathering and feature extraction operations, as sketched below.
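A minimal sketch of the reordering idea, assuming a GCN-style layer whose feature extraction is a linear transform: because both neighbor aggregation and the transform are linear, they commute, so applying the transform before gathering yields the same output but makes the aggregation run on the narrower feature dimension. The function names and the dense adjacency matrix are illustrative assumptions, not the thesis's actual implementation.

```python
import torch

def gather_then_transform(adj, x, weight):
    # Aggregate neighbor features first, then apply the linear transform.
    # The (n x n) aggregation is paid on the wide d_in-dimensional features.
    return (adj @ x) @ weight

def transform_then_gather(adj, x, weight):
    # Apply the linear transform first, then aggregate.
    # Mathematically identical, but the aggregation now runs on the
    # narrower d_out-dimensional features, which is cheaper when d_out < d_in.
    return adj @ (x @ weight)

if __name__ == "__main__":
    n, d_in, d_out = 200, 64, 16
    adj = torch.rand(n, n, dtype=torch.float64)    # dense adjacency, for illustration only
    x = torch.randn(n, d_in, dtype=torch.float64)  # node features
    w = torch.randn(d_in, d_out, dtype=torch.float64)
    assert torch.allclose(gather_then_transform(adj, x, w),
                          transform_then_gather(adj, x, w))
```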
To summarize, this thesis contributes to multiple aspects of improving the computational efficiency of neural networks during the optimization, training, and test phases.
Permanent link
https://doi.org/10.3929/ethz-b-000540498
Publication status
published
External links
Search print copy at ETH Library
Contributors
Examiner: Van Gool, Luc
Examiner: Brox, Thomas
Examiner: Yang, Ming-Hsuan
Examiner: Timofte, Radu
Publisher
ETH Zurich
Subject
Deep neural networks (DNNs); Network pruning; Neural architecture search; Low-rank approximation; Graph Neural Networks (GNNs); Hypernetworks; Network Acceleration; Image classification; Image restoration; Point cloud processing
Organisational unit
02652 - Institut für Bildverarbeitung / Computer Vision Laboratory
03514 - Van Gool, Luc (emeritus) / Van Gool, Luc (emeritus)