A Novel Pruning Algorithm for Smoothing Feedforward Neural Networks Based on Group Lasso Method

Bibliographic Details
Published in IEEE Transactions on Neural Networks and Learning Systems Vol. 29, no. 5, pp. 2012-2024
Main Authors Wang, Jian; Xu, Chen; Yang, Xifeng; Zurada, Jacek M.
Format Journal Article
Language English
Published United States IEEE 01.05.2018
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 2162-237X
2162-2388
DOI 10.1109/TNNLS.2017.2748585

Summary: In this paper, we propose four new variants of the backpropagation algorithm to improve the generalization ability of feedforward neural networks. The basic idea of these methods stems from the Group Lasso concept, which deals with the variable selection problem at the group level. Directly employing the Group Lasso penalty during network training has two main drawbacks: numerical oscillations and theoretical challenges in computing the gradients at the origin. To overcome these obstacles, smoothing functions are introduced to approximate the Group Lasso penalty. Numerical experiments on classification and regression problems demonstrate that the proposed algorithms perform better than three classical penalization methods, Weight Decay, Weight Elimination, and Approximate Smoother, in both generalization and pruning efficiency. In addition, detailed simulations on a specific data set have been performed to compare with some other common pruning strategies, and they verify the advantages of the proposed algorithm. The pruning abilities of the proposed strategy have been investigated in detail on a relatively large data set, MNIST, for various smoothing approximation cases.
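
The difficulty at the origin mentioned in the summary can be illustrated with a minimal sketch. It makes several assumptions that are not stated in this record: each row of a weight matrix `W` is treated as one group, and the smoothing used is the common surrogate sqrt(||w||^2 + eps^2), which is not necessarily any of the smoothing functions proposed in the paper. The function names and the parameter `eps` are hypothetical.

```python
import numpy as np

def group_lasso_penalty(W):
    """Plain Group Lasso penalty: sum of Euclidean norms of the rows of W.

    Assumption: one row of W is one group. The norm is not differentiable
    when a whole row is exactly zero.
    """
    return np.sqrt((W ** 2).sum(axis=1)).sum()

def smoothed_penalty(W, eps=1e-3):
    """Illustrative smooth surrogate: sqrt(||w_g||^2 + eps^2) per group.

    Assumption: a generic smoothing, not necessarily the paper's choice.
    """
    return np.sqrt((W ** 2).sum(axis=1) + eps ** 2).sum()

def smoothed_penalty_grad(W, eps=1e-3):
    """Gradient of the surrogate: w_g / sqrt(||w_g||^2 + eps^2).

    The denominator is at least eps, so the gradient is defined
    even for an all-zero group.
    """
    norms = np.sqrt((W ** 2).sum(axis=1, keepdims=True) + eps ** 2)
    return W / norms

# Two groups: one active row and one row that is exactly zero.
W = np.array([[3.0, 4.0],
              [0.0, 0.0]])

penalty = group_lasso_penalty(W)      # 5.0 from the first row, 0.0 from the second
grad = smoothed_penalty_grad(W)       # finite everywhere, zero for the zero row
```

With the plain penalty, the gradient of the zero row would require dividing by a zero norm; the surrogate avoids this at the cost of a small bias controlled by `eps`.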