A low variance error boosting algorithm

Wang, Ching-Wei and Hunter, Andrew (2009) A low variance error boosting algorithm. Applied Intelligence, 33 (3). ISSN 0924-669X

Documents
fullyboosted.pdf - Whole Document (PDF, 731 kB)
Available under License Creative Commons Attribution Non-commercial.

Official URL: http://www.springerlink.com/content/58k87283g737mm...

Abstract

This paper introduces cw-AdaBoost, a robust variant of AdaBoost that uses weight perturbation to reduce variance error. It is particularly effective on data sets, such as microarray data, that have large numbers of features and small numbers of instances. The algorithm is compared with AdaBoost, Arcing and MultiBoost on twelve gene expression datasets using 10-fold cross-validation, and consistently achieves higher classification accuracy across all of these datasets. In contrast to other AdaBoost variants, the algorithm is not susceptible to problems when a zero-error base classifier is encountered.
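The abstract describes two ideas: perturbing instance weights between boosting rounds to reduce variance, and handling the case where a base classifier achieves zero training error (which makes standard AdaBoost's weight update degenerate). As a rough illustration only (the specific perturbation scheme of cw-AdaBoost is not given in this abstract, so the multiplicative noise below is a hypothetical stand-in), a minimal AdaBoost over decision stumps with perturbed weights and a zero-error guard might look like:

```python
import math
import random

def stump_train(X, y, w):
    """Find the weighted-error-minimizing threshold stump (err, feature, thresh, sign)."""
    best = None
    for f in range(len(X[0])):
        for t in sorted(set(x[f] for x in X)):
            for sign in (1, -1):
                pred = [sign if x[f] >= t else -sign for x in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, f, t, sign)
    return best

def stump_predict(model, x):
    _, f, t, sign = model
    return sign if x[f] >= t else -sign

def cw_adaboost(X, y, rounds=10, noise=0.1, seed=0):
    """Sketch of boosting with weight perturbation; labels y are in {-1, +1}.

    The multiplicative noise on the weights is an assumption for
    illustration, not the paper's actual perturbation scheme.
    """
    rng = random.Random(seed)
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Perturb the weights before training the base classifier.
        pw = [wi * (1.0 + noise * rng.uniform(-1.0, 1.0)) for wi in w]
        s = sum(pw)
        pw = [wi / s for wi in pw]
        err, f, t, sign = stump_train(X, y, pw)
        # Guard: clamp a zero-error base classifier instead of dividing by zero.
        err = max(err, 1e-10)
        if err >= 0.5:
            break
        alpha = 0.5 * math.log((1.0 - err) / err)
        model = (err, f, t, sign)
        ensemble.append((alpha, model))
        # Standard AdaBoost reweighting: raise weight of misclassified points.
        w = [wi * math.exp(-alpha * yi * stump_predict(model, xi))
             for wi, xi, yi in zip(w, X, y)]
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(a * stump_predict(m, x) for a, m in ensemble)
    return 1 if score >= 0 else -1
```

On a linearly separable toy set the first stump already has zero error; without the clamp the update would divide by zero, which is the failure mode the abstract says cw-AdaBoost avoids.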

Item Type: Article
Keywords: Machine Learning, Boosting, Ensemble, Gene Expression Data
Subjects: G Mathematical and Computer Sciences > G700 Artificial Intelligence
Divisions: College of Science > School of Computer Science
ID Code: 1842
Deposited By: Bev Jones
Deposited On: 16 Mar 2009 12:11
Last Modified: 13 Mar 2013 08:31
