TY - GEN
T1 - Group Learning for High-Dimensional Sparse Data
AU - Cherkassky, Vladimir
AU - Chen, Hsiang Han
AU - Shiao, Han Tai
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/7
Y1 - 2019/7
N2 - We describe new methodology for supervised learning with sparse data, i.e., when the number of input features is (much) larger than the number of training samples (n). Under the proposed approach, all available (d) input features are split into several (t) subsets, effectively resulting in a larger number (t*n) of labeled training samples in lower-dimensional input space (of dimensionality d/t). This (modified) training data is then used to estimate a classifier for making predictions in lower-dimensional space. In this paper, standard SVM is used for training a classifier. During testing (prediction), a group of t predictions made by SVM classifier needs to be combined via intelligent post-processing rules, in order to make a prediction for a test input (in the original d-dimensional space). The novelty of our approach is in the design and empirical validation of these post-processing rules under Group Learning setting. We demonstrate that such post-processing rules effectively reflect general (common-sense) a priori knowledge (about application data). Specifically, we propose two different post-processing schemes and demonstrate their effectiveness for two real-life application domains, i.e., handwritten digit recognition and seizure prediction from iEEG signal.
AB - We describe new methodology for supervised learning with sparse data, i.e., when the number of input features is (much) larger than the number of training samples (n). Under the proposed approach, all available (d) input features are split into several (t) subsets, effectively resulting in a larger number (t*n) of labeled training samples in lower-dimensional input space (of dimensionality d/t). This (modified) training data is then used to estimate a classifier for making predictions in lower-dimensional space. In this paper, standard SVM is used for training a classifier. During testing (prediction), a group of t predictions made by SVM classifier needs to be combined via intelligent post-processing rules, in order to make a prediction for a test input (in the original d-dimensional space). The novelty of our approach is in the design and empirical validation of these post-processing rules under Group Learning setting. We demonstrate that such post-processing rules effectively reflect general (common-sense) a priori knowledge (about application data). Specifically, we propose two different post-processing schemes and demonstrate their effectiveness for two real-life application domains, i.e., handwritten digit recognition and seizure prediction from iEEG signal.
KW - Group Learning
KW - SVM
KW - binary classification
KW - digit recognition
KW - feature selection
KW - histogram of projections
KW - iEEG
KW - seizure prediction
KW - unbalanced data.
UR - http://www.scopus.com/inward/record.url?scp=85073220194&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85073220194&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2019.8852183
DO - 10.1109/IJCNN.2019.8852183
M3 - Conference contribution
AN - SCOPUS:85073220194
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2019 International Joint Conference on Neural Networks, IJCNN 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 International Joint Conference on Neural Networks, IJCNN 2019
Y2 - 14 July 2019 through 19 July 2019
ER -