Available online at http://www.academicjournals.org/AJBM
ISSN 1993-8233 ©2011 Academic Journals
Full Length Research Paper
A new corporate credit scoring system using semi-supervised discriminant analysis
Shian-Chang Huang
Department of Business Administration, National Changhua University of Education, No.2, Shi-Da Rd., Changhua,
Taiwan. E-mail: shhuang@cc.ncue.edu.tw. Tel: 886-4-7232105-7420.
Accepted 18 July, 2011
Corporate credit scoring is important for investors and banks in risk management. However, the high-dimensional data available from public financial statements make credit analysis difficult. To address this problem, dimensionality reduction is a key step for enhancing scoring accuracy. Using semi-supervised discriminant analysis (SSDA) and support vector machines (SVMs), this study develops a novel credit scoring system in which SSDA transforms the high-dimensional data space (over 50 financial variables) into a low-dimensional representative subspace with maximal discriminating power. Constructing the SVM classifier in the new space effectively reduces overfitting and enhances classification accuracy. Empirical results indicate that SSDA outperforms traditional dimensionality reduction schemes and significantly improves SVM performance. More importantly, the new classification system substantially outperforms conventional classifiers. The new decision support system can help corporate bond investors assess their risks accurately and substantially reduce their losses.
Key words: Semi-supervised discriminant analysis, dimensionality reduction, credit scoring, support vector machine, risk management.
INTRODUCTION
The financial crisis of 2007-2008 resulted mainly from credit risk. Corporate credit scoring is important in contemporary risk management because it determines the risk premiums demanded by corporate bond investors. The