Computer Science, Nehru Memorial College, Puthanampatti
Feature selection is a pre-processing technique that eliminates irrelevant and redundant features in order to improve classifier performance. When a dataset contains many irrelevant and redundant features, they do not improve accuracy and they degrade the performance of the classifiers. To address this, this paper presents a new hybrid feature selection method based on information gain and symmetric uncertainty. The proposed method uses median-based discretization to convert quantitative features into qualitative ones, information gain to identify the relevant features, and symmetric uncertainty to remove the redundant features. Because the proposed method performs both relevance and redundancy analysis, the predictive accuracy of the Naive Bayesian classifier is improved. Further, the efficiency and effectiveness of the proposed method are analyzed by comparing it with existing methods on high-dimensional real-world datasets.
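The three stages named in the abstract (median-based discretization, relevance ranking by information gain, redundancy removal by symmetric uncertainty) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the two thresholds `ig_threshold` and `su_threshold`, the binary split at the median, and the greedy redundancy filter are all assumptions made for illustration, and the toy data is invented solely to exercise the code.

```python
from collections import Counter
from math import log2
from statistics import median


def median_discretize(values):
    """Map each numeric value to 0 (<= median) or 1 (> median)."""
    m = median(values)
    return [1 if v > m else 0 for v in values]


def entropy(xs):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())


def information_gain(xs, ys):
    """IG(Y; X) = H(Y) - H(Y | X) for two discrete sequences."""
    n = len(xs)
    conditional = 0.0
    for x_val, count in Counter(xs).items():
        subset = [y for x, y in zip(xs, ys) if x == x_val]
        conditional += (count / n) * entropy(subset)
    return entropy(ys) - conditional


def symmetric_uncertainty(xs, ys):
    """SU(X, Y) = 2 * IG(Y; X) / (H(X) + H(Y)), in the range [0, 1]."""
    total = entropy(xs) + entropy(ys)
    if total == 0:
        return 0.0
    return 2.0 * information_gain(xs, ys) / total


def select_features(columns, labels, ig_threshold=0.0, su_threshold=0.9):
    """Assumed pipeline: discretize, keep features whose information gain
    exceeds ig_threshold, then greedily drop any feature whose symmetric
    uncertainty with an already selected feature reaches su_threshold."""
    disc = {name: median_discretize(col) for name, col in columns.items()}
    gains = {name: information_gain(col, labels) for name, col in disc.items()}
    ranked = sorted(
        (name for name in gains if gains[name] > ig_threshold),
        key=lambda name: gains[name],
        reverse=True,
    )
    selected = []
    for name in ranked:
        if all(symmetric_uncertainty(disc[name], disc[s]) < su_threshold
               for s in selected):
            selected.append(name)
    return selected


# Invented toy data: f2 is a rescaled copy of f1 (redundant),
# f3 is unrelated to the class (irrelevant), f4 is weakly relevant.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "f1": [1, 2, 3, 4, 5, 6, 7, 8],
    "f2": [2, 4, 6, 8, 10, 12, 14, 16],
    "f3": [1, 8, 2, 7, 3, 6, 4, 5],
    "f4": [1, 2, 3, 5, 4, 6, 7, 8],
}
selected = select_features(columns, labels, ig_threshold=0.1)
print(selected)
```

In this sketch the relevance step and the redundancy step are kept separate so that each threshold can be tuned on its own; the selected feature names could then be passed to any classifier, such as a Naive Bayes model.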