Effect of various kernels and feature selection methods on SVM performance for detecting email spams
Shrawan Kumar Trivedi
Published in Foundation of Computer Science (FCS)
Volume: 66
Issue: 21
Pages: 18 - 23

This research presents the effects of the interaction between various kernel functions and different feature selection techniques for improving the learning capability of a Support Vector Machine (SVM) in detecting email spam. Four SVM kernel functions, namely the “Normalised Polynomial Kernel (NP)”, “Polynomial Kernel (PK)”, “Radial Basis Function Kernel (RBF)”, and “Pearson VII Function-Based Universal Kernel (PUK)”, were combined with three feature selection techniques, namely “Gain Ratio (GR)”, “Chi-Squared (χ²)”, and “Latent Semantic Indexing (LSI)”, and tested on the “Enron Email Data Set”. The results show how the performance of the kernel functions varies with the number of features (or dimensions) in the data. NP performs best across a wide range of dimensionality for all the feature selection techniques tested. The PUK kernel works well with low-dimensional data and is the second best in performance (after NP), but performs poorly on high-dimensional data. LSI appears to be the best of the tested feature selection techniques. However, for high-dimensional data, all the feature selection techniques perform almost equally well.
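For readers unfamiliar with the four kernels named in the abstract, the following is a minimal sketch of their commonly cited textbook forms in plain Python. It is not the authors' code: the function names, the default parameter values (`degree`, `coef0`, `gamma`, `sigma`, `omega`), and the choice to include a constant term in the polynomial kernel are all assumptions made for illustration, and the paper's actual settings may differ.

```python
import math


def dot(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def squared_distance(x, y):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """PK: (x . y + coef0) ** degree. Parameter defaults are assumed."""
    return (dot(x, y) + coef0) ** degree


def normalized_polynomial_kernel(x, y, degree=2, coef0=1.0):
    """NP: the polynomial kernel rescaled so that K(x, x) == 1."""
    kxy = polynomial_kernel(x, y, degree, coef0)
    kxx = polynomial_kernel(x, x, degree, coef0)
    kyy = polynomial_kernel(y, y, degree, coef0)
    return kxy / math.sqrt(kxx * kyy)


def rbf_kernel(x, y, gamma=0.5):
    """RBF: exp(-gamma * ||x - y||^2). gamma default is assumed."""
    return math.exp(-gamma * squared_distance(x, y))


def puk_kernel(x, y, sigma=1.0, omega=1.0):
    """PUK: Pearson VII function of the Euclidean distance.

    sigma controls the peak width and omega the tailing behaviour;
    both defaults are illustrative assumptions.
    """
    dist = math.sqrt(squared_distance(x, y))
    scaled = 2.0 * dist * math.sqrt(2.0 ** (1.0 / omega) - 1.0) / sigma
    return 1.0 / (1.0 + scaled ** 2) ** omega


if __name__ == "__main__":
    a = [1.0, 2.0]
    b = [3.0, 4.0]
    for kernel in (polynomial_kernel, normalized_polynomial_kernel,
                   rbf_kernel, puk_kernel):
        print(kernel.__name__, kernel(a, b))
```

NP, RBF, and PUK all return 1 when both arguments are the same vector, so their outputs are bounded regardless of the number of features, whereas the plain polynomial kernel grows with the magnitude of the inner product.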

About the journal
Journal: International Journal of Computer Applications
Publisher: Foundation of Computer Science (FCS)
Open Access: Yes
Concepts (3)
  •  Support vector machine (SVM)
  •  Kernel functions
  •  Feature selection methods