Strongly Agree or Strongly Disagree?: Rating Features in Support Vector Machines

E. Carrizosa, A. Nogales-Gómez, D. Romero Morales

In linear classifiers, such as the Support Vector Machine (SVM), a score is associated with each feature and objects are assigned to classes based on the linear combination of the scores and the values of the features. The role each feature plays in the classifier is related to the magnitude of the corresponding score, while the sign gives information on how the feature points towards a given class. Inspired by discrete psychometric scales, which measure the extent to which a factor is in agreement with a statement, we propose the Discrete Level Support Vector Machine (DILSVM) where the feature scores can only take on a discrete number of levels. The DILSVM classifier benefits from interpretability as it can be seen as a collection of Likert scales, one for each feature, where we rate the level of agreement with the positive class. To build the DILSVM classifier, we propose a Mixed Integer Linear Programming formulation. Our computational results compare the SVM and the DILSVM classifiers using ten real-life datasets, showing that the 3-point and 5-point DILSVM classifiers have comparable accuracy to the SVM, with a substantial gain in interpretability and sparsity.

Key words: Support Vector Machines, Mixed Integer Linear Programming, Likert scale, interpretability, feature rating level
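
The scoring rule described in the abstract can be illustrated with a minimal sketch. It shows only how a classifier with discrete feature scores would assign a class, not how the scores are obtained, which the paper does with a Mixed Integer Linear Programming formulation. The integer levels -2 to 2, the Likert labels, the function names `classify` and `describe_scores`, the `bias` term, and the convention that a total of zero maps to the positive class are all assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical 5-point rating levels; the integer values and labels are
# illustrative assumptions, not the levels used in the paper.
LIKERT_5 = {
    -2: "strongly disagree",
    -1: "disagree",
    0: "neutral",
    1: "agree",
    2: "strongly agree",
}


def classify(scores, x, bias=0.0):
    """Return +1 or -1 from the sign of the linear combination of scores and features."""
    if len(scores) != len(x):
        raise ValueError("scores and x must have the same length")
    if any(s not in LIKERT_5 for s in scores):
        raise ValueError("every score must be one of the discrete levels")
    total = sum(s * v for s, v in zip(scores, x)) + bias
    # Assumption: a total of exactly zero is assigned to the positive class.
    return 1 if total >= 0 else -1


def describe_scores(feature_names, scores):
    """Map each feature to the Likert label of its score."""
    return {name: LIKERT_5[s] for name, s in zip(feature_names, scores)}


if __name__ == "__main__":
    names = ["f1", "f2", "f3"]   # hypothetical feature names
    scores = [2, 0, -1]          # a zero score means the feature is unused
    print(describe_scores(names, scores))
    print(classify(scores, [1.0, 5.0, 3.0]))
```

In this sketch a feature scored 0 drops out of the sum entirely, which is one way to read the sparsity gain mentioned in the abstract.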
