Stanford InfoLab Publication Server

Toward Optimal Feature Selection

Koller, Daphne and Sahami, Mehran (1996) Toward Optimal Feature Selection. Technical Report. Stanford InfoLab.




In this paper, we examine a method for feature subset selection based on information theory. We first present a framework for defining the theoretically optimal, but computationally intractable, method for feature subset selection. We show that the goal should be to eliminate a feature if it gives little or no additional information beyond that subsumed by the remaining features; in particular, this is the case for both irrelevant and redundant features. We then give an efficient algorithm for feature selection that computes an approximation to the optimal feature selection criterion, and we examine the conditions under which the approximate algorithm is successful. Empirical results on a number of datasets show that the algorithm effectively handles datasets with a very large number of features.
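
As a rough illustration of the elimination criterion described above, the following Python sketch scores each feature by its empirical conditional mutual information with the class, given all other remaining features, and greedily drops the feature with the lowest score while that score is near zero. This is not the report's algorithm. The choice of conditional mutual information as the measure, the function names, the threshold, and the toy data are all assumptions made for illustration. Conditioning on every remaining feature is also the kind of exact computation that becomes impractical as the number of features grows.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(rows, labels, feature, given):
    """Empirical I(label; feature | given) in bits, for discrete data.

    rows    -- list of tuples of discrete feature values
    labels  -- list of class labels, one per row
    feature -- index of the feature being scored
    given   -- indices of the features being conditioned on
    """
    n = len(rows)
    joint = Counter()  # counts of (z, x, y) triples
    for row, y in zip(rows, labels):
        z = tuple(row[j] for j in given)
        joint[(z, row[feature], y)] += 1

    # Marginal counts needed for the conditional mutual information.
    z_counts, zx_counts, zy_counts = Counter(), Counter(), Counter()
    for (z, x, y), c in joint.items():
        z_counts[z] += c
        zx_counts[(z, x)] += c
        zy_counts[(z, y)] += c

    # I(Y; X | Z) = sum p(x,y,z) * log[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ]
    cmi = 0.0
    for (z, x, y), c in joint.items():
        ratio = c * z_counts[z] / (zx_counts[(z, x)] * zy_counts[(z, y)])
        cmi += (c / n) * log2(ratio)
    return cmi


def backward_eliminate(rows, labels, threshold=1e-9):
    """Greedily drop features that add (almost) no information about the
    class beyond what the other remaining features already provide.
    The threshold is an illustrative assumption."""
    remaining = list(range(len(rows[0])))
    while len(remaining) > 1:
        scores = {
            f: conditional_mutual_information(
                rows, labels, f, [g for g in remaining if g != f]
            )
            for f in remaining
        }
        weakest = min(scores, key=scores.get)
        if scores[weakest] > threshold:
            break  # every remaining feature still carries information
        remaining.remove(weakest)
    return remaining


# Toy data (hypothetical): feature 0 determines the class, feature 1 is an
# exact copy of feature 0 (redundant), feature 2 is unrelated (irrelevant).
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
labels = [0, 0, 1, 1]
selected = backward_eliminate(rows, labels)
```

On this toy data the sketch keeps a single copy of the duplicated feature and drops both its duplicate and the unrelated feature, which matches the abstract's point that irrelevant and redundant features alike contribute no additional information.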

Item Type: Techreport (Technical Report)
Additional Information: Previous number = SIDL-WP-1996-0032
Subjects: Computer Science > Digital Libraries
Projects: Digital Libraries
ID Code: 208
Deposited By: Import Account
Deposited On: 28 Oct 2001 16:00
Last Modified: 09 Dec 2008 08:40
