Header menu link for other important links
On high dimensional data spaces
, Roberts S.A.
Published in WITPress
Volume: 6
Pages: 239 - 250
Data mining applications usually encounter high dimensional data spaces. Most of these dimensions contain 'uninteresting' data, which would not only be of little value in terms of discovery of any rules or patterns, but have been shown to mislead some classification algorithms. Since, the computational effort increases very significantly (usually exponentially) in the presence of a large number of attributes, it is highly desirable that all irrelevant attributes be weeded out at an early stage. Often, patterns of interest are embedded in lower dimensional subspaces of data. If the data space S has k attributes ∈ a1,a2...ak, then a n-dimensional subspace sn of the data space S can be formed by selecting a combination of n attributes from the set a1,a 2...ak, where n ≤ k. It is usual to tackle this problem by getting some attributes and subspaces identified by the user (or domain experts). For even moderately large number of attributes, the number of possible subspaces is so large, that it is quite unlikely that the 'experts' would be able to identify all the 'interesting' subspaces.
About the journal
Open AccessNo