
Department of Computer Science


Redescription Mining: Problem, Algorithm and Applications

E. Galbrun, Dr. P. Miettinen (Helsinki Institute for Information Technology, Finland)

Redescription mining is a powerful data analysis tool for finding multiple descriptions of the same entities. Consider geographical regions as an example: they can be characterized by the fauna that inhabits them on the one hand and by their meteorological conditions on the other. Finding such redescriptions, a task known as niche-finding, is of great importance in biology.
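To make the idea concrete, here is a minimal sketch in Python. It assumes that the quality of a redescription is measured by the Jaccard similarity between the sets of entities matched by the two descriptions; the abstract itself does not name a measure, and the region names, species, and temperature values are invented toy data.

```python
def jaccard(a, b):
    """Jaccard similarity of two entity sets (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical toy data: four regions described in two different ways.
fauna = {"r1": {"moose"}, "r2": {"moose", "wolf"}, "r3": {"wolf"}, "r4": set()}
mean_january_temp = {"r1": -12.0, "r2": -9.5, "r3": 1.0, "r4": 4.0}

# Description 1: regions where moose are present.
support_fauna = {r for r, species in fauna.items() if "moose" in species}
# Description 2: regions with a cold January (the threshold is an assumption).
support_climate = {r for r, t in mean_january_temp.items() if t <= -5.0}

# The two descriptions form a good redescription if they match
# (almost) the same regions.
score = jaccard(support_fauna, support_climate)
print(score)
```

In this toy case both descriptions select the same two regions, so the pair "moose present" and "January mean at most -5.0" would count as a perfect redescription under the assumed measure.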

First, we will present our contribution on extending redescription mining to non-Boolean data. Previous redescription mining methods can handle only Boolean data. This restricts the range of possible applications or makes discretization a prerequisite, entailing a possibly harmful loss of information. In niche-finding, for example, the fauna can be naturally represented as Boolean presence/absence data, whereas the weather cannot. We propose a simple technique to perform on-the-fly discretization, enabling our algorithm to process real-valued variables.
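One way to picture on-the-fly discretization is sketched below. This is an illustrative assumption, not the procedure presented in the talk: instead of binning a real-valued variable in advance, the interval is chosen at the moment it is needed, so that the entities it selects agree as closely as possible with the entities matched by the Boolean side. The function name `best_interval`, the exhaustive search, the Jaccard score, and the toy data are all hypothetical.

```python
from itertools import combinations_with_replacement

def best_interval(values, target):
    """Search all closed intervals [lo, hi] whose bounds are observed values
    and return (score, lo, hi) for the interval whose support has the
    highest Jaccard similarity with `target`.

    values: dict mapping entity -> real value
    target: set of entities matched by the other (Boolean) description
    """
    cuts = sorted(set(values.values()))
    best = (0.0, None, None)
    # Sorted input guarantees lo <= hi for every generated pair.
    for lo, hi in combinations_with_replacement(cuts, 2):
        support = {e for e, v in values.items() if lo <= v <= hi}
        union = support | target
        score = len(support & target) / len(union) if union else 0.0
        if score > best[0]:
            best = (score, lo, hi)
    return best

# Hypothetical toy data: January mean temperature per region, and the
# regions where a given species is present.
mean_january_temp = {"r1": -12.0, "r2": -9.5, "r3": 1.0, "r4": 4.0}
species_present = {"r1", "r2"}

result = best_interval(mean_january_temp, species_present)
print(result)
```

The exhaustive search is quadratic in the number of distinct values and is meant only to show the principle; the interval bounds come from the data rather than from a fixed, pre-computed binning, which is what avoids the information loss mentioned above.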

Then, we will describe current work on finding redescriptions in relational data or graphs and finally outline directions for future research.