This article is available from: http://www.biomedcentral.com/1471-2105/8/442
[Background] Classification procedures are widely used in phylogenetic inference, the analysis of
expression profiles, the study of biological networks, etc. Many algorithms have been proposed to
establish the similarity between two different classifications of the same elements. However,
methods to determine significant coincidences between hierarchical and non-hierarchical partitions
are still poorly developed, even though the search for such coincidences is implicit in
many analyses of large-scale data.
[Results] We describe a novel strategy to compare a hierarchical and a dichotomous non-hierarchical
classification of elements, in order to find clusters in a hierarchical tree in which elements of a given
"flat" partition are overrepresented. The key improvement of our strategy with respect to previous
methods is the use of permutation analyses of ranked clusters to determine whether regions of the
dendrograms present a significant enrichment. We show that this method is more sensitive than
previously developed strategies and how it can be applied to several real cases, including microarray
and interactome data. In particular, we use it to compare a hierarchical representation of the yeast
mitochondrial interactome and a catalogue of known mitochondrial protein complexes,
demonstrating a high level of congruence between those two classifications. We also discuss
extensions of this method to other cases which are conceptually related.
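
As a rough illustration of the general idea, and not the published implementation, the following Python sketch scores every cluster of a small tree for overrepresentation of a "flat" class and then calibrates the ranked scores against label permutations. Several choices here are assumptions made for the example: the tree is encoded as nested tuples, enrichment is scored with a one-sided hypergeometric tail, and the k-th best observed cluster is compared with the k-th best cluster of each permutation. The function names (clusters, hypergeom_tail, permutation_test) are hypothetical.

```python
import random
from math import comb

def clusters(tree):
    """Return the leaf set of every internal node of a nested-tuple tree."""
    found = []
    def walk(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.append(leaves)
        return leaves
    walk(tree)
    return found

def hypergeom_tail(k, n, K, N):
    """P(X >= k) when drawing n of N elements, K of which are marked."""
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / comb(N, n)

def ranked_pvalues(cluster_list, marked, n_total):
    """Enrichment p-value of each cluster, sorted from best to worst."""
    return sorted(
        hypergeom_tail(len(c & marked), len(c), len(marked), n_total)
        for c in cluster_list
    )

def permutation_test(tree, marked, n_perm=1000, seed=0):
    """Pair each ranked observed p-value with an empirical permutation p-value."""
    rng = random.Random(seed)
    cluster_list = clusters(tree)
    leaves = sorted(max(cluster_list, key=len))  # the root holds all leaves
    marked = frozenset(marked)
    observed = ranked_pvalues(cluster_list, marked, len(leaves))
    as_extreme = [0] * len(observed)
    for _ in range(n_perm):
        # Assumption: the null model reassigns the flat class at random.
        shuffled = frozenset(rng.sample(leaves, len(marked)))
        null = ranked_pvalues(cluster_list, shuffled, len(leaves))
        for rank, (obs, perm) in enumerate(zip(observed, null)):
            if perm <= obs:
                as_extreme[rank] += 1
    return [(obs, (count + 1) / (n_perm + 1))
            for obs, count in zip(observed, as_extreme)]

# Toy example: the class {a, b, c, d} fills one half of the tree.
toy_tree = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))
toy_result = permutation_test(toy_tree, {"a", "b", "c", "d"})
```

Each entry of toy_result pairs a cluster's nominal p-value with the fraction of permutations in which the cluster of the same rank scored at least as well. Comparing rank against rank is one way to account for the many nested, non-independent clusters tested in a single tree.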
[Conclusion] Our method is highly sensitive and outperforms previously described strategies. A
Perl script that implements it is available at http://www.uv.es/~genomica/treetracker.
Our group is supported by Grant SAF2006-08977 (Ministerio de Educación
y Ciencia [MEC], Spain). A.M. was the recipient of a predoctoral fellowship
from the MEC.
Peer reviewed