This article is available from: http://www.biomedcentral.com/1471-2148/7/53
[Background] It has been shown in a variety of organisms, including mammals, that genes that
appeared recently in evolution, for example orphan genes, evolve faster than older genes. Low
functional constraints at the time of origin of novel genes may explain these results. However, this
observation has recently been attributed to an artifact caused by the inability of Blast to detect the
fastest-evolving genes in different eukaryotic genomes. Distinguishing between these two possible
explanations would be of great importance for any studies dealing with the taxon distribution of
proteins and the origin of novel genes.
[Results] Here we used simulations of protein sequences to examine the capacity of Blast to detect
proteins of diverse evolutionary rates in the different species of a eukaryotic phylogenetic tree
that included metazoans, fungi and plants. We simulated the evolution of protein-coding genes with the
same evolutionary rates as those observed in functional mammalian genes and with among-site
rate heterogeneity. Under these conditions, we found that only a very small percentage of
simulated ancestral eukaryotic proteins was affected by the Blast artifact. We show that the good
detectability of Blast is due to the heterogeneity of protein evolutionary rates at different sites,
since only a small conserved motif in a sequence suffices to detect its homologues. Our results
indicate that Blast, at least when applied within eukaryotes, misses only the homologues of extremely
fast-evolving sequences, which are rare in the mammalian genome, and those of pseudogenes or of
sequences that evolve homogeneously across sites.
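The effect of among-site rate heterogeneity described above can be illustrated with a toy simulation. This is a minimal sketch and not the simulation procedure used in this study: it assumes a simple Poisson-type substitution model with equal amino acid frequencies, gamma-distributed site rates with mean 1, and independent sites. The function names, the shape parameter `alpha` and the distance value are hypothetical choices made for illustration. Because rates are drawn independently per site, conserved positions are scattered here rather than clustered into a motif, and no Blast search is run.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def site_rates(n_sites, alpha, rng):
    """Return one relative rate per site.

    alpha=None gives homogeneous rates (all 1.0); otherwise rates are
    drawn from a gamma distribution with shape alpha and mean 1.
    """
    if alpha is None:
        return [1.0] * n_sites
    # gammavariate(shape, scale) has mean shape * scale, so scale = 1/alpha.
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]


def evolve(seq, rates, distance, rng):
    """Evolve seq over `distance` expected events per site (toy model).

    A site with rate r is hit by at least one event with probability
    1 - exp(-r * distance); a hit site is redrawn uniformly from the
    20 amino acids, so it may by chance keep its original residue.
    """
    out = []
    for residue, rate in zip(seq, rates):
        if rng.random() < 1.0 - math.exp(-rate * distance):
            out.append(rng.choice(AMINO_ACIDS))
        else:
            out.append(residue)
    return "".join(out)


def identity(a, b):
    """Fraction of aligned positions at which a and b are identical."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    rng = random.Random(1)
    n = 2000
    ancestor = "".join(rng.choice(AMINO_ACIDS) for _ in range(n))
    for label, alpha in (("homogeneous", None), ("heterogeneous", 0.5)):
        rates = site_rates(n, alpha, rng)
        descendant = evolve(ancestor, rates, 2.0, rng)
        print(label, round(identity(ancestor, descendant), 3))
```

In this sketch the two sequences have the same mean rate, but the heterogeneous one retains more unchanged positions at the same distance, because its slowly evolving sites are rarely hit. This is consistent with the idea that a few slowly evolving positions can preserve a detectable signal.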
[Conclusion] Although great care should be exercised in the recognition of remote homologues,
most functional mammalian genes can be detected in eukaryotic genomes by Blast. That is, the
majority of functional mammalian genes do not evolve so fast that they would go undetected in other
metazoans, fungi or plants, had they been present in these organisms. Thus, the correlation
previously found between age and rate does not seem to be due purely to a Blast artifact, at least for
mammals. This may have important implications for understanding the mechanisms by which novel
genes originate.
M. M. A. and J. C. are supported by grant numbers BIO2002-04426-C02-01
and BIO2002-04426-C02-02, respectively, from the Plan Nacional de Investigación
Científica, Desarrollo e Innovación Tecnológica (I+D+I) of the
MEC, cofinanced with FEDER funds.
Peer reviewed