DSpace Repository

Discovering semantic features in the literature: a foundation for building functional associations

dc.creator Chagoyen, Mónica
dc.creator Carmona-Sáez, Pedro
dc.creator Shatkay, Hagit
dc.creator Carazo, José M.
dc.creator Pascual-Montano, Alberto
dc.date 2008-04-07T12:27:09Z
dc.date 2006-01-26
dc.date.accessioned 2017-01-31T01:01:36Z
dc.date.available 2017-01-31T01:01:36Z
dc.identifier BMC Bioinformatics 2006, 7:41
dc.identifier 1471-2105
dc.identifier http://hdl.handle.net/10261/3458
dc.identifier 10.1186/1471-2105-7-41
dc.identifier.uri http://dspace.mediu.edu.my:8181/xmlui/handle/10261/3458
dc.description This article is available from: http://www.biomedcentral.com/1471-2105/7/41
dc.description [Background] Experimental techniques such as DNA microarray, serial analysis of gene expression (SAGE) and mass spectrometry proteomics, among others, are generating large amounts of data related to genes and proteins at different levels. As in any other experimental approach, it is necessary to analyze these data in the context of previously known information about the biological entities under study. The literature is a particularly valuable source of information for experiment validation and interpretation. Therefore, the development of automated text mining tools to assist in such interpretation is one of the main challenges in current bioinformatics research.
dc.description [Results] We present a method to create literature profiles for large sets of genes or proteins based on common semantic features extracted from a corpus of relevant documents. These profiles can be used to establish pair-wise similarities among genes, to classify genes or proteins, or even in combination with experimental measurements. The semantic features also help researchers understand the commonalities indicated by experimental results. Our approach is based on non-negative matrix factorization (NMF), a machine-learning algorithm for data analysis that can identify local patterns characterizing a subset of the data. The literature is thus used to establish putative relationships among subsets of genes or proteins and to provide a coherent justification for clustering them into these subsets. We demonstrate the utility of the method by applying it to two independent and vastly different sets of genes.
dc.description [Conclusion] The presented method can create literature profiles from documents relevant to sets of genes. The representation of genes as additive linear combinations of semantic features allows for the exploration of functional associations as well as for clustering, suggesting a valuable methodology for the validation and interpretation of high-throughput experimental data.
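The idea of representing genes as additive combinations of semantic features can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes a toy gene-by-term count matrix with invented gene names and terms, uses the standard multiplicative-update form of NMF (which may differ from the variant used in the article), and compares genes by cosine similarity of their feature weights, which is one possible choice of pair-wise similarity.

```python
import numpy as np


def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (genes x terms) as W @ H.

    W (genes x k) holds each gene's weight on k semantic features;
    H (k x terms) holds each feature's weight on the vocabulary terms.
    Uses multiplicative updates that minimize the Frobenius error.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_terms = V.shape
    W = rng.random((n_genes, k)) + eps
    H = rng.random((k, n_terms)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def cosine_similarity(W, eps=1e-12):
    """Pair-wise cosine similarity between the rows of W."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    U = W / (norms + eps)
    return U @ U.T


# Hypothetical toy data: rows are genes, columns are term counts
# taken from documents assumed to be associated with each gene.
genes = ["geneA", "geneB", "geneC", "geneD"]
terms = ["kinase", "phosphorylation", "signal",
         "ribosome", "translation", "rrna"]
V = np.array([
    [5, 4, 3, 0, 0, 0],
    [4, 5, 2, 0, 0, 0],
    [0, 0, 0, 5, 4, 3],
    [0, 0, 0, 4, 5, 2],
], dtype=float)

W, H = nmf(V, k=2)           # literature profiles (W) and semantic features (H)
S = cosine_similarity(W)     # gene-gene similarity from the profiles

# List the highest-weighted terms of each semantic feature.
for f, row in enumerate(H):
    top = [terms[i] for i in np.argsort(row)[::-1][:3]]
    print(f"feature {f}: {top}")
print(np.round(S, 2))
```

Each row of W plays the role of a literature profile, and the highest-weighted terms in each row of H give a readable label for the corresponding semantic feature. The number of features k is a free parameter here and would need to be chosen for real data.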
dc.description This work has been partially funded by Santander-UCM (grant PR27/05-13964), Comunidad Autonoma de Madrid (grant CAM GR/SAL/0653/2004), Comision Interministerial de Ciencia y Tecnologia (grants CICYT BFU2004-00217/BMC and GEN2003-20235-c05-05) and a collaborative grant between the Spanish Research Council and the National Research Council of Canada (CSIC-050402040003). PCS is the recipient of a grant from Comunidad Autonoma de Madrid. APM acknowledges the support of the Spanish Ramón y Cajal program. HS is supported by the Canadian NSERC Discovery Grant 298292-04.
dc.description Peer reviewed
dc.format 1355397 bytes
dc.format 590431 bytes
dc.format 1105408 bytes
dc.format 31862 bytes
dc.format 24067 bytes
dc.format application/pdf
dc.format application/pdf
dc.format application/vnd.ms-excel
dc.format application/pdf
dc.format application/pdf
dc.language eng
dc.publisher BioMed Central
dc.relation Publisher’s version
dc.rights openAccess
dc.title Discovering semantic features in the literature: a foundation for building functional associations
dc.type Article