Poster Abstracts N

Poster Abstracts for Category N: Text mining

Poster N01
Association Rule Mining and Protein Databases: Exceptions Point to Annotation Errors
Irena I. Artamonova (1), Goar Frishman (1), Mikhail S. Gelfand (3), Dmitrij Frishman (1,2)
(1) Institute for Bioinformatics, GSF-National Research Center for Environment and Health; (2) Department of Genome Oriented Bioinformatics, Technische Universität München; (3) Institute of Information Transmission Problems RAS

Abstract:
The Association Rule Mining technique was applied to the identification of possible errors in protein annotations in the Swiss-Prot and PEDANT databases. Errors were identified as feature combinations that constitute exceptions to strong rules found by the algorithm. Suspicious features were marked for manual curation, and omitted features were added to the annotation. Tests demonstrated that most exceptions were indeed caused by annotation errors.
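The idea of treating exceptions to strong rules as candidate errors can be illustrated with a minimal sketch. It is not the authors' implementation: it mines only pairwise rules (one feature implies another) rather than running a full association rule miner, and the function name `find_rule_exceptions`, the thresholds, and the toy annotations are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def find_rule_exceptions(records, min_support=3, min_confidence=0.75):
    """Find strong pairwise rules "A -> B" and the records that violate them.

    records: dict mapping a record id to a set of annotation features.
    Returns a list of (antecedent, consequent, confidence, exception_ids).
    Illustrative simplification: only rules with a single antecedent.
    """
    item_count = Counter()
    pair_count = Counter()
    for features in records.values():
        item_count.update(features)
        # Count each unordered feature pair once per record.
        pair_count.update(combinations(sorted(features), 2))

    results = []
    for (a, b), both in pair_count.items():
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = both / item_count[antecedent]
            # Keep strong rules that still have at least one exception.
            if both >= min_support and min_confidence <= confidence < 1.0:
                exceptions = sorted(
                    rid
                    for rid, features in records.items()
                    if antecedent in features and consequent not in features
                )
                results.append((antecedent, consequent, confidence, exceptions))
    return results


# Toy annotations (hypothetical, not taken from Swiss-Prot or PEDANT).
toy_records = {
    "P1": {"kw:Kinase", "kw:ATP-binding"},
    "P2": {"kw:Kinase", "kw:ATP-binding"},
    "P3": {"kw:Kinase", "kw:ATP-binding"},
    "P4": {"kw:Kinase", "kw:ATP-binding"},
    "P5": {"kw:Kinase"},
    "P6": {"kw:ATP-binding", "kw:Transport"},
    "P7": {"kw:ATP-binding", "kw:Transport"},
}

flagged = find_rule_exceptions(toy_records)
for antecedent, consequent, confidence, exceptions in flagged:
    print(f"{antecedent} -> {consequent} (confidence {confidence:.2f}): check {exceptions}")
```

In this toy data the rule "kw:Kinase -> kw:ATP-binding" holds for four of five kinase records, so the remaining record is flagged as a possible omission to be reviewed by a curator.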

Contact: irena.artamonova [at] gsf.de

Keywords: Protein Annotation, Association Rule Mining

Poster N02
Finding Optimal Microarray Platform to Analyze High-Risk Pathologies
David Alarcon Gallego (1), Baldomero Oliva (2)
(1) Institut Municipal d'Investigacions Mediques (IMIM); (2) Universitat Pompeu Fabra (UPF)

Abstract:
The risk of a population suffering pathologies such as schizophrenia and hypertension is analysed by means of data mining and network analysis of the phenotype in microarrays. This work aims to determine the minimal and most significant microarray for detecting these pathologies. An initial set of the most important genes involved in these pathologies was obtained by data mining. Using PIANA, genes that interact with those in the initial set were added. The resulting set was used to determine the optimal platforms in the GEO database: GPL136 for schizophrenia and GPL549 for hypertension.
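The two steps described here, expanding a seed gene set with interaction partners and then choosing a platform for the expanded set, can be sketched as follows. The abstract does not state the selection criterion, so this sketch assumes one: rank platforms by the fraction of the gene set they cover and break ties in favour of the smaller array. The function names, gene identifiers, and platform contents are hypothetical, and the code does not call PIANA or GEO.

```python
def expand_with_interactors(seed_genes, interactions):
    """Add every gene that directly interacts with a seed gene.

    interactions: iterable of (gene_a, gene_b) pairs, treated as undirected.
    """
    seed = set(seed_genes)
    expanded = set(seed)
    for gene_a, gene_b in interactions:
        if gene_a in seed:
            expanded.add(gene_b)
        if gene_b in seed:
            expanded.add(gene_a)
    return expanded


def rank_platforms(gene_set, platforms):
    """Rank platforms by coverage of gene_set (assumed criterion).

    platforms: dict mapping a platform name to the set of genes it measures.
    Returns (name, coverage, platform_size) tuples, best first. Ties in
    coverage are broken by the smaller platform, then by name.
    """
    scored = []
    for name, measured in platforms.items():
        coverage = len(gene_set & measured) / len(gene_set)
        scored.append((name, coverage, len(measured)))
    return sorted(scored, key=lambda s: (-s[1], s[2], s[0]))


# Toy data (hypothetical identifiers, not real GEO platform contents).
seed = {"G1", "G2"}
toy_interactions = [("G1", "G3"), ("G4", "G2"), ("G5", "G6")]
toy_platforms = {
    "array_A": {"G1", "G2", "G3", "G4", "G7", "G8"},
    "array_B": {"G1", "G2", "G3", "G4"},
    "array_C": {"G1", "G5"},
}

gene_set = expand_with_interactors(seed, toy_interactions)
ranking = rank_platforms(gene_set, toy_platforms)
best_platform = ranking[0][0]
print(sorted(gene_set), best_platform)
```

Here both `array_A` and `array_B` cover the whole expanded set, and the tie-break selects `array_B` because it is the smaller of the two.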

Contact: boliva [at] imim.es

Keywords: Microarray, Schizophrenia, Hypertension, GEO

Poster N04
Cross-Discipline Knowledge Integration
Evangelos Pafilis
European Molecular Biology Laboratory, EMBL

Abstract:
The increasing number of publications and the limited scope of literature that scientists follow result in relevant and potentially important information being lost. By augmenting text mining with biological database information and providing it with web-based corpora originating from biology, chemistry and medicine, we aim to deliver knowledge associations that would not otherwise be visible. To deal with the variety and heterogeneity of data types and formats, we are exposing the software components as services and managing them via workflow engines.

Contact: pafilis [at] embl.de

Keywords: Text Mining, Web Services, Data Integration