UIMA in the Biocuration Workflow: A coherent framework for cooperation between biologists and computational linguists
- Barcelona Media Innovation Centre
- Research Unit on Biomedical Informatics, Universitat Pompeu Fabra
- Document Type:
- Poster
- Date:
- Received 24 April 2009 16:05 UTC; Posted 24 April 2009
- Subjects:
- Bioinformatics
- Abstract:
As collaborating partners, Barcelona Media Innovation Centre and GRIB (Universitat Pompeu Fabra) seek to combine strengths from Computational Linguistics and Biomedicine to produce a robust Text Mining system to generate data that will help biocurators in their daily work. The first version of this system will focus on the discovery of relationships between genes, SNPs (Single Nucleotide Polymorphisms) and diseases from the literature.
A first challenge we faced while setting up this project is that most current tools supporting the curation workflow are complex, ad hoc applications, which can hinder interoperability and the sharing of results between research groups from different and unrelated fields of expertise. Often, biologists (even computer-savvy ones) are hard pressed to use and adapt sophisticated Natural Language Processing systems, while computational linguists are challenged by the intricacies of biology when applying their processing pipelines to elicit knowledge from texts. The flow of knowledge needed to develop a usable, practical tool, to and from the parties involved in developing such systems, is not always easy or straightforward.
The modular and versatile architecture of UIMA (Unstructured Information Management Architecture) provides a framework to address these challenges. UIMA is a component architecture and software framework implementation (including a UIMA SDK) for developing applications that analyse large volumes of unstructured information. It has been increasingly adopted by a significant part of the BioNLP community, which needs robust, industrial-grade applications to exploit the whole bibliome. Using UIMA to develop Text Mining applications for curation purposes allows the combination of diverse expertise that goes beyond the individual know-how of biologists, computer scientists or linguists in isolation. Good synergy and circulation of knowledge between these experts is fundamental to the development of a successful curation tool.
- Collection:
- 3rd International Biocuration Conference
- Presented at:
- 3rd International Biocuration Conference, 16 April 2009
Additional information
- License:
- This document is licensed to the public under the Creative Commons Attribution 3.0 License
- How to cite this document:
- Mellebeek, Bart, Rodriguez-Penagos, Carlos, and Furlong, Laura Ines. UIMA in the Biocuration Workflow: A coherent framework for cooperation between biologists and computational linguists. Available from Nature Precedings <http://dx.doi.org/10.1038/npre.2009.3171.1> (2009)