doi:10.1038/npre.2009.3171.1
UIMA in the Biocuration Workflow: A coherent framework for cooperation between biologists and computational linguists

Bart Mellebeek1, Carlos Rodriguez-Penagos1 & Laura Ines Furlong2

  1. Barcelona Media Innovation Centre
  2. Research Unit on Biomedical Informatics, Universitat Pompeu Fabra

Document Type: Poster
Date: Received 24 April 2009 16:05 UTC; Posted 24 April 2009
Subjects: Bioinformatics
Abstract:

As collaborating partners, Barcelona Media Innovation Centre and GRIB (Universitat Pompeu Fabra) seek to combine strengths from Computational Linguistics and Biomedicine to produce a robust Text Mining system that generates data to help biocurators in their daily work. The first version of this system will focus on discovering relationships between genes, SNPs (Single Nucleotide Polymorphisms) and diseases in the literature.

A first challenge we faced while setting up this project is that most current tools supporting the curation workflow are complex, ad hoc applications, which can hinder interoperability and the sharing of results between research groups from different, unrelated fields of expertise. Often, biologists (even computer-savvy ones) are hard pressed to use and adapt sophisticated Natural Language Processing systems, while computational linguists are challenged by the intricacies of biology when applying their processing pipelines to extract knowledge from texts. The flow of knowledge needed to develop a usable, practical tool, to and from the parties involved in developing such systems, is not always easy or straightforward.

The modular and versatile architecture of UIMA (Unstructured Information Management Architecture) provides a framework to address these challenges. UIMA is a component architecture and software framework implementation (including a UIMA SDK) for developing applications that analyse large volumes of unstructured information. It has been increasingly adopted by a significant part of the BioNLP community, which needs robust, industrial-grade applications to exploit the whole bibliome. Using UIMA to develop Text Mining applications for curation purposes makes it possible to combine diverse kinds of expertise that go beyond the individual know-how of biologists, computer scientists or linguists in isolation. Good synergy and circulation of knowledge between these experts is fundamental to the development of a successful curation tool.
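
The component idea described above can be illustrated with a minimal sketch in plain Python. It does not use the UIMA SDK or reproduce its API. It only mimics the general pattern of independent annotators that read from and write to one shared document object, so that a component written by a biologist (for example a gene or SNP recogniser) and one written by a linguist can be chained without knowing each other's internals. All names here (`Annotation`, `Document`, `regex_annotator`, `relation_annotator`, `run_pipeline`) are hypothetical, and the sample sentence and identifiers are invented placeholders, not results from the poster.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class Annotation:
    """A typed span over the document text (hypothetical, not a UIMA class)."""
    kind: str
    begin: int
    end: int
    text: str


@dataclass
class Document:
    """Shared container that every component reads from and writes to."""
    text: str
    annotations: List[Annotation] = field(default_factory=list)

    def select(self, kind: str) -> List[Annotation]:
        """Return all annotations of the given kind, in insertion order."""
        return [a for a in self.annotations if a.kind == kind]


# A component is any callable that enriches the shared document in place.
Annotator = Callable[[Document], None]


def regex_annotator(kind: str, pattern: str) -> Annotator:
    """Build a simple entity annotator from a regular expression."""
    compiled = re.compile(pattern)

    def annotate(doc: Document) -> None:
        for match in compiled.finditer(doc.text):
            doc.annotations.append(
                Annotation(kind, match.start(), match.end(), match.group())
            )

    return annotate


def relation_annotator(doc: Document) -> None:
    """Naively pair every SNP with every disease mention in the document.

    This relies only on annotations produced by upstream components,
    which is the point of the shared-document design.
    """
    for snp in doc.select("SNP"):
        for disease in doc.select("Disease"):
            doc.annotations.append(
                Annotation(
                    "SnpDiseaseRelation",
                    min(snp.begin, disease.begin),
                    max(snp.end, disease.end),
                    f"{snp.text} -> {disease.text}",
                )
            )


def run_pipeline(text: str, annotators: List[Annotator]) -> Document:
    """Run the components in order over one document."""
    doc = Document(text)
    for annotate in annotators:
        annotate(doc)
    return doc


# Placeholder patterns and sentence, invented purely for illustration.
PIPELINE: List[Annotator] = [
    regex_annotator("Gene", r"\bGENE1\b"),
    regex_annotator("SNP", r"\brs\d+\b"),
    regex_annotator("Disease", r"\bdisease X\b"),
    relation_annotator,
]

doc = run_pipeline(
    "Variant rs0000001 in GENE1 was examined in patients with disease X.",
    PIPELINE,
)
for annotation in doc.annotations:
    print(annotation.kind, annotation.begin, annotation.end, annotation.text)
```

In this sketch, swapping the toy regular expressions for a real gene tagger or a syntactic parser would only require replacing one entry in `PIPELINE`; the downstream relation component would stay unchanged as long as the annotation kinds are preserved.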

Collection: 3rd International Biocuration Conference
Presented at: 3rd International Biocuration Conference, 16 April 2009

Additional information

License: This document is licensed to the public under the Creative Commons Attribution 3.0 License.
How to cite this document:

Mellebeek, Bart, Rodriguez-Penagos, Carlos, and Furlong, Laura Ines. UIMA in the Biocuration Workflow: A coherent framework for cooperation between biologists and computational linguists. Available from Nature Precedings <http://dx.doi.org/10.1038/npre.2009.3171.1> (2009)

Version info:

Other versions of this document in Nature Precedings

None.

Other versions of this document elsewhere on the web

None known.
