doi:10.1038/npre.2009.3262.1

Integrating Text Mining into the MGI Biocuration Workflow

Karen G. Dowell (1), Monica S. McAndrews-Hill (2), David P. Hill (2), Harold J. Drabkin (2) & Judith A. Blake (2)

  1. University of Maine Graduate School of Biomedical Sciences
  2. Mouse Genome Informatics at The Jackson Laboratory

Document Type:
Presentation
Date:
Received 20 May 2009 11:36 UTC; Posted 20 May 2009
Subjects:
Bioinformatics
Abstract:

A major challenge for the development of resources for functional and comparative genomics is the extraction of data from the biomedical literature. Although text retrieval and extraction for biological data is an active research field, few applications have been integrated into production literature curation systems such as those of the model organism databases.

In September 2008, Mouse Genome Informatics (MGI) at The Jackson Laboratory initiated a search for dictionary-based text mining tools that we could integrate into our curation workflow. MGI has rigorous document triage and annotation procedures designed to identify articles about mouse genome biology and determine whether those articles should be curated. We currently screen approximately 1000 journal articles a month for Gene Ontology terms, gene mapping, gene expression, phenotype data and other key biological information. Although we don’t foresee that human curation tasks can be fully automated in the near future, we are eager to implement entity name recognition and gene tagging tools that can help streamline our curation workflow and simplify gene indexing tasks in the MGI system.
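
As a rough illustration of what dictionary-based gene tagging involves, the following minimal Python sketch matches gene names and synonyms from a small lookup table against article text. It is not the method used by MGI or by any tool named in this abstract; the dictionary entries, the function names (build_pattern, tag_genes) and the case-insensitive matching are all assumptions made for the example.

```python
import re

# Hypothetical synonym dictionary: surface form -> canonical gene symbol.
# Entries are illustrative only and are not drawn from MGI data.
GENE_DICTIONARY = {
    "Pax6": "Pax6",
    "paired box 6": "Pax6",
    "Trp53": "Trp53",
    "p53": "Trp53",
}


def build_pattern(dictionary):
    """Compile one regex that matches any dictionary name as a whole token."""
    # Longest names first, so multi-word names win over shorter overlaps.
    names = sorted(dictionary, key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in names)
    # Lookarounds stop a name from matching inside a longer word.
    # Case-insensitive matching is a simplification; real gene
    # nomenclature can be case sensitive.
    return re.compile(r"(?<!\w)(" + alternation + r")(?!\w)", re.IGNORECASE)


def tag_genes(text, dictionary=GENE_DICTIONARY):
    """Return (start, end, matched_text, canonical_symbol) for each hit."""
    lookup = {name.lower(): symbol for name, symbol in dictionary.items()}
    pattern = build_pattern(dictionary)
    return [
        (m.start(), m.end(), m.group(1), lookup[m.group(1).lower()])
        for m in pattern.finditer(text)
    ]


if __name__ == "__main__":
    sentence = "Loss of p53 alters paired box 6 expression."
    for start, end, matched, symbol in tag_genes(sentence):
        print(start, end, matched, symbol)
```

A sketch like this only proposes candidate gene mentions. In a curation setting such as the one described above, a human curator would still confirm each hit, since plain dictionary lookup cannot resolve ambiguous synonyms.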

In this presentation, we discuss our search process and the steps we took to identify a short list of potential tools for further evaluation. We present our performance metrics and success criteria, and describe pilot projects in progress. The primary applications under current review are Fraunhofer SCAI’s ProMiner and NCBO’s Open Biomedical Annotator.

Collection:
3rd International Biocuration Conference
Presented at:
3rd International Biocuration Conference, 17 April 2009

Additional information

License:
This document is licensed to the public under the Creative Commons Attribution 3.0 License
How to cite this document:

Dowell, Karen, McAndrews-Hill, Monica, Hill, David, Drabkin, Harold, and Blake, Judith. Integrating Text Mining into the MGI Biocuration Workflow. Available from Nature Precedings <http://dx.doi.org/10.1038/npre.2009.3262.1> (2009)

Version info:

Other versions of this document in Nature Precedings

None.

Other versions of this document elsewhere on the web

None known.