Natural Language Query in the Biochemistry and Molecular Biology Domains Based on Cognition Search™
- University of Texas Southwestern Medical Center at Dallas
- Cognition Technologies Inc., CA
- Document Type: Manuscript
- Date: Received 19 September 2008 21:45 UTC; Posted 22 September 2008
- Subjects: Bioinformatics
- Abstract:
Motivation: With the tremendous growth in scientific literature, it is necessary to improve upon the standard pattern-matching style of the available search engines. Semantic natural language processing (NLP) may be the solution to this problem. Cognition Search (CSIR) is a natural language technology. It works best when the user asks a simple question that might be answered in the textual data being queried, such as MEDLINE. CSIR has a large English dictionary and semantic database. Cognition’s semantic map enables the search process to be based on meaning rather than statistical word pattern matching and therefore returns more complete and relevant results. The Cognition Search engine improves recall through downward reasoning and synonymy, and improves precision through phrase parsing and word sense disambiguation.
Results: We carried out several projects to “teach” the CSIR lexicon medical, biochemical, and molecular biological language and acronyms drawn from free, curated web-based sources. Vocabulary from the Alliance for Cellular Signaling (AfCS), the HUGO Gene Nomenclature Committee (HGNC), the Unified Medical Language System (UMLS) Metathesaurus, and the International Union of Pure and Applied Chemistry (IUPAC) was introduced into the CSIR dictionary and curated. The resulting system was used to interpret MEDLINE abstracts. Meaning-based search of MEDLINE abstracts yields high precision (estimated at >90%) and high recall (estimated at >90%) where synonym information has been encoded. The present implementation can be found at http://MEDLINE.cognition.com.
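To make the recall argument in the abstract concrete, the following is a minimal sketch of query expansion by synonymy and downward reasoning, together with the usual precision and recall calculation. It is not the CSIR implementation: the lexicon entries, the documents, and the names `SYNONYMS`, `HYPONYMS`, `DOCS`, `expand`, `search`, and `precision_recall` are all invented for illustration, matching is plain substring lookup, and phrase parsing and word sense disambiguation are not modeled.

```python
from typing import Dict, Set, Tuple

# Toy lexicon, invented for illustration only (not taken from CSIR).
SYNONYMS: Dict[str, Set[str]] = {
    "erk2": {"mapk1"},
    "mapk1": {"erk2"},
}
# "Downward" links: each term maps to its more specific terms.
HYPONYMS: Dict[str, Set[str]] = {
    "kinase": {"map kinase", "tyrosine kinase"},
    "map kinase": {"erk2", "p38"},
}

# Toy document collection, also invented.
DOCS: Dict[int, str] = {
    1: "ERK2 phosphorylates its substrates in the nucleus.",
    2: "MAPK1 expression was measured in liver tissue.",
    3: "Tyrosine kinase inhibitors reduced tumor growth.",
    4: "Lipid bilayers were imaged by electron microscopy.",
}


def expand(term: str) -> Set[str]:
    """Return the term plus its synonyms and all more specific terms."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(SYNONYMS.get(current, set()))
        stack.extend(HYPONYMS.get(current, set()))
    return seen


def search(term: str, docs: Dict[int, str], use_lexicon: bool = True) -> Set[int]:
    """Return ids of documents containing the term or, optionally, its expansion."""
    terms = expand(term) if use_lexicon else {term}
    return {
        doc_id
        for doc_id, text in docs.items()
        if any(t in text.lower() for t in terms)
    }


def precision_recall(retrieved: Set[int], relevant: Set[int]) -> Tuple[float, float]:
    """Precision = hits / retrieved; recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

In this toy collection, treating documents 1, 2, and 3 as relevant to the query "kinase", a literal search (`use_lexicon=False`) retrieves only document 3, so recall is 1/3. The expanded search retrieves documents 1, 2, and 3, raising recall to 1.0 while precision stays at 1.0. The numbers say nothing about the >90% estimates reported above; they only show why encoded synonym and hyponym information can raise recall.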
Additional information
- License: This document is licensed to the public under the Creative Commons Attribution 3.0 License
- How to cite this document:
Goldsmith, Elizabeth, Mendiratta, Saurabh, Akella, Radha, and Dahlgren, Kathleen. Natural Language Query in the Biochemistry and Molecular Biology Domains Based on Cognition Search™. Available from Nature Precedings <http://hdl.handle.net/10101/npre.2008.2315.1> (2008)
- Version info:
Other versions of this document in Nature Precedings: None.
Other versions of this document elsewhere on the web: None known.
Hariharan Jayaram on 23 September 2008 15:19 UTC
This is a great improvement to PubMed searching. After the terrible disappointment of “Cuil” and the confusing approaches taken by “Wiki-proteins”, this application of semantic search to improve MEDLINE querying works, and works very well.
Kudos Cognition!