doi:10.1038/npre.2009.3554.2
Multiple Ontologies for Integrating Complex Phenotype Datasets

Mary Shimoyama1, Melinda Dwinell1 & Howard Jacob1

  1. Rat Genome Database, Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, Wisconsin
Document Type:
Poster
Date:
Received 05 August 2009 19:08 UTC; Posted 05 August 2009
Subjects:
Genetics & Genomics, Bioinformatics
Abstract:

Multiple large-scale phenotyping projects have emerged in the rat model organism community, and there is renewed interest in the phenotype data continually generated by thousands of researchers using hundreds of rat strains worldwide. Unfortunately, this data is scattered and is neither described nor formatted in a standardized manner. A system that integrates complex phenotype data from multiple sources and facilitates data mining and analysis is being developed using multiple ontologies.

Introduction
The potential value of integrating phenotype data from multiple sources (different laboratories, varying techniques to measure similar phenotypes, multiple strains) is enormous. Presented here is a data integration system for complex phenotype data from both large-scale and individual experiments, together with the taxonomy and ontologies that provide its backbone. The Rat Genome Database (RGD), along with Mouse Genome Informatics (MGI) (Blake et al., 2009) and the Animal QTL Database (Hu and Reecy, 2007), is developing a Vertebrate Trait Ontology to represent morphological states and physiological processes; it will be used to annotate quantitative trait loci (QTL) and other data. RGD has also used the Mammalian Phenotype Ontology (Smith et al., 2005) for several years to indicate the relationship of genomic elements to abnormal phenotypes. The Vertebrate Trait Ontology represents what is being assessed, and the Mammalian Phenotype Ontology represents the conclusion that was made. The system presented here represents what was done to measure the trait in order to reach the conclusion. Because these ontologies are closely related, care is being taken to ensure compatibility and similarity in structure, using the phenotype properties in the Phenotypic Quality Ontology (PATO) for guidance (http://www.bioontology.org/wiki/index.php/PATO:Main_Page).

Data Format and Ontologies
Standardized data types and relationships for defining the phenotype experiment and its resulting data are being developed, along with the ontologies used to standardize descriptive fields. For phenotype data, the major informational components are Researcher, Study, Experiment, Sample, Experimental Conditions and Clinical Measurement. A Rat Strain Taxonomy has been developed to standardize strain information and to capture the relationships among strains, so that investigators can retrieve and analyze phenotype data for genetically related strains. Two important aspects of a phenotype measurement are 1) what was measured and 2) how it was measured. The Clinical Measurement Ontology and the Measurement Method Ontology are being developed to standardize this information. In addition, an Experimental Conditions Ontology is under construction to allow integration of data measured under various conditions.
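
As an illustration only, the informational components above and the strain relationships could be modeled along the following lines. This is a minimal Python sketch, not the RGD schema: the class name, the field names, the strain labels and the child-to-parent map are all hypothetical placeholders introduced for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class PhenotypeRecord:
    """One measured value plus the descriptive fields named in the text.

    All field names are assumptions made for this sketch.
    """
    researcher: str
    study: str
    experiment: str
    strain: str                    # term from a strain taxonomy
    sample_size: int               # number of animals in the sample
    clinical_measurement: str      # what was measured
    measurement_method: str        # how it was measured
    conditions: Tuple[str, ...]    # experimental conditions
    value: float
    unit: str


# Hypothetical taxonomy fragment: child strain -> parent strain.
STRAIN_PARENT: Dict[str, str] = {
    "strain A/sub1": "strain A",
    "strain A/sub2": "strain A",
    "strain B/sub1": "strain B",
}


def is_related(strain: str, ancestor: str) -> bool:
    """Return True if `strain` is `ancestor` or descends from it."""
    current: Optional[str] = strain
    while current is not None:
        if current == ancestor:
            return True
        current = STRAIN_PARENT.get(current)  # None once the root is passed
    return False


def records_for_strain_group(
    records: List[PhenotypeRecord], ancestor: str
) -> List[PhenotypeRecord]:
    """Collect records for a strain and all strains below it in the taxonomy."""
    return [r for r in records if is_related(r.strain, ancestor)]
```

Storing the strain as a taxonomy term rather than free text is what lets a single query on a parent strain return data recorded against any of its substrains.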

Pilot Study Results
Cardiovascular and biochemistry phenotype data from two major datasets have been integrated using the Rat Strain Taxonomy and the three phenotype-related ontologies. A prototype data mining tool (http://rgd.mcw.edu/rgdweb/) has also been developed. It lets the user begin a search with strains or with any of the ontologies and then make subsequent filter choices from the others. Choices presented to the user are restricted to those for which data is available, and query tracking functions alert the user to the number of results being returned and the query choices made.
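
The filtering behavior described above (start from any facet, offer only choices that still have data, and track the query) can be sketched as follows. This is a hedged illustration in Python and not the implementation of the RGD tool: the `QuerySession` class, the facet names and the sample rows are hypothetical and were made up for the example.

```python
FACETS = (
    "strain",
    "clinical_measurement",
    "measurement_method",
    "experimental_condition",
)

# Made-up rows; each one stands for an available phenotype record.
RECORDS = [
    {"strain": "strain A", "clinical_measurement": "systolic blood pressure",
     "measurement_method": "tail cuff", "experimental_condition": "control diet"},
    {"strain": "strain A", "clinical_measurement": "heart rate",
     "measurement_method": "telemetry", "experimental_condition": "high salt diet"},
    {"strain": "strain B", "clinical_measurement": "systolic blood pressure",
     "measurement_method": "telemetry", "experimental_condition": "control diet"},
]


class QuerySession:
    """Faceted search that only offers choices backed by data."""

    def __init__(self, records):
        self.records = list(records)
        self.selected = {}   # facet -> chosen value
        self.history = []    # (facet, value, result count) per choice

    def matching(self):
        """Records that satisfy every choice made so far."""
        return [
            r for r in self.records
            if all(r[f] == v for f, v in self.selected.items())
        ]

    def available(self):
        """Remaining facets mapped to the values that still have data."""
        rows = self.matching()
        return {
            f: sorted({r[f] for r in rows})
            for f in FACETS if f not in self.selected
        }

    def choose(self, facet, value):
        """Apply a filter choice and return the new result count."""
        if value not in self.available().get(facet, []):
            raise ValueError(f"no data for {facet}={value!r}")
        self.selected[facet] = value
        count = len(self.matching())
        self.history.append((facet, value, count))
        return count


session = QuerySession(RECORDS)
session.choose("clinical_measurement", "systolic blood pressure")
session.choose("strain", "strain B")
```

Because `available()` is recomputed from the currently matching records, a user who has narrowed the search can never pick a combination that returns nothing, and `history` holds the choices and result counts needed for query tracking.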

References
Blake JA, Bult CJ, Eppig JT, Kadin JA, Richardson JE; Mouse Genome Database Group. Nucleic Acids Res. 2009 Jan;37:D712-9.

Hu ZL, Reecy JM. Animal QTLdb: beyond a repository. A public platform for QTL comparisons and integration with diverse types of structural genomic information. Mamm Genome. 2007 Jan;18(1):1-4.

Smith CL, Goldsmith CA, Eppig JT. The Mammalian Phenotype Ontology as a tool for annotating, analyzing and comparing phenotypic information. Genome Biol. 2005;6(1):R7.

Collection:
International Conference on Biomedical Ontology
Presented at:
International Conference on Biomedical Ontology, 24 July 2009

Additional information

License:
This document is licensed to the public under the Creative Commons Attribution 3.0 License
How to cite this document:

Shimoyama, Mary, Dwinell, Melinda, and Jacob, Howard. Multiple Ontologies for Integrating Complex Phenotype Datasets. Available from Nature Precedings <http://dx.doi.org/10.1038/npre.2009.3554.2> (2009)

Version info:

Other versions of this document in Nature Precedings

Version number Document title Date
v1 Posted 05 August 2009

Other versions of this document elsewhere on the web

None known.