hdl:10101/npre.2007.369.1

ProbCD: enrichment analysis accounting for categorization uncertainty

Ricardo Vêncio1 & Ilya Shmulevich1

  1. Institute for Systems Biology

This manuscript is a preprint. A published version is available at:

10.1186/1471-2105-8-383 (Peer Reviewed) Published in BMC Bioinformatics 2007, 8:383.
Document Type:
Manuscript
Date:
Received 06 July 2007 04:45 UTC; Posted 06 July 2007
Subjects:
Biotechnology, Genetics & Genomics, Bioinformatics
Abstract:

As in many other areas of science, systems biology makes extensive use of statistical association and significance estimates in contingency tables, a type of categorical data analysis known in this field as enrichment (also over-representation or enhancement) analysis. Despite efforts to create probabilistic annotations, especially in the Gene Ontology context, and to deal with uncertainty in high-throughput datasets, current enrichment methods largely ignore this probabilistic information because they are mainly based on variants of the Fisher Exact Test. We developed ProbCD, an open-source R package for probabilistic categorical data analysis that does not require a static contingency table. The contingency table for the enrichment problem is built using the expectation of a Bernoulli Scheme stochastic process given the categorization probabilities. An on-line interface was created to allow usage by non-programmers and is available at: http://xerad.systemsbiology.net/ProbCD/. We present an analysis framework and software tools to address the issue of uncertainty in categorical data analysis. In particular, for enrichment analysis, ProbCD can accommodate: (i) the stochastic nature of high-throughput experimental techniques and (ii) probabilistic gene annotation.
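
To make the idea of an expected contingency table concrete, the following is a minimal Python sketch, not the ProbCD R package or its API. It assumes that each gene carries two probabilities, one of belonging to the selected gene list and one of being annotated to the category, and that the two categorizations are independent for each gene. The function name `expected_contingency_table` and the example values are hypothetical, introduced only for illustration.

```python
def expected_contingency_table(p_selected, p_annotated):
    """Return the expected 2x2 table [[n11, n12], [n21, n22]].

    p_selected[i]  : probability that gene i is in the selected list
    p_annotated[i] : probability that gene i is annotated to the category
    Assumption (not from the source): the two events are independent per gene,
    so each gene contributes the product of its probabilities to each cell.
    """
    if len(p_selected) != len(p_annotated):
        raise ValueError("both probability lists must have one entry per gene")

    n11 = n12 = n21 = n22 = 0.0
    for p, q in zip(p_selected, p_annotated):
        if not (0.0 <= p <= 1.0 and 0.0 <= q <= 1.0):
            raise ValueError("probabilities must lie in [0, 1]")
        n11 += p * q                   # selected and annotated
        n12 += p * (1.0 - q)           # selected, not annotated
        n21 += (1.0 - p) * q           # not selected, annotated
        n22 += (1.0 - p) * (1.0 - q)   # neither
    return [[n11, n12], [n21, n22]]


# Hypothetical example: four genes with uncertain list membership and annotation.
table = expected_contingency_table([1.0, 0.9, 0.2, 0.0], [1.0, 0.5, 0.5, 0.0])
for row in table:
    print(row)
```

When every probability is exactly 0 or 1, the expected table reduces to the ordinary count table used by Fisher-type tests. With fractional probabilities the cells are no longer integers but still sum to the number of genes.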

Additional information

License:
This document is licensed to the public under the Creative Commons Attribution 2.5 License
How to cite this document:

Vêncio, Ricardo and Shmulevich, Ilya. ProbCD: enrichment analysis accounting for categorization uncertainty. Available from Nature Precedings <http://hdl.handle.net/10101/npre.2007.369.1> (2007)

Version info:

Other versions of this document in Nature Precedings

None.
