<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/" version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Nature Precedings - Ricardo Vencio</title>
    <link>http://precedings.nature.com/users/5789d0a3b8158f609276926d7280242a/</link>
    <description>Documents posted by Ricardo Vencio</description>
    <dc:publisher>Nature Publishing Group</dc:publisher>
    <dc:language>en</dc:language>
    <prism:publicationName>Nature Precedings</prism:publicationName>
    <image>
      <title>Nature Precedings</title>
      <url>http://precedings.nature.com/images/header_logo.gif</url>
      <link>http://precedings.nature.com</link>
    </image>
    <atom:link type="application/rss+xml" rel="self" href="http://precedings.nature.com/users/5789d0a3b8158f609276926d7280242a/feed"/>
    <item>
      <title>ProbCD: enrichment analysis accounting for categorization uncertainty</title>
      <link>http://precedings.nature.com/documents/369/version/1</link>
      <description>As in many other areas of science, systems biology makes extensive use of statistical association and significance estimates in contingency tables, a type of categorical data analysis known in this field as enrichment (also over-representation or enhancement) analysis. In spite of efforts to create probabilistic annotations, especially in the Gene Ontology context, or to deal with uncertainty in high-throughput-based datasets, current enrichment methods largely ignore this probabilistic information since they are mainly based on variants of the Fisher Exact Test. We developed an open-source R package to deal with probabilistic categorical data analysis, ProbCD, that does not require a static contingency table. The contingency table for the enrichment problem is built using the expectation of a Bernoulli Scheme stochastic process given the categorization probabilities. An on-line interface was created to allow usage by non-programmers and is available at: http://xerad.systemsbiology.net/ProbCD/. We present an analysis framework and software tools to address the issue of uncertainty in categorical data analysis. In particular, concerning enrichment analysis, ProbCD can accommodate: (i) the stochastic nature of high-throughput experimental techniques and (ii) probabilistic gene annotation.</description>
      <guid>http://precedings.nature.com/documents/369/version/1</guid>
      <pubDate>Fri, 06 Jul 2007 04:42:39 UTC</pubDate>
      <dc:title>ProbCD: enrichment analysis accounting for categorization uncertainty</dc:title>
      <dc:identifier>hdl:10101/npre.2007.369.1</dc:identifier>
      <dc:date>2007-07-06</dc:date>
      <dc:creator>Ricardo V&#234;ncio</dc:creator>
      <prism:publicationName>Nature Precedings</prism:publicationName>
      <prism:publicationDate>2007-07-06T04:42:39Z</prism:publicationDate>
      <prism:category>Manuscript</prism:category>
      <prism:section>Biotechnology</prism:section>
      <prism:section>Genetics &amp; Genomics</prism:section>
      <prism:section>Bioinformatics</prism:section>
      <media:thumbnail url="http://precedings.nature.com/documents/369/version/1/files/npre2007369-1.pdf.thumb.png"/>
      <creativeCommons:license>http://creativecommons.org/licenses/by/2.5/</creativeCommons:license>
    </item>
    <item>
      <title>Simcluster: clustering enumeration gene expression data on the simplex space</title>
      <link>http://dx.doi.org/10.1038/npre.2007.202.1</link>
      <description>Transcript enumeration methods such as SAGE, MPSS, and sequencing-by-synthesis EST &#8220;digital northern&#8221; are important high-throughput techniques for digital gene expression measurement. Like other counting or voting processes, these measurements constitute compositional data exhibiting properties particular to the simplex space, where the summation of the components is constrained. These properties are not present in regular Euclidean spaces, on which hybridization-based microarray data is often modeled. Therefore, pattern recognition methods commonly used for microarray data analysis may be non-informative for the data generated by transcript enumeration techniques, since they ignore certain fundamental properties of this space. Here we present a software tool, Simcluster, designed to perform clustering analysis for data on the simplex space. We present Simcluster as a stand-alone command-line C package and as a user-friendly on-line tool. Both versions are available at: http://xerad.systemsbiology.net/simcluster. Simcluster is designed in accordance with a well-established mathematical framework for compositional data analysis, which provides principled procedures for dealing with the simplex space, and is thus applicable in a number of contexts, including enumeration-based gene expression data.</description>
      <guid>http://dx.doi.org/10.1038/npre.2007.202.1</guid>
      <pubDate>Mon, 25 Jun 2007 04:51:48 UTC</pubDate>
      <dc:title>Simcluster: clustering enumeration gene expression data on the simplex space</dc:title>
      <dc:identifier>doi:10.1038/npre.2007.202.1</dc:identifier>
      <dc:date>2007-06-25</dc:date>
      <dc:creator>Ricardo Z. N. V&#234;ncio</dc:creator>
      <prism:publicationName>Nature Precedings</prism:publicationName>
      <prism:publicationDate>2007-06-25T04:51:48Z</prism:publicationDate>
      <prism:category>Manuscript</prism:category>
      <prism:section>Biotechnology</prism:section>
      <prism:section>Bioinformatics</prism:section>
      <media:thumbnail url="http://precedings.nature.com/documents/202/version/1/files/npre2007202-1.pdf.thumb.png"/>
      <creativeCommons:license>http://creativecommons.org/licenses/by/2.5/</creativeCommons:license>
    </item>
  </channel>
</rss>
