<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/" version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Nature Precedings - Tag feed for biocuration</title>
    <link>http://precedings.nature.com/tags/biocuration</link>
    <description>Recently posted documents tagged with 'biocuration'</description>
    <dc:publisher>Nature Publishing Group</dc:publisher>
    <dc:language>en</dc:language>
    <prism:publicationName>Nature Precedings</prism:publicationName>
    <image>
      <title>Nature Precedings</title>
      <url>http://precedings.nature.com/images/header_logo.gif</url>
      <link>http://precedings.nature.com</link>
    </image>
    <atom:link type="application/rss+xml" rel="self" href="http://precedings.nature.com/tags/biocuration/feed"/>
    <item>
      <title>The Relationship between the UniProt Knowledgebase (UniProtKB) and the IntAct Molecular Interaction Databases</title>
      <link>http://dx.doi.org/10.1038/npre.2009.3936.1</link>
      <description>IntAct provides a freely available, open source database system and analysis tools for protein interaction data. All interactions are derived from literature curation or direct user submission, and all experimental information relating to binary protein-protein interactions is entered into the IntAct database by curators via a web-based editor. Interaction information is added to the SUBUNIT comment and the RP line of the relevant publication within the UniProtKB entry. There may be a single INTERACTION comment present within a UniProtKB entry, which conveys information relevant to binary protein-protein interactions. This is automatically derived from the IntAct database and is updated on a triweekly basis. Interactions can be derived by any appropriate experimental method but must be confirmed by a second interaction if resulting from a single yeast two-hybrid experiment. For large-scale experiments, interactions are considered if a high confidence score is assigned by the authors. The INTERACTION line contains a direct link to IntAct that provides detailed information for the experimental support. These lines are not changed manually and any discrepancy is reported to IntAct for updates. There is also a database cross-reference line within the UniProtKB entry, i.e.: DR IntAct _UniProtKB AC, which directs the user to additional interaction data for that molecule. UniProt is supported by grants from the National Institutes of Health, European Commission, Swiss Federal Government and PATRIC BRC. IntAct is funded by the European Commission under FELICS, contract number 021902 (RII3) within the Research Infrastructure Action of the FP6 &#8220;Structuring the European Research Area&#8221; Programme.</description>
      <guid>http://dx.doi.org/10.1038/npre.2009.3936.1</guid>
      <pubDate>Tue, 10 Nov 2009 15:11:08 UTC</pubDate>
      <dc:title>The Relationship between the UniProt Knowledgebase (UniProtKB) and the IntAct Molecular Interaction Databases</dc:title>
      <dc:identifier>doi:10.1038/npre.2009.3936.1</dc:identifier>
      <dc:date>2009-11-10</dc:date>
      <dc:creator>Yasmin Alam-Faruque</dc:creator>
      <prism:publicationName>Nature Precedings</prism:publicationName>
      <prism:publicationDate>2009-11-10T15:11:08Z</prism:publicationDate>
      <prism:category>Poster</prism:category>
      <prism:section>Bioinformatics</prism:section>
      <media:thumbnail url="http://precedings.nature.com/documents/3936/version/1/files/npre20093936-1.pdf.thumb.png"/>
      <creativeCommons:license>http://creativecommons.org/licenses/by/3.0/</creativeCommons:license>
    </item>
    <item>
      <title>Normalization and Matching of Chemical Compound Names</title>
      <link>http://dx.doi.org/10.1038/npre.2009.3322.1</link>
      <description>The identification of a chemical compound solely based on its name requires comprehensive chemical knowledge and often extensive searches in chemical databases. However, it is crucial for the integration of biochemical data extracted from the literature, since many publications exclusively describe a compound by its name. We have developed an application which matches synonymic names of chemical compounds and thereby facilitates the bundling of corresponding data referring to the same compound. The tool that we have developed is based on natural language processing (NLP) methods and applies rules to systematically normalize chemical compound names. Matching of synonymous names is achieved by comparison of the normalized name forms. It is capable of normalizing a given name of a chemical compound and matching it against names in (bio-)chemical databases (e.g. SABIO-RK, ChEBI or PubChem), even when there is no exact name-to-name match. The tool is also able to match a complete list of compound names against these databases, which makes it useful for the automatic annotation of chemical data. This normalization and matching of various synonyms of a chemical compound constitutes a platform for the unambiguous identification of compounds described in the literature or in databases.</description>
      <guid>http://dx.doi.org/10.1038/npre.2009.3322.1</guid>
      <pubDate>Fri, 05 Jun 2009 20:06:57 UTC</pubDate>
      <dc:title>Normalization and Matching of Chemical Compound Names</dc:title>
      <dc:identifier>doi:10.1038/npre.2009.3322.1</dc:identifier>
      <dc:date>2009-06-05</dc:date>
      <dc:creator>Martin Golebiewski</dc:creator>
      <prism:publicationName>Nature Precedings</prism:publicationName>
      <prism:publicationDate>2009-06-05T20:06:57Z</prism:publicationDate>
      <prism:category>Poster</prism:category>
      <prism:section>Chemistry</prism:section>
      <prism:section>Bioinformatics</prism:section>
      <media:thumbnail url="http://precedings.nature.com/documents/3322/version/1/files/npre20093322-1.pdf.thumb.png"/>
      <creativeCommons:license>http://creativecommons.org/licenses/by/3.0/</creativeCommons:license>
    </item>
    <item>
      <title>BrainGrab: Capturing Curator Expertise as Reusable Annotation Rules</title>
      <link>http://dx.doi.org/10.1038/npre.2009.3313.1</link>
      <description>Experienced biocurators can outperform automated systems on specific genes once they determine which pieces of evidence should drive annotation, and which annotations should be spread. The annotation logic may weigh both homology evidence (BLAST matches or HMM hits) and non-homology evidence (neighboring genes, metabolic context, taxonomic group). Unfortunately, the expertise developed to annotate each gene is short-lived, and is mostly lost if the logic driving the annotation is not captured. We report the development of BrainGrab, an interface added to the MANATEE manual annotation tool for prokaryotic genomes. The curator can specify evidence scenarios that should always lead to equivalent annotation for similar genes in similar contexts, and thus create new annotation rules while the expertise is fresh. No special knowledge of programming or protein family construction is required. BrainGrab rules can mix and match evidence types from the large array of existing protein family definitions such as Pfam families, sequence analyses such as SignalP, and contextual clues, that is, the same types of evidence already familiar to experienced biocurators. We have now created an infrastructure for collecting, distributing, interpreting, and applying BrainGrab rules for automated annotation. A rules interpreter combines queries of existing evidence with specified new searches to determine if a rule must fire. If so, the interpreter writes a new piece of rule-based evidence. Once deposited, BrainGrab/RuleBase evidence can provide automated annotation, pathway reconstruction, and even input data for other rules. We demonstrate the system with sets of rules for annotating proteins and pathways of siderophore biosynthesis in human pathogens, for annotating common fusion proteins, and for applying the proper nomenclature to bacterial ribosomal proteins. 
The chance to harness curatorial expertise for building rules creates a promising avenue for community contributions to improved annotation pipelines.</description>
      <guid>http://dx.doi.org/10.1038/npre.2009.3313.1</guid>
      <pubDate>Wed, 03 Jun 2009 15:52:03 UTC</pubDate>
      <dc:title>BrainGrab: Capturing Curator Expertise as Reusable Annotation Rules</dc:title>
      <dc:identifier>doi:10.1038/npre.2009.3313.1</dc:identifier>
      <dc:date>2009-06-03</dc:date>
      <dc:creator>Daniel H. Haft</dc:creator>
      <prism:publicationName>Nature Precedings</prism:publicationName>
      <prism:publicationDate>2009-06-03T15:52:03Z</prism:publicationDate>
      <prism:category>Presentation</prism:category>
      <prism:section>Genetics &amp; Genomics</prism:section>
      <prism:section>Microbiology</prism:section>
      <prism:section>Bioinformatics</prism:section>
      <media:thumbnail url="http://precedings.nature.com/documents/3313/version/1/files/npre20093313-1.pdf.thumb.png"/>
      <creativeCommons:license>http://creativecommons.org/licenses/by/3.0/</creativeCommons:license>
    </item>
    <item>
      <title>Digital BioCuration: A Question of Balance</title>
      <link>http://dx.doi.org/10.1038/npre.2009.3257.1</link>
      <description>Curation of biomedical data has come to encompass a broad range of activities and considerations, which include the building of digital archives, making decisions on the relative value and longevity of one dataset vs another, editing data records manually, performing or assessing computational processes over very large sets of data, and grappling with issues of web usability and data standards. People who consider themselves biological curators may range from a single domain expert, who develops a collection which reflects their personal judgments and priorities, to groups of people supporting large, long term public resources such as GenBank, RefSeq, or PubMed, and everything in between. Finding the right balance between objective measures of quality and personal judgment, between computational measures and manual curation, between published results in journals and active curation of databases varies by project but some common themes and considerations recur in our experiences of the past two decades at NCBI.</description>
      <guid>http://dx.doi.org/10.1038/npre.2009.3257.1</guid>
      <pubDate>Tue, 26 May 2009 16:52:13 UTC</pubDate>
      <dc:title>Digital BioCuration: A Question of Balance</dc:title>
      <dc:identifier>doi:10.1038/npre.2009.3257.1</dc:identifier>
      <dc:date>2009-05-26</dc:date>
      <dc:creator>James Ostell</dc:creator>
      <prism:publicationName>Nature Precedings</prism:publicationName>
      <prism:publicationDate>2009-05-26T16:52:13Z</prism:publicationDate>
      <prism:category>Presentation</prism:category>
      <prism:section>Genetics &amp; Genomics</prism:section>
      <prism:section>Immunology</prism:section>
      <prism:section>Bioinformatics</prism:section>
      <media:thumbnail url="http://precedings.nature.com/documents/3257/version/1/files/npre20093257-1.pdf.thumb.png"/>
      <creativeCommons:license>http://creativecommons.org/licenses/by/3.0/</creativeCommons:license>
    </item>
    <item>
      <title>H-InvDB release 6, a comprehensive annotation resource for human genes and transcripts</title>
      <link>http://dx.doi.org/10.1038/npre.2009.3251.1</link>
      <description>H-Invitational Database (H-InvDB; http://www.h-invitational.jp/) is an integrated database of human genes and transcripts. By extensive analyses of all human transcripts, we provide curated annotations of human genes and transcripts that include gene structures, alternative splicing isoforms, non-coding functional RNAs, protein functions, functional domains, sub-cellular localizations, metabolic pathways, protein 3D structure, genetic polymorphisms, relation with diseases, gene expression profiling, molecular evolutionary features, protein-protein interactions (PPIs) and gene families/groups. The latest release of H-InvDB (release 6.0) provides annotation for 219,765 human transcripts in 43,159 human gene clusters based on human FLcDNAs and mRNAs. H-InvDB consists of two main views, the Transcript view and the Locus view, and six auxiliary databases with web-based viewers: G-integra, H-ANGEL, DiseaseInfo Viewer, Evola, PPI view and Gene Family/Group view. We also provide several data mining tools, such as &#8220;Navi search&#8221;, which consists of 16 search contents, each of which includes items for the search condition (http://www.h-invitational.jp/hinv/c-search/hinvNaviTop.jsp), and &#8220;PANDA&#8221;, the Priority ANalysis for Disease Association system (http://www.h-invitational.jp/panda/app). H-InvDB now provides web service APIs of SOAP and REST to use H-InvDB data in programs (http://www.h-invitational.jp/hinv/hws/doc/).</description>
      <guid>http://dx.doi.org/10.1038/npre.2009.3251.1</guid>
      <pubDate>Thu, 14 May 2009 21:30:10 UTC</pubDate>
      <dc:title>H-InvDB release 6, a comprehensive annotation resource for human genes and transcripts</dc:title>
      <dc:identifier>doi:10.1038/npre.2009.3251.1</dc:identifier>
      <dc:date>2009-05-14</dc:date>
      <dc:creator>Chisato Yamasaki</dc:creator>
      <prism:publicationName>Nature Precedings</prism:publicationName>
      <prism:publicationDate>2009-05-14T21:30:10Z</prism:publicationDate>
      <prism:category>Poster</prism:category>
      <prism:section>Genetics &amp; Genomics</prism:section>
      <prism:section>Bioinformatics</prism:section>
      <prism:section>Evolutionary Biology</prism:section>
      <media:thumbnail url="http://precedings.nature.com/documents/3251/version/1/files/npre20093251-1.pdf.thumb.png"/>
      <creativeCommons:license>http://creativecommons.org/licenses/by/3.0/</creativeCommons:license>
    </item>
    <item>
      <title>Studying Biocuration Workflows</title>
      <link>http://dx.doi.org/10.1038/npre.2009.3249.1</link>
      <description>As the first phase of a knowledge engineering study of biocuration workflows, we performed a preliminary task-modeling exercise on seven separate bioinformatics systems. This involved constructing UML activity diagrams from detailed interviews with curators in order to understand the organization of the process the biocurators used to populate their system. The objective of this work was to identify common patterns within the workflows where we might apply text mining methods to accelerate curation. We compiled a number of workflows in a common format but were largely unable to consolidate these structures into a formal structure that facilitated comparison across workflows. At present, more work is needed to perform this task. </description>
      <guid>http://dx.doi.org/10.1038/npre.2009.3249.1</guid>
      <pubDate>Thu, 14 May 2009 21:28:32 UTC</pubDate>
      <dc:title>Studying Biocuration Workflows</dc:title>
      <dc:identifier>doi:10.1038/npre.2009.3249.1</dc:identifier>
      <dc:date>2009-05-14</dc:date>
      <dc:creator>Gully A. P. C. Burns</dc:creator>
      <prism:publicationName>Nature Precedings</prism:publicationName>
      <prism:publicationDate>2009-05-14T21:28:32Z</prism:publicationDate>
      <prism:category>Presentation</prism:category>
      <prism:section>Bioinformatics</prism:section>
      <media:thumbnail url="http://precedings.nature.com/documents/3249/version/1/files/npre20093249-1.pdf.thumb.png"/>
      <creativeCommons:license>http://creativecommons.org/licenses/by/3.0/</creativeCommons:license>
    </item>
    <item>
      <title>Biocuration Workflow Catalogue</title>
      <link>http://dx.doi.org/10.1038/npre.2009.3250.1</link>
      <description>As the first phase of a knowledge engineering study of biocuration workflows, we performed a preliminary task-modeling exercise on seven separate bioinformatics systems. This involved constructing UML activity diagrams from detailed interviews with curators in order to understand the organization of the process the biocurators used to populate their system. The objective of this work was to identify common patterns within the workflows where we might apply text mining methods to accelerate curation. We compiled a number of workflows in a common format but were largely unable to consolidate these structures into a formal structure that facilitated comparison across workflows. We presented this work as a slideshow and publish this account of the catalog as supplementary information.</description>
      <guid>http://dx.doi.org/10.1038/npre.2009.3250.1</guid>
      <pubDate>Wed, 13 May 2009 19:12:22 UTC</pubDate>
      <dc:title>Biocuration Workflow Catalogue</dc:title>
      <dc:identifier>doi:10.1038/npre.2009.3250.1</dc:identifier>
      <dc:date>2009-05-13</dc:date>
      <dc:creator>Gully Burns</dc:creator>
      <prism:publicationName>Nature Precedings</prism:publicationName>
      <prism:publicationDate>2009-05-13T19:12:22Z</prism:publicationDate>
      <prism:category>Manuscript</prism:category>
      <prism:section>Bioinformatics</prism:section>
      <media:thumbnail url="http://precedings.nature.com/documents/3250/version/1/files/npre20093250-1.pdf.thumb.png"/>
      <creativeCommons:license>http://creativecommons.org/licenses/by/3.0/</creativeCommons:license>
    </item>
    <item>
      <title>Data Curation in Biology &#8211; Past, Present and Future</title>
      <link>http://dx.doi.org/10.1038/npre.2009.3225.1</link>
      <description>Data curation has been critical in the development of biology: from Darwin and Linnaeus to UniProt, the careful collection and organisation of data has been the spring from which new hypotheses and understanding have emerged. In this presentation, I will describe how we have used data curation in my own research group &#8211; and also present an overview of curation at the EBI. With new technical developments and the move towards the semantic web, the role of curation in the future needs to develop to take advantage of these new opportunities. This will be discussed.</description>
      <guid>http://dx.doi.org/10.1038/npre.2009.3225.1</guid>
      <pubDate>Tue, 12 May 2009 13:19:11 UTC</pubDate>
      <dc:title>Data Curation in Biology &#8211; Past, Present and Future</dc:title>
      <dc:identifier>doi:10.1038/npre.2009.3225.1</dc:identifier>
      <dc:date>2009-05-12</dc:date>
      <prism:publicationName>Nature Precedings</prism:publicationName>
      <prism:publicationDate>2009-05-12T13:19:11Z</prism:publicationDate>
      <prism:category>Presentation</prism:category>
      <prism:section>Genetics &amp; Genomics</prism:section>
      <prism:section>Molecular Cell Biology</prism:section>
      <prism:section>Bioinformatics</prism:section>
      <media:thumbnail url="http://precedings.nature.com/documents/3225/version/1/files/npre20093225-1.pdf.thumb.png"/>
      <creativeCommons:license>http://creativecommons.org/licenses/by/3.0/</creativeCommons:license>
    </item>
    <item>
      <title>Automatisation in UniProtKB / Swiss-Prot Annotation: New Rules and Tools</title>
      <link>http://dx.doi.org/10.1038/npre.2009.3215.1</link>
      <description>The development of next generation sequencing technologies promises a massive increase in the rate of submission of new protein sequences to sequence databases such as the Universal Protein Resource Knowledge Base, UniProtKB. At UniProtKB/Swiss-Prot we propose to meet this challenge by continuing to expand and develop systems for the automatic propagation of existing annotation to newly submitted protein sequences. These developments will promote the standardization of ortholog annotation both across and within kingdoms and significantly enhance our ability to accurately annotate new protein sequences which are being produced at an ever increasing rate.</description>
      <guid>http://dx.doi.org/10.1038/npre.2009.3215.1</guid>
      <pubDate>Fri, 08 May 2009 15:41:19 UTC</pubDate>
      <dc:title>Automatisation in UniProtKB / Swiss-Prot Annotation: New Rules and Tools</dc:title>
      <dc:identifier>doi:10.1038/npre.2009.3215.1</dc:identifier>
      <dc:date>2009-05-08</dc:date>
      <dc:creator>Alan Bridge</dc:creator>
      <prism:publicationName>Nature Precedings</prism:publicationName>
      <prism:publicationDate>2009-05-08T15:41:19Z</prism:publicationDate>
      <prism:category>Poster</prism:category>
      <prism:section>Bioinformatics</prism:section>
      <media:thumbnail url="http://precedings.nature.com/documents/3215/version/1/files/npre20093215-1.pdf.thumb.png"/>
      <creativeCommons:license>http://creativecommons.org/licenses/by/3.0/</creativeCommons:license>
    </item>
    <item>
      <title>Standardization in UniProtKB/Swiss-Prot</title>
      <link>http://dx.doi.org/10.1038/npre.2009.3214.1</link>
      <description>Within the UniProt consortium, the UniProtKB/Swiss-Prot knowledge base provides the international community with a stable, comprehensive, fully classified, richly and accurately annotated protein sequence database that is fully operable with other databases. Annotation relates to function(s) of the protein (their catalytic activity and the corresponding metabolic pathway(s) in which the protein may be involved), their cellular location, their interactions with other cellular components, etc. It is challenging to unify the way we annotate proteins, to ensure consistency and to describe data unambiguously. It is also highly valuable both for querying the database and for analyzing high-throughput data (expression data for instance). Because it is of fundamental importance to use standardized nomenclatures, annotations in UniProtKB/Swiss-Prot are progressively moving towards controlled vocabularies (CVs) and ontologies. Controlled vocabulary &#8211; or terminology &#8211; provides a list of concepts and text descriptions of their meaning. Concepts in a CV are often organized in a hierarchy. Ontology provides a formal representation of knowledge with definitions of concepts, their attributes and relations between them. As an illustration, we will describe the processes used to produce SUBCELLULAR and PATHWAY annotation sections in UniProtKB/Swiss-Prot. The CVs used in these two sections are based on in-house resources, UniProt subcell and UniPathway respectively. The links between these resources and other existing resources will be presented too, with a specific focus on Gene Ontology as we envisage using it extensively in order to describe protein functions or other biological processes.</description>
      <guid>http://dx.doi.org/10.1038/npre.2009.3214.1</guid>
      <pubDate>Fri, 08 May 2009 14:29:59 UTC</pubDate>
      <dc:title>Standardization in UniProtKB/Swiss-Prot</dc:title>
      <dc:identifier>doi:10.1038/npre.2009.3214.1</dc:identifier>
      <dc:date>2009-05-08</dc:date>
      <dc:creator>Serenella Ferro Rojas</dc:creator>
      <prism:publicationName>Nature Precedings</prism:publicationName>
      <prism:publicationDate>2009-05-08T14:29:59Z</prism:publicationDate>
      <prism:category>Poster</prism:category>
      <prism:section>Bioinformatics</prism:section>
      <media:thumbnail url="http://precedings.nature.com/documents/3214/version/1/files/npre20093214-1.pdf.thumb.png"/>
      <creativeCommons:license>http://creativecommons.org/licenses/by/3.0/</creativeCommons:license>
    </item>
  </channel>
</rss>
