www.incf.org
1st INCF Workshop
on
Mouse and Rat Brain
Digital Atlasing Systems
February 13-14, 2007 - Stockholm, Sweden
Nature Precedings : doi:10.1038/npre.2007.1046.1 : Posted 19 Sep 2007
1st INCF Workshop on Mouse and Rat Brain Digital Atlasing Systems
February 13-14, 2007
International Neuroinformatics Coordinating Facility Secretariat
Stockholm, Sweden
Authors
Jyl Boline, Michael Hawrylycz, and Robert W. Williams
Email: rwilliam@nb.utmem.edu
Workshop Organizer
Robert W. Williams, University of Tennessee Health Science Center,
Memphis, Tennessee
Workshop Participants
Jan G. Bjaalie, INCF Secretariat, Stockholm, Sweden
Jyl Boline, Laboratory of Neuro Imaging, UCLA School of Medicine, Los Angeles, USA *Rapporteur
Gregor Eichele, Max Planck Institute of Biophysical Chemistry, Göttingen, Germany
Francoise Gofflot, Institut Clinique de la Souris (ICS), Strasbourg, France
Teiichi Furuichi, RIKEN Brain Science Institute, Wako, Japan
Michael Hawrylycz, Allen Institute for Brain Science, Seattle, USA
Andreas Hess, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
G. Allan Johnson, Duke University, Durham, USA
Jonathan Nissanov, Drexel University, Philadelphia, USA
George Paxinos, University of New South Wales, Randwick, Australia
Charles Watson, Curtin University of Technology, Bentley, Australia
Ilya Zaslavsky, University of California, San Diego, USA
Invited but unable to attend
Johan Auwerx, Strasbourg, France
Mihail Bota, University of Southern California, Los Angeles, USA
Maryann Martone, National Center for Microscopy and Imaging Research,
University of California, San Diego, USA
Larry Swanson, University of Southern California, Los Angeles, USA
Arthur Toga, Laboratory of Neuro Imaging, UCLA School of Medicine, Los Angeles, USA
Supported by the INCF Central Fund and the Swedish Foundation for Strategic Research
Contents
1 Executive Summary
2 Introduction
3 Concepts
4 Workshop Discussions
5 Recommendations
5.1 Infrastructure for Data Sharing
5.2 Large Data Collection Project
5.3 Other Potential Proposals
6 International Collaborative Atlasing Projects Fostered by this Workshop
7 Future Relevant Workshops
Appendix: Workshop program
References
1. Executive Summary
The recommendations of this workshop group center around
the mission of INCF "to contribute to the development of scal-
able, portable, and extensible digital applications that can be
used by neuroscience laboratories to further knowledge of the
human brain and related diseases." Our expectation is that
in the near future, research will involve intense interactions
among scientists and computer networks at the level of ideas
and dynamic reusable data. Neuroscientists will place greater
reliance on stable web-enabled knowledgebases and will move
away from rigid legacy models involving static research sum-
maries and one-way non-digital communication. Atlases and
spatial indexes will play a fundamental role in this shifting
research landscape and will evolve into critical resources for
gathering, securing, analyzing, and communicating research.
In the next decade, the standard bearer of scientific progress--
the primary research paper--is likely to transition from a sleek
and static synopsis of results and conclusions to a more com-
plete, dynamic, and re-computable encapsulation of data and
interpretation. This transition is already evident in the fields of
genomics and bioinformatics in which papers are often point-
ers to massive data sets, analytic tools, and other appendix
material. It is likely that a typical 3rd-millennium neuroscience
communication/publication will include accessible data sets
with full metadata, far more complete details on experimental
design, and an enriched discussion section including interac-
tive content such as a forum for genuine discussion between
the lead authors and a larger community of reviewers and com-
mentators.
Web-accessible brain atlases and spatial indexes will undoubt-
edly be among the most important tools needed to transition
gracefully from our static synopsis mode of publication to a
data-rich, dynamic, multidimensional mode of scientific inter-
action. Atlases will be key query tools as they appeal to our
preference for visual exploration of complex data sets. Thus,
this is a crucial time for an international community of neu-
roscientists to begin to converge on a lingua franca for digital
neuroscience atlases, less with the goal of enforcing confor-
mity, than with the goal of building resources and tools that
translate among existing and future atlasing systems and their
terminologies. This kind of transnational and translational ac-
tivity ideally matches the mission of the INCF in the domain
of digital atlasing.
With these longer-term objectives in mind, the 1st INCF Mouse
and Rat Atlasing Workshop was held in Stockholm in February
2007. Our first objective was to survey current activities and
plans related to mouse and rat brain digital atlasing systems
and to produce a broad international inventory of resources and
ongoing efforts. The second objective was to review the range
of techniques that are being used to build, normalize, segment,
and label atlases and to examine what aspects of this techni-
cal work are redundant, compatible, and compliant across
platforms. The third objective was to forge an international
network to foster increased collaboration and interoperability
across national, linguistic, and funding barriers and to examine
how to promote international collaborations in the future.
The final and most important objective of this workshop was
to improve the impact of atlasing projects in the near term (
years) while reducing costs and redundancy of these efforts.
Funds spent in support of the INCF secretariat and the national
nodes should ultimately be leveraged many times over. Atlas-
ing efforts in each member country should be able to accom-
plish more as part of a coordinated INCF activity than they
would in isolation. To attain this goal, the INCF secretariat
and the supporting nodes strongly encourage integration, code
reuse, and joint projects.
Researcher goals and platforms differ, and all participants at
this workshop accept, and even encourage, a wide variety of
approaches to brain atlasing. However, all participants also ap-
preciate the enormous benefits that may be gained by combin-
ing and integrating across diverse resources.
Consensus and Agreement
·
There is a growing need for an international, digital atlas-based
neuroinformatics infrastructure as an organizing
entity for sharing data and for relating previously disparate
knowledge bases into a linked system.
·
The INCF is in a unique position to help coordinate the
creation of standards and tools to harvest and maintain
data and metadata.
·
The INCF should encourage open standards for protocols,
tools, and even specific data sets and atlases. There are
now several "open access" images of the mouse brain
(Allen Brain Atlas, Franklin and Watson, and NeuroTerrain).
However, corresponding "open access" 3D segmentation
data sets are still largely unavailable. Efforts
by groups such as those at Drexel (Nissanov and colleagues)
and UCLA (Toga and colleagues) are leading
the way to open access to 3D atlases.
·
There is a need for much more accessible data on con-
nections among CNS regions and neurons incorporated
into atlases to enrich functional network analysis (prefer-
ably open data sets of connections).
·
Infrastructure and tools developed should be open, gen-
erally available, and easy to use.
·
Any infrastructure work in this area depends heavily on
database persistence and sustainability and ontologies to
describe relevant data. Any work in this area will be in-
fluenced by the outcome of these upcoming INCF work-
shops.
2. Introduction
Publications are currently the primary way in which research-
ers share their findings, data, and metadata with the scientific
community. In the past, the data from these experiments have
rarely been used again and other researchers have often found
it difficult to duplicate an experiment. More recently, there has
been a push to share the original data, metadata, and even the
processed and analyzed data with the rest of the scientific com-
munity. The added value and multiple applications of this are
obvious: evaluation of the initial experiment by other investi-
gators, the use of already collected data in other experiments,
and new meta-analyses. This gives researchers the ability to
examine issues in a way no single laboratory would have the
resources for and to potentially answer questions or even pose
new questions previously impossible.
Sharing data on a large scale with minimal interactions between
the data provider and the data consumer creates the challenge
of how to do this in a usable and meaningful manner. While it
may be relatively easy to reduce the barrier to sharing, it is dif-
ficult to do so to a degree that still facilitates the process of data
rediscovery and use for other purposes. Data repositories fol-
lowing this model have often turned into effective graveyards
for data. Therefore, data discovery and data sharing must have
associated information that will allow a scientist to concentrate
on creating interesting hypotheses, not on the details of gathering
the data and tracking down the methods of data collection.
This may make it more difficult for the data provider, but it is
better long-term insurance that their data may be used again
for other purposes.
Much value can be derived from a data sharing system that
finds and integrates data of many different types and from dif-
ferent sources. For this reason, digital atlases have been iden-
tified as potential neuroinformatics frameworks, as they act as
a "map" one may use to traverse the brain and associated data
of different scales, modalities, and sources. However, if we
wish to have an atlas that acts as such a neuroinformatics hub,
it must be more than a map. It must act as a gateway to large
distributed databases of images, volumes, segmentations, and
other types of spatially-registered data. These databases must
be connected through spatial data registries and services, as
well as standard APIs and vocabularies to make them all work
together. Contributing data within this framework requires
tools for registration, image segmentation, spatial selection,
and analysis. Finally, accessing this system requires viewers
and annotation tools with authenticated access. In summary,
there must be a full and complex infrastructure built behind the
atlas as well as intuitive interfaces for interacting with it.
Several investigators have worked on various aspects of this issue,
but it is not usually in an individual biological researcher's
best interest to build an infrastructure that is extensible
to others at the expense of their own research. As
yet, there remains a gap between the available resources and a
fully extensible sharing infrastructure of this kind. Domain
experts in this field would gladly participate in creating such
an infrastructure, but most do not have the technical resources
to build it on their own. What is needed to create such an
infrastructure is an organizing body that can survey current
practices, help generate standards, and aid in the technical con-
struction of this sort of sharing infrastructure.
3. Concepts
To frame the discussion and establish the highest precision of
intent it is first useful to establish concrete terminology.
Digital Atlas
An atlas is a collection of maps or manifolds, traditionally
bound into book form, but also found in multimedia formats
(Wikipedia, 3/20/07). As technology has advanced, brain atlases
have been transformed from passive paper guides into dynamic
databases at the core of software applications (Toga 00). In
this report, we almost exclusively focus on these sophisticated
digital atlases held in either free-standing software tools or
web-enabled hyperlinked neuroinformatics hubs that act as a
gateway to a collection of databases, metadata catalogs, and
related multimedia documents and annotations that are placed
in a common spatial framework and thus can be juxtaposed
and analyzed together.
Data Repository
A central place where data is stored and maintained (Wikipedia
04/07)
Database
A database can be defined as a structured collection of records
or data that is stored in a computer so that a program can con-
sult it to answer queries and incorporates software to make it
accessible in a variety of ways. The records retrieved in an-
swer to queries become information that can be used to make
decisions. (Adapted from Wikipedia 04/26/07 and the Oxford
English Dictionary)
Spatial Database
"We propose a definition of a spatial database system as a data-
base system that offers spatial data types in its data model and
query language and supports spatial data types in its imple-
mentation, providing at least spatial indexing and spatial join
methods."
(Ralf Güting, http://www.informatik.fernunihagen.de/import/pi/papers/IntroSpatialDBMS.pdf)
Image Registration
Image registration is a process of relating and organizing char-
acteristics of two or more images, so that image data obtained
from different measurements can be discovered, compared or
integrated. Image registration may include organizing image
metadata into an image metadata catalog, as well as placing the
images in a common semantic or spatial framework. This may
be contrasted with Semantic registration that involves verify-
ing image metadata and associated labeled delineations against
established formal ontologies or controlled vocabularies, to
ensure commonality of terms used in image description.
Spatial registration (often also called "image registration")
is the process of modifying spatial characteristics of an im-
age dataset to align it to another image dataset, thus placing
different images into a common coordinate reference frame.
Image registration techniques vary across domains. For ex-
ample, different MR images may be spatially co-registered by
tuning image metadata which includes pixel size, dimensions
and orientation of the image. Alternately, an image may be
transformed into alignment with another image by specifying
pairs of fiducial control points, or by relating the image to a set
of anatomic feature delineations. Technically, spatial registra-
tion procedures may involve a linear alignment or a nonlinear
alignment, which actually warps the images into a common
space. Spatial registration ensures that images may be discov-
ered and queried by spatial coordinates, via an anatomic atlas.
(IZ 05/07)
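The fiducial-point approach described above can be illustrated with a minimal sketch. It assumes a purely linear (affine) alignment in two dimensions and matched pairs of control points supplied by the user; the function names fit_affine_2d and apply_affine_2d are hypothetical and do not belong to any existing atlas tool. A nonlinear warp, as mentioned in the definition, would need a different model.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Affine = Tuple[Tuple[float, float, float], Tuple[float, float, float]]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small dense system a @ x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("fiducial points are degenerate (e.g. collinear)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_affine_2d(src: Sequence[Point], dst: Sequence[Point]) -> Affine:
    """Least-squares affine transform mapping src fiducials onto dst fiducials.

    Returns ((a, b, tx), (c, d, ty)) with x' = a*x + b*y + tx and
    y' = c*x + d*y + ty.
    """
    if len(src) != len(dst) or len(src) < 3:
        raise ValueError("need at least three matched fiducial pairs")
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations (A^T A) p = A^T t, solved once per output coordinate.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):
        atb = [sum(r[i] * d[k] for r, d in zip(rows, dst)) for i in range(3)]
        params.append(tuple(solve_linear(ata, atb)))
    return params[0], params[1]


def apply_affine_2d(t: Affine, p: Point) -> Point:
    """Map a point from the source image frame into the target frame."""
    (a, b, tx), (c, d, ty) = t
    return (a * p[0] + b * p[1] + tx, c * p[0] + d * p[1] + ty)
```

With more than three control points the fit averages out small placement errors, which is one reason to collect redundant fiducials rather than the bare minimum.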
Spatial Registry or Spatial Registration Services
A spatial registry is a component of an image metadata catalog. It
contains information about position, orientation and extent of
registered images, and links image spatial metadata with other
image metadata. A spatial registry is typically organized as a
spatial database: it contains polygonal representations of regis-
tered images, maintains spatial indexing of the image polygons,
and supports spatial queries, e.g. `select images intersecting
with a user-defined shape,' `select images whose centroids are
contained within a user-defined shape,' `select images found
within a 3mm sphere around a user-defined point.' The key
concept is that the spatial registry connects image data with
data annotation, enabling effective inquiry. (IZ 04/07, MH 6/07)
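The three example queries can be sketched in a few lines. This is an illustration only: it assumes that each registered image is summarized by an axis-aligned 3D bounding box in millimetres instead of a true polygonal footprint, that "found within a sphere" means the image extent touches the sphere, and that a linear scan stands in for a real spatial index. The class names Box, RegisteredImage, and SpatialRegistry are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Box:
    """Axis-aligned extent in atlas coordinates (assumed to be millimetres)."""
    lo: Vec3
    hi: Vec3

    def intersects(self, other: "Box") -> bool:
        return all(self.lo[i] <= other.hi[i] and other.lo[i] <= self.hi[i]
                   for i in range(3))

    def contains(self, p: Vec3) -> bool:
        return all(self.lo[i] <= p[i] <= self.hi[i] for i in range(3))

    def centroid(self) -> Vec3:
        return tuple((self.lo[i] + self.hi[i]) / 2 for i in range(3))

    def distance_to(self, p: Vec3) -> float:
        """Distance from p to the nearest point of the box (0 if inside)."""
        return sum(max(self.lo[i] - p[i], 0.0, p[i] - self.hi[i]) ** 2
                   for i in range(3)) ** 0.5


@dataclass(frozen=True)
class RegisteredImage:
    image_id: str
    extent: Box


class SpatialRegistry:
    """Toy registry: a linear scan replaces the spatial index of a real system."""

    def __init__(self) -> None:
        self._images: List[RegisteredImage] = []

    def register(self, image: RegisteredImage) -> None:
        self._images.append(image)

    def intersecting(self, shape: Box) -> List[str]:
        """Select images intersecting a user-defined shape."""
        return [im.image_id for im in self._images
                if im.extent.intersects(shape)]

    def centroid_within(self, shape: Box) -> List[str]:
        """Select images whose centroids lie within a user-defined shape."""
        return [im.image_id for im in self._images
                if shape.contains(im.extent.centroid())]

    def within_sphere(self, center: Vec3, radius_mm: float) -> List[str]:
        """Select images whose extent touches a sphere around a point."""
        return [im.image_id for im in self._images
                if im.extent.distance_to(center) <= radius_mm]
```

A production registry would more plausibly delegate these predicates to a spatial database with indexed geometry types, in line with the spatial database definition given earlier.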
Annotation
Annotation is the process of associating information content
and knowledge with raw data. Annotations may differ in pur-
pose and complexity, ranging from simple text notes made at
a particular point in a document or in an atlas, to multimedia
composite objects that may include user-defined shapes, docu-
ments, hyperlinks, or other annotations. (IZ 04/07, MH 6/07)
Metadata
Metadata is data about or associated with data used to render a
more precise description or record of its significance. An item
of metadata may describe an individual data item or a collec-
tion of data items and is used to facilitate the understanding,
use and management of data. (Adapted from Wikipedia 04/07,
MH 6/07)
Application Programming Interface (API)
An API is a source code interface that a computer system
or program library provides in order to support requests for
services to be made of it by a computer program. (Wikipedia
04/07)
4. Workshop Discussions
Topics of the workshop included:
·
An inventory and review of the production of digital ro-
dent atlases (short presentations were given by all par-
ticipants), their significance as models, and their diverse
purposes including:
-The display of gene expression and associated im-
age data (Allen Brain Atlas, Cerebellar Development
Transcriptome DataBase, GenePaint, Mouse BIRN
Atlasing Toolkit-MBAT, MousePat, SmartAtlas)
-Anatomic and genetic variation (NeuroTerrain and
the Mouse Brain Library)
-Modeling neuronal circuits (Jan Bjaalie, Raphael
Ritz and colleagues)
-Comparative and functional analysis of behavior in
rodents (Andreas Hess)
-The advanced techniques needed to generate high
resolution data as the backbone of these atlases (G.
Allan Johnson, Gregor Eichele)
·
A discussion of registration and mapping techniques used
to align data to these atlases and the necessity for having
atlases serve as a common frame of reference to which
data is linked
·
The applications of digital atlases as gateways to diverse
data types, including gene and protein expression, pat-
terns of neuronal connections, time-series records, popu-
lation surveys, quantitative analyses, and species com-
parisons
·
Access to stable and persistent databases with tools de-
signed to easily upload and share multiple types of data
as well as access and download this data
·
The importance of defined and interoperable neuroana-
tomic vocabularies and higher order ontologies (George
Paxinos, Charles Watson) to bridge the gap between dif-
ferent atlases, different species, different data modalities,
scales, and as a consistent usable language between hu-
mans and machines
·
The need for infrastructure, software tools, and techno-
logical expertise to facilitate data sharing
The participants of this workshop reviewed the current work in
the area of atlas-based neuroinformatics and developed a set of
recommendations:
·
We reviewed the challenges of developing, applying, and
translating consistent anatomical terminologies. Several
examples were discussed including:
-The challenge of defining basic structures, such as
the amygdala, across atlases and data collection mo-
dalities. While the standard Paxinos and Swanson
terminologies may be translated at a 3D stereotaxic
level in adult rodents, how do we handle situations in
which an experiment dictates that an area considered
"amygdala" actually includes adjoining regions?
-To what extent can homologies between CNS struc-
tures be assigned and defined beyond mouse and rat;
mouse and human; mouse and chicken; mouse and
zebrafish?
-How best can terminologies (and atlases) handle the
complexity of development?
·
We reviewed experimental complexities and advances in
many diverse areas of data collection:
-New magnetic resonance histology methods
-Functional magnetic resonance methods applied to rodents
-High throughput transcriptome and proteome meth-
ods in both adults and embryos
·
We reviewed the value of atlases in examining experimental
data and in tying different types of database-backed
data into a meaningful context, including microscopy
and electron microscopy data, other types of
3D volumes, surfaces, neural connections, blood vessels,
gene expression levels, and experimental information
·
We reviewed the power and problems of mapping data to
digital atlases; the problems are exacerbated by the lack of
standardization
5. Recommendations
Through presentations, discussions, and follow up, this work-
ing group came up with two major recommendations for IN-
CF's role and participation toward achieving the goals outlined
above. These are listed below and followed by a more com-
plete description.
1. Encourage standardization and reproducibility in the
development, maintenance, and use of digital atlases by
providing and maintaining a framework and/or infrastructure
for digital atlas data sharing.
2. Serve in a guidance and organizational role for larger
collaborative mouse and rat projects.
In addition, this group also made the following more general
recommendations, which may be implemented in conjunction
with or separate from the two above.
·
Promoting International Recommendations and Stan-
dards (following examples of successful international
standards bodies such as Open Geospatial Consortium
http://www.opengeospatial.org/
)
-Make data sharing, standard API, and atlasing soft-
ware tool recommendations standard practice for
code robustness, support for multiple interfaces, and
better usability.
-Best practices for managing and evolving the infra-
structure in an extensible and scalable manner.
-Gather and disseminate best practices learned across
fields in a digestible format.
·
Resources for Domain Experts:
-Generate and garner financial resources for interna-
tional nodes to build infrastructure and tools.
-Aid in recruiting competent technical help for groups
that wish to share their resources.
·
Facilitate collaboration:
-Facilitate communication between tool builders and
users.
-Identify groups for collaboration in areas of ex-
panse.
-Identify groups for collaboration in other but related
fields.
-Identify relevant technologies from other fields and
parties that may aid in transferring the technology.
-Provide connectivity with people to help facilitate
this collaboration.
5.1 Infrastructure for Data Sharing
The workshop committee recommends that the INCF support a
vision of centralized standardization and infrastructure related
to the development and maintenance of digital rat and mouse
atlases. Rodent models are ideal for developing this infrastruc-
ture, as they are model organisms for human disorders, geneti-
cally defined, and may yield a wealth of diverse data types.
Also, there are many existing atlases from which to generalize
a standard. The ability to pull together these types of information,
both within and across species, greatly facilitates our ability
to understand disease mechanisms and potential therapies.
As the infrastructure for sharing is developed, it is expected
that it will be extensible to other species and data types.
This proposed environment should facilitate data exchange,
content discovery, and open interchange with existing data sets
and formats. This recommendation would take the form of a
potential set of standards for data sets and methods related to
digital atlasing including:
·
Ontological and format proposals for a set of archival
mouse and rat related neuroinformatics datasets with re-
lated metadata. These are the "canonical" datasets that
provide a state of the art of rodent informatics imaging
and methods.
·
Description of the access methods for registration, map-
ping, and testing of the canonical datasets with a new test
dataset.
·
Use of a controlled open source environment with re-
spect to archiving, access, and updating of the canonical
datasets and methods.
·
The ability to propose or upload other datasets for con-
sideration in the canonical set.
·
Reporting on performance and testing of the archival ca-
nonical datasets.
5.1.1 Identification of Canonical Datasets
The set of atlases used as the backbone of this project would
likely include a set of digital atlases currently used by the com-
munity or any new high-quality datasets generated with this
purpose in mind. These core datasets provide a new INCF
standard for mapping and registration, whereas transforms en-
abling mapping of existing atlases into this new standard fa-
cilitate connectivity with legacy atlases and applications. The
standard serves as the benchmark for future mapping, registra-
tion, and annotation efforts. Recommendations by this group
for the initial set of canonical datasets and atlases:
Species/strain:
·
Inbred strains of mice: primary = C57BL/6J, secondary
strains include 129S1/SvImJ, DBA/2J or other of the 1
sequenced strains.
·
Inbred rat strains including the sequenced Brown Nor-
way BN strain and/or more widely used strains such as
albino Sprague-Dawley.
Datasets: Provide the fundamental image datasets to which
subsequent image registration and annotation would be refer-
enced. Ideally this set should include:
·
A multiple scan averaged high contrast 3D MRI volume
(at least 0 micron resolution).
·
A histology dataset, most likely a Nissl stain (at least 0
micron isotropic resolution, 1 micron in plane) that is reconstructed
into a 3D image volume.
·
Angiographic datasets would be collected in order to map
the vascular system of the brain. This would include:
-An MR angiographic dataset (at least 100 micron isotropic).
-A µCT corrosion cast to visualize vessels down to
10-0 micrometers in diameter.
The registered datasets would be used as the default standard
for digital atlas applications and their comparison. Preferably,
these would be from the same animal with an in-situ µCT of
the skull for the lambda and bregma references.
Age: Mouse within the range of 8-10 weeks; Rat within the
range of -1 weeks approximately 0 g.
·
Sex: 1 male and 1 female of each species/strain, dataset,
and age.
·
Fully delineated in a manner that enables the translation
of known anatomy to significant structural depth (500+
structures). This should include registration mapping
transforms from other widely used annotated atlases.
The inclusion of other types of canonical datasets over time
might include:
·
Other species and strains
·
Different ages including embryonic stages
·
Other annotations
·
Marker genes
·
Neural connections
·
The spinal cord and the eye
The group also expressed a strong desire to coordinate the assembly
of a new annotated canonical atlas based on the same animal
scanned with CT (for identification of bregma), MRI, and
histology.
There should be a curation process and methodology for bring-
ing in any new canonical atlases established by the INCF that
includes domain experts. Ideally, as infrastructure is built,
tools would be offered that make it easier for people to com-
ment on existing annotation in these atlases, or add their own.
5.1.2 Linking of Canonical Atlases
In order to use digital rodent atlases as the backbone for a neu-
roinformatics framework they need to be linked in a manner
that will allow tools to navigate across different atlases and
the data associated with them. The following are suggested
methods for connecting these data:
Semantic mapping
·
A consistent anatomic structure naming convention
should be adopted, which includes both structure names
and abbreviations (as GO has done for gene names). This
has been a significant issue in the development of neuro-
anatomic atlases to date.
·
Structure names/Ontologies need to be mapped across at-
lases so it is possible to cross different labeling conven-
tions.
·
There needs to be a solid mapping between the rat and
the mouse. It is likely this mapping will be semantic first,
and spatial over time.
·
It was suggested that, as the group of canonical atlases is
built, common features across atlases be identified. These
would serve as the core reference atlas: a set of unambiguously
defined features and an ontology that most scientists
would agree with regardless of their preferences for
how the brain should be delineated or named. These may
be used to cross species.
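One way to picture the semantic mapping suggested above is a crosswalk that routes every atlas-specific label through a shared canonical identifier. The sketch below is illustrative only: the atlas names, abbreviations, and identifier scheme are invented placeholders, not entries from any published nomenclature, and the TermCrosswalk class is hypothetical.

```python
from typing import Dict, Optional, Tuple


class TermCrosswalk:
    """Maps atlas-specific structure labels to shared canonical identifiers."""

    def __init__(self) -> None:
        self._to_canonical: Dict[Tuple[str, str], str] = {}
        self._from_canonical: Dict[Tuple[str, str], str] = {}

    def add(self, atlas: str, label: str, canonical_id: str) -> None:
        """Record that `label` in `atlas` denotes the canonical structure."""
        self._to_canonical[(atlas, label)] = canonical_id
        self._from_canonical[(atlas, canonical_id)] = label

    def translate(self, label: str, source: str, target: str) -> Optional[str]:
        """Return the target atlas label for a source label, or None if unmapped."""
        canonical_id = self._to_canonical.get((source, label))
        if canonical_id is None:
            return None
        return self._from_canonical.get((target, canonical_id))


# Placeholder vocabulary: every name and abbreviation here is made up.
crosswalk = TermCrosswalk()
crosswalk.add("atlas_a", "Amy", "struct:0001")
crosswalk.add("atlas_b", "AMG", "struct:0001")
crosswalk.add("atlas_a", "Hip", "struct:0002")
```

Routing through a canonical identifier means each new atlas needs one mapping to the core set, not a separate mapping to every other atlas. Returning None for unmapped terms keeps gaps in the crosswalk visible instead of guessing.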
Spatial mapping
·
Registered to the same space, or explicitly defined co-
ordinate framework (such as stereotaxic coordinates)
which may also be used to define the framework for new
data
·
Spatially linked using an infrastructure that allows tools
to translate from one atlas to another atlas
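The second option, translating coordinates from one atlas to another, can be sketched by registering each atlas frame against a shared reference space. The example assumes the simplest possible case, in which each atlas differs from the reference only by a per-axis voxel size and an origin offset; real atlas-to-atlas mappings would generally need full affine or nonlinear transforms. AxisAlignedTransform and AtlasFrames are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class AxisAlignedTransform:
    """Voxel index -> reference coordinates: scale per axis, then offset."""
    scale: Vec3   # assumed mm per voxel along each axis, must be non-zero
    offset: Vec3  # assumed reference-space position of voxel (0, 0, 0)

    def __post_init__(self) -> None:
        if any(s == 0 for s in self.scale):
            raise ValueError("scale components must be non-zero")

    def to_reference(self, p: Vec3) -> Vec3:
        return tuple(s * v + o for s, v, o in zip(self.scale, p, self.offset))

    def from_reference(self, p: Vec3) -> Vec3:
        return tuple((v - o) / s for s, v, o in zip(self.scale, p, self.offset))


class AtlasFrames:
    """Registry of atlas coordinate frames that share one reference space."""

    def __init__(self) -> None:
        self._frames: Dict[str, AxisAlignedTransform] = {}

    def register(self, atlas: str, transform: AxisAlignedTransform) -> None:
        self._frames[atlas] = transform

    def convert(self, point: Vec3, source: str, target: str) -> Vec3:
        """Translate a point from the source atlas frame to the target frame."""
        in_reference = self._frames[source].to_reference(point)
        return self._frames[target].from_reference(in_reference)
```

Because every conversion passes through the reference space, adding a new atlas requires a single transform to that space, which mirrors the proposal that canonical datasets serve as the common standard.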
5.1.3 Web-accessible Platform for Access to the Canonical Atlases
The proposed platform would provide a framework and bench-
mark for the evaluation of new digital atlases, datasets, and
methods. Offering maintained and refereed open source datas-
ets and registration methods to these atlases would allow users
to:
·
Test and map novel mouse and rat datasets against ca-
nonical atlases of their choosing
·
Improve existing and implement novel algorithms for
mapping and translation between data sets
Sharing data within this framework requires data providers to
upload their data using semantic and/or spatial mapping and
may include whole brain, "chunks" of the brain, and several
different data types. Thus, simple and accessible methods to
enable this mapping must be offered. A specific architectural
review of this proposed site and process should be performed
to ensure optimal design and implementation strategies as well
as its relationship to existing repositories/archives and datasets.
The primary role of the INCF in this regard would be to set the
policy and review recommendations for contributed material.
Specific recommendations by this working group include:
·
This platform would provide a test bed and archival stor-
age system for registration methods against the canonical
datasets.
·
An architecture and site design review should be held for
the housing of canonical datasets and mapping require-
ments.
·
INCF would facilitate collection of registration methods
and offer open access to their functionality by working
with domain experts. Ideally, this will be offered through
a simple web-accessible platform that helps step a user
through the registration process in as close to an auto-
matic process as is possible and practical. At the end of
this process, the user must have the ability to automati-
cally "publish" their dataset.
·
Even with help, this group understands that registration
is still a complex issue, and it may be necessary for INCF
to assist people with registration (even if on a fee-based
basis).
·
There should be a standard of usability and a curation
process and methodology for bringing in any new regis-
tration processes into the infrastructure that includes im-
age registration domain experts.
5.1.4 Queryable Interfaces
It is vital to this project that data uploaded in this framework
be stored in a manner that enables easy access, rapid content
discovery, and comparison. It is advisable that the design and
development of this interface be performed in concert with the
atlas data upload aspect of the project.
·
The databases (either new or existing; see section below)
housing either the atlases or the newly uploaded data
should expose their data through a call signature or query
syntax based on a common data model for that data type
(API and/or web services).
·
Standard APIs will be needed for three levels of interop-
erability:
-To retrieve different types of data from atlas servers
-To query atlas metadata catalogs
-To exchange information between atlas applications
·
These common data model-based APIs and web-services
should be defined by the experts storing and using that
particular type of data (e.g. MAGE for microarray gene
expression data, see references below). Since no standard
has yet been adopted by that community, it is recommended
that a global working group be created to develop one for
that data type. This should include neuroscientists,
database developers, query tool builders, and people with
experience developing such standards in other fields.
·
Ideally the INCF would also provide a simple visual aid
for web-query of this data. More specifics for the design
of the query tools will also depend on the outcome of the
survey of current software tools and practices.
·
The issue of allowing users to create their own atlas de-
lineations/annotations is complex and should be consid-
ered, although this would have to be implemented in a
controlled manner. A primary goal of the infrastructure
is to allow researchers flexibility while still operating
within a set of standards that facilitate collaboration and
sharing.
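The three levels of interoperability listed above can be sketched as a single in-memory service. This is only an illustration under stated assumptions: AtlasRecord, AtlasServer, and all method and field names are hypothetical, and a real deployment would sit behind web-services defined by the relevant working group.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class AtlasRecord:
    """Hypothetical common-data-model record for one dataset on an atlas server."""
    record_id: str
    data_type: str            # e.g. "gene_expression" or "mri"
    species: str
    metadata: Dict[str, str] = field(default_factory=dict)


class AtlasServer:
    """In-memory stand-in for an atlas server exposing the three API levels."""

    def __init__(self, records: List[AtlasRecord]) -> None:
        self._records = {r.record_id: r for r in records}

    # Level 1: retrieve data of a given type from the server.
    def get_data(self, data_type: str) -> List[AtlasRecord]:
        return [r for r in self._records.values() if r.data_type == data_type]

    # Level 2: query the metadata catalog with exact-match key/value filters.
    def query_metadata(self, **filters: str) -> List[str]:
        return sorted(
            r.record_id for r in self._records.values()
            if all(r.metadata.get(k) == v for k, v in filters.items())
        )

    # Level 3: exchange information with another atlas application.
    def export_record(self, record_id: str) -> Dict[str, Any]:
        r = self._records[record_id]
        return {"id": r.record_id, "type": r.data_type,
                "species": r.species, "metadata": dict(r.metadata)}

    def import_record(self, payload: Dict[str, Any]) -> None:
        record = AtlasRecord(payload["id"], payload["type"],
                             payload["species"], dict(payload["metadata"]))
        self._records[record.record_id] = record


# Illustrative records only.
server_a = AtlasServer([
    AtlasRecord("r1", "gene_expression", "rat", {"region": "hippocampus"}),
])
server_b = AtlasServer([])
server_b.import_record(server_a.export_record("r1"))
```

Keeping the exchange payload a plain dictionary is a deliberate simplification: it mirrors what a serialized web-service message might carry without committing to any particular wire format.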
5.1.5 Existing Databases
There are a wide variety of related and supporting databases
and information sources that are presently available and
pertinent to this effort. The section on queryable interfaces
above discusses how common data model based APIs and
web-services will be key tools for operators of databases to
make their own resources available to the community query
tools. Over the long-term, with
the development of the appropriate resources to facilitate data
sharing, different data modalities may be shared through this
environment such as electrophysiology, cell imaging, fMRI
and disease phenotypes. To support these data sharing efforts,
there must be tools, methods, and tutorials offered to the public
that allow existing databases to link to this infrastructure.
The INCF should play a key role in this process. As these
standards and methods are developed, INCF should make
recommendations, tools, and tutorials available that show how
database owners can share their databases with this
infrastructure so that they are accessible for query.
Appropriate technical and domain experts sponsored by the
INCF may work with individual laboratories to implement the
steps needed to share their data within this framework.
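One way an existing database might be linked to such an infrastructure is through a thin adapter that translates its local field names into shared ones. The sketch below assumes a hypothetical field mapping (FIELD_MAP) and hypothetical function names; the rows are illustrative and not drawn from any real resource.

```python
from typing import Dict, Iterable, List

# Hypothetical mapping from one laboratory's column names to shared field names.
FIELD_MAP = {"animal": "species", "area": "brain_region", "gene_symbol": "gene"}


def to_common_model(row: Dict[str, str],
                    field_map: Dict[str, str]) -> Dict[str, str]:
    """Rename the fields listed in field_map and drop fields not covered by it."""
    return {common: row[local]
            for local, common in field_map.items() if local in row}


def query(rows: Iterable[Dict[str, str]], field_map: Dict[str, str],
          **criteria: str) -> List[Dict[str, str]]:
    """Answer a query phrased in common field names against local rows."""
    translated = (to_common_model(row, field_map) for row in rows)
    return [rec for rec in translated
            if all(rec.get(k) == v for k, v in criteria.items())]


# Illustrative local rows; "lab_note" has no shared equivalent and is dropped.
local_rows = [
    {"animal": "rat", "area": "CA1", "gene_symbol": "Gad1", "lab_note": "batch 3"},
    {"animal": "mouse", "area": "CA1", "gene_symbol": "Gad1"},
]
hits = query(local_rows, FIELD_MAP, species="rat")
```

With this approach the laboratory keeps its own schema, and only the mapping table needs to be maintained as the shared standards evolve.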
5.1.6 Software Tools
This proposed infrastructure requires a toolset that facilitates
data upload, data registration, data query, and interoperable
resources. As there are already many tools available in the
community in these areas, it is possible that tool providers will
be interested in participating in this infrastructure, and that much
existing infrastructure can be leveraged. An INCF-approved
study group would develop a plan for this framework based on
a survey of what is currently being developed in this and other
relevant fields. It is also essential that software developed for
this infrastructure be open source and accessible for others to
use and modify for their own needs.
5.1.7 Additional INCF Roles
While this is an ambitious project, this working group sees
several unique roles the INCF may play in facilitating the
development of this infrastructure:
·
Information gathering:
-Survey neurobiologists who would use an infrastruc-
ture such as this about what data types should be the
first areas of focus and what queries and integration
scenarios they envision
-Survey available atlases and the role each may play
in this architecture
-Based on these surveys, determine the canonical data
types and atlases and possibly "canonical" databases
or data sources
-Potential survey methods include examination of
current resources and users per week, citation counts,
Google Scholar benchmarks, mailing list feedback,
and informal polling (asking experts and users, and
proposing alternatives)
·
Oversee working groups involved in:
-Developing standards for data sharing in the areas
outlined above
-Drawing up a technical specifications document that
outlines how to implement this infrastructure and
process
-Working with international groups to determine
standards for facilitation of integration and
implementation
·
Develop database resources and ensure longevity and ar-
chival integrity of these resources
·
Provide resources to aid in programming within the in-
ternational nodes
5.2 Large Data Collection Project
Another role that the INCF is uniquely poised to play is as the
parent organization for large data collection projects, similar
to HUGO's role for the human genome project. This might
include communication with funding sources, lobbying, and
potentially housing of the database. The INCF would act as a
critical guide and overseer for large-scale, resource-intensive
projects for which the value to neurobiology is high but for
which there are concomitant risks.
As a project develops, the INCF could orchestrate planning
groups that would develop concrete road maps and milestone
checkpoints. As components of the project are identified,
INCF would aid in recruitment of the expertise needed to fill
the various roles in the project.
5.2.1 Example of a Representative Project
Recommendation for a Spatial Atlas of Gene Expression in
the Rat Nervous System (as outlined by Gregor Eichele)
The rat is a widely used mammalian model system for studying
a broad spectrum of physiological, neurophysiological
and behavioral questions. However, much of this research is
being adapted to the mouse, chiefly because of the relative ease
with which genes can be manipulated in the mouse. Much effort
goes into adapting a wide spectrum of assays from larger
mammals to the mouse, but there are significant natural limi-
tations to this such as tissue amounts, continuous monitoring
of physiological parameters, and an intrinsic limitation of the
mouse as a "smart" species. Instead of retooling the highly
developed field of physiology, it seems more practical and less
costly to re-determine the comparable genomic information in
the rat. A first step was taken with the recent completion
of the rat genome sequencing project. Other techniques
such as RNAi knock-down may make species such as the rat
genetically tractable. Here we propose the production of a gene
expression atlas of the rat nervous system.
The current proposal is for a novel, high-resolution,
spatially mapped in situ hybridization atlas of the rat genome.
The availability of key data modalities, such as the wealth of
electrophysiological measurements in the rat, makes that organism
the natural choice for a project that links genetics, anatomy,
and electrophysiology. In particular, the
wealth of cell type specific electrophysiology in the rat such as
the characteristics of ion channels will need to be coupled with
gene expression data to untangle the computational complex-
ity of the brain. Based on existing neurobiological resources,
the genes of the rat genome can be prioritized so as to enable
maximal coverage of significant expression patterns and cell
types. Recent developments in neuroinformatics can be uti-
lized to obtain maximal spatial resolution and image mapping
accuracy.
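The prioritization step described above could be approached in many ways. As one hedged illustration, the sketch below uses a greedy set-cover heuristic, which is a technique chosen here for illustration and is not a method named in the proposal. The function name prioritize_genes, the budget parameter, and the gene and feature labels are all placeholders.

```python
from typing import Dict, List, Set


def prioritize_genes(gene_to_features: Dict[str, Set[str]],
                     budget: int) -> List[str]:
    """Greedily pick up to `budget` genes, each adding the most uncovered features."""
    covered: Set[str] = set()
    chosen: List[str] = []
    remaining = dict(gene_to_features)
    while remaining and len(chosen) < budget:
        # Sort names so that ties are broken deterministically (alphabetically).
        best = max(sorted(remaining), key=lambda g: len(remaining[g] - covered))
        if not remaining[best] - covered:
            break  # no remaining gene adds new coverage
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


# Placeholder gene names and feature labels, not real expression data.
example = {
    "geneA": {"cortex", "hippocampus", "interneuron"},
    "geneB": {"cortex", "cerebellum"},
    "geneC": {"hippocampus"},
    "geneD": {"cerebellum", "purkinje_cell"},
}
order = prioritize_genes(example, budget=3)
```

In this toy input the heuristic stops after two genes because the remaining candidates add no new regions or cell types, which shows how prioritization might reduce the number of genes that need to be processed first.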
5.2.2 Advantages of this Project
1. This is chiefly an engineering project that can be
subdivided into clearly identifiable deliverables and milestones.
2. It is a highly interdisciplinary project covering
experimental biology, high throughput lab methods, imaging
and image processing, database technology and
anatomical annotation.
3. Many of the goals of this project have been attained or
addressed in a preliminary way in the mouse via the
Allen Brain Atlas (ABA).
4. Several novel experimental and computational strategies
not available at the time when the ABA was generated
can be implemented.
5. There would be enormous benefit if this could become
a multinational project similar to the human genome
sequencing project.
6. Cost and time frame are predictable and finite.
7. The project uses in part existing resources but new ones
could be added (e.g. a new interactive web database
housing the expression patterns).
8. Genes can be prioritized by importance based on what is
known from the ABA.
9. The price tag is in the range of 50 million .
5.2.3 Implementation
In order to implement this project, planning groups represent-
ing the relevant components of the scientific community would
need to be established. These groups would then draft the re-
quired concept papers and road maps. The INCF could poten-
tially help in the organization and recruitment of appropriate
parties. Example working groups include:
·
Rat genomics experts to annotate genes and design tem-
plates for riboprobes
·
Technology for 2nd generation tissue sectioning enabling
improved section registration (e.g. high-throughput block
face imaging)
·
Logistics of tissue collection (stages of development,
strain)
·
Logistics of large-scale template and probe generation
·
Data collection (in situ hybridization and image capture,
multiplexing for genes allowing detection of multiple
transcripts per tissue section)
·
Data transfer mechanisms, data processing and data-
bases
·
Computer-assisted annotation of expression by experts
5.2.4 Timeline
·
00: Set up expert teams that will develop key points in
sufficient detail (technically as well as financially) and
generate by the end of 00 a "road map." Have identi-
fied potential contributors as well as funding sources.
·
00/00: Development phase for tools and adaptation
of tools. This requires some pilot funding but is not yet a
data production phase.
·
010-01: Data production phase, with establishment
of database and annotation. This would be the more
expensive part of the project. Gene prioritization may reduce
costs.
5.3 Other Potential Proposals