Databases and Biobank Information Management Systems

Leaders: Jan-Eric Litton, Steve Walker

Aims and objectives

The long-term aim of this work package is to aid research into the causes of common, complex diseases by arriving at a common strategy that will provide an efficient network of databases for population biobanks and cohorts in Europe. The challenge is to combine information from biobanks and cohorts easily, so that very large sample sizes are generated and researchers receive relevant, high-quality and correctly documented data sets. Current confusion regarding gene-disease associations for a variety of disorders is to a large extent due to the severe lack of statistical power in the empirical studies presently being performed.

The lack of general information management systems for biobanks is a bottleneck for processing the increasingly complex data structures encountered in modern clinical and epidemiological research. The population cohorts that will participate in this initiative have developed, or are in the process of developing, diverse technologies and logistics to keep track of study subjects, environmental exposures, phenotypes, biological samples and analytic outcomes such as genotypes and gene expression profiles. The first objective in this work package is to arrive at a consensus on the requirements for a general information management system for biobanks.

The second objective is to agree on unique identities for biobanks and on secure identities for the subjects from whom samples are taken. The basic unit that requires an identity, however, is the single specimen, which may in turn be divided into aliquots that can be widely exchanged between laboratories and even pooled for some analytic purposes. For each aliquot a number of analytic results will be reported, and this information, including meta-information on the validity and use of the laboratory results, must be easily traceable in the databases. Thus, samples and results must be localisable in a hierarchical system based on an agreed identity structure.
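As an illustration only, the hierarchy of biobank, subject, specimen and aliquot could be sketched as a structured identifier. The slash-separated format, the class name `SampleId` and the helper functions below are assumptions made for this sketch; the work package has not agreed on any identity structure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SampleId:
    """Hypothetical hierarchical identifier: biobank > subject > specimen > aliquot."""
    biobank: str
    subject: str                     # secure (pseudonymous) code, not a personal identifier
    specimen: Optional[str] = None
    aliquot: Optional[str] = None

    def __post_init__(self):
        if self.aliquot is not None and self.specimen is None:
            raise ValueError("an aliquot must belong to a specimen")

    def __str__(self) -> str:
        parts = [self.biobank, self.subject, self.specimen, self.aliquot]
        return "/".join(p for p in parts if p is not None)

    @classmethod
    def parse(cls, text: str) -> "SampleId":
        parts = text.split("/")
        if not 2 <= len(parts) <= 4 or not all(parts):
            raise ValueError(f"malformed identifier: {text!r}")
        return cls(*parts)

    def parent(self) -> Optional["SampleId"]:
        """Return the next level up in the hierarchy, or None at subject level."""
        if self.aliquot is not None:
            return SampleId(self.biobank, self.subject, self.specimen)
        if self.specimen is not None:
            return SampleId(self.biobank, self.subject)
        return None


def record_result(store: dict, aliquot_id: SampleId, assay: str, value, validated: bool) -> None:
    """Attach an analytic result, plus minimal meta-information, to one aliquot."""
    if aliquot_id.aliquot is None:
        raise ValueError("results are recorded per aliquot")
    store.setdefault(str(aliquot_id), []).append(
        {"assay": assay, "value": value, "validated": validated}
    )


def results_under(store: dict, node: SampleId) -> list:
    """Collect every result recorded at or below a node of the hierarchy."""
    prefix = str(node)
    return [
        rec
        for key, recs in store.items()
        if key == prefix or key.startswith(prefix + "/")
        for rec in recs
    ]
```

Because every identifier carries its full path, a result recorded for one aliquot can be traced back to its specimen, subject and biobank without a separate lookup table. A real system would also need to handle pooled aliquots, which have more than one parent and do not fit a strict tree.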
Transactions and permissions for use must also be built into the information structure.

The power of biobank-based research will increase enormously if multiple biobanks are interconnected to enable sharing of information and samples. The third objective is therefore to arrive at a complete strategy for biobank communication, including a common information model. The final requirement for biobank communication is a well-defined protocol for transmitting information between biobanks. This will necessarily incorporate a central biobank registry. Each biobank joining the communication network will use well-defined methods to register meta-information about its contents in terms of samples and information. At a later stage, when a biobank wants to perform a specific search, it sends a request to the central biobank registry. Based on this request, the registry replies with information describing the biobanks that hold such information. Finally, the requesting biobank sends the actual query to these biobanks in order to get results.

To achieve these communication strategies, one may use techniques such as Web services. This is a highly standardised platform based on the SOAP protocol, which in turn is based on the widely used XML standard. Web services provide a well-suited platform for communication between biobanks since they are built entirely on open standards and are not tied to any particular vendor or programming language. Thus, they will enable biobanks with completely different IT infrastructures to communicate using a common protocol.
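The three-step exchange described above (register meta-information, ask the registry which biobanks are relevant, then query those biobanks directly) can be sketched in a few lines. This is a minimal in-process model of the message flow only: the class names, method names and record layout are hypothetical, and the SOAP/XML transport that a Web services implementation would use is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Biobank:
    """Hypothetical biobank node holding records as attribute -> value dicts."""
    name: str
    records: list = field(default_factory=list)

    def describe(self) -> set:
        """Meta-information only: which attributes are held, no record-level data."""
        return {key for rec in self.records for key in rec}

    def query(self, criteria: dict) -> list:
        """Answer an actual query with the matching records."""
        return [
            rec for rec in self.records
            if all(rec.get(k) == v for k, v in criteria.items())
        ]


class CentralRegistry:
    """Hypothetical central registry that stores meta-information per biobank."""

    def __init__(self):
        self._catalogue = {}  # biobank name -> (biobank, attributes it holds)

    def register(self, biobank: Biobank) -> None:
        # Step 1: a joining biobank registers meta-information about its contents.
        self._catalogue[biobank.name] = (biobank, biobank.describe())

    def locate(self, attributes) -> list:
        # Step 2: reply with the biobanks that hold all requested attributes.
        wanted = set(attributes)
        return [bb for bb, attrs in self._catalogue.values() if wanted <= attrs]


def federated_search(registry: CentralRegistry, requester: Biobank, criteria: dict) -> dict:
    """Step 3: send the actual query only to the biobanks the registry named."""
    hits = {}
    for bb in registry.locate(criteria.keys()):
        if bb is requester:
            continue  # the requester already knows its own holdings
        hits[bb.name] = bb.query(criteria)
    return hits
```

In this sketch the registry never sees record-level data, only the attribute names each biobank declares. That mirrors the separation in the text between registered meta-information and the query that is later sent to the individual biobanks.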

 

The primary aims of WP3 will be:

To convene an expert group as one part of the Initial PHOEBE Conference. At the conference, this group will consider which additional experts should be invited to take part in the work, and will agree upon a strategy for ongoing development of the scientific aims outlined above, within the necessarily limited framework of a Coordination Action.

The expert group will then produce a number of reports that will address the key issues underpinning the objectives detailed above: (a) The first task in this work package will be to review existing and planned systems and to arrive at a consensus on the requirements for a general information management system for biobanks (22 months); (b) A second task will be to explore systems for maintaining unique and secure identities for specimens, subjects and biobanks, as well as for keeping track of the handling of permissions for use, analytical results and statistical output. Meta-information on the quality of specimens and phenotypes will be integrated (3 years); and (c) The third task will be to explore a complete strategy for communication between population cohorts, including a common nomenclature, compatible software techniques and appropriate information transmission policies. This all relates to information on specimens, laboratory results, phenotypes, exposures and genealogical data (1-2 years). In commencing the work in relation to (b) and (c), it is unclear whether the reports will conclude with a definitive scientific solution to each problem or with a list of further questions that need to be addressed in order to arrive at these solutions. This is why it is entirely appropriate that this work be undertaken within the framework of a CA, allowing relevant experts to pool their expertise.

The main conclusions of WP3 will be summarised in a final report [bringing together (a), (b) and (c)] which will be presented and discussed at the Concluding Conference on Population Biobanks for Health and then, following appropriate revision, posted on the worldwide web.

 

Deliverables

D 1      Preliminary planning, strategy setting and identification of full expert groups for all work packages at Initial PHOEBE Conference.

           Time: 12 months

D 2      Debriefing, presentation and discussion of final report, and discussion of future strategy for all work packages at the Concluding PHOEBE Conference.

           Time: 35 months

D 3      Report to be posted on worldwide web.

           Time: 36 months

D 18    Review of existing and planned systems, to arrive at a consensus on the requirements for a general information management system for biobanks.

           Time: 20 months

D 19    Final report which will also address format and variable standards, communications standards and transmission policies. 

           Time: 33 months

 

Milestones

The sole "decision point" will occur at the Initial PHOEBE Conference when we will determine the full composition of the expert groups. Time: 6 months

M1      Composition of the expert group. Time: 12 months

