International Tunicate Research Community Plans Integrated Database System
Posted by Ken Hastings, on 14 April 2011
Meeting Report: First Tunicate Information System Meeting, Nice, France, November 11-13, 2010
Ken Hastings
Montreal Neurological Institute and Biology Department
McGill University
Approximately 50 scientists, including members of the international tunicate research community and representatives of major bioinformatics databases, gathered in Nice, France, November 11-13, 2010, to consider the future development of tunicate informatics. This meeting, termed the First Tunicate Information System Meeting, was the inaugural meeting in what is expected to be a regular series devoted to this subject.
Some meeting participants on the beach at Nice, November 13, 2010
Tunicates – a diverse group consisting of ascidians, thaliaceans and larvaceans – are research model organisms that share with vertebrates a common ancestry [1] that is reflected in a common basic chordate body plan [2,3,4]. Compared with vertebrates, development is stereotyped and much more rapid, with far fewer cells [5,6], and is driven by a smaller genome [7,8], all of which make tunicates an ideal experimental system for multiscale molecule-to-cell-to-organism investigation of chordate development [3,9,10,11].
Evolutionarily, tunicates are an extremely successful and diverse group, and this diversity is a further asset in their application as research model organisms. In some lineages, different species show very similar development and morphology despite having genomes that have diverged greatly at the level of nucleotide sequence [12]. This provides great opportunities for comparative understanding of how genomic information is biologically processed during development. In addition, different tunicate lineages exhibit a remarkable range of lifestyles, adult morphologies, and biological features, such as the extreme genome reduction and short life cycle of the larvaceans [8] and the amazing regenerative capabilities of the ascidians [13,14]. This is especially true of the colonial ascidians, which can generate identical adult forms either through gametogenesis/fertilization/larval development/metamorphosis, or by asexual direct development by budding or by regeneration from cells of the vascular system [13,15,16]. This great biological diversity promises insight into a wide range of fundamental biological mechanisms. Coupled with the solid platform provided by the great depth of existing molecular, cellular, and gene regulatory data for the intensively studied solitary ascidian Ciona intestinalis and for the larvacean Oikopleura dioica, it makes tunicates a very attractive group in which to develop an integrated database system. Given the recent explosion of capabilities for genome and transcriptome sequencing, now being applied in ongoing projects for several solitary (Halocynthia roretzi, Phallusia mammillata) and colonial (Botryllus schlosseri, Didemnum vexillum) ascidian species, this was the right moment for tunicate researchers, a collegial and interactive worldwide community, to initiate an ambitious plan for a multi-species, multi-class, multi-system, and multi-scale database organization.
The meeting, co-organized by Patrick Lemaire (France), Kazuo Inaba, Yutaka Satou, Toshinori Endo, Kohji Hotta (Japan), and Tony de Tomaso (USA) with funding from AVIESAN and DOPAMINET, drew together tunicate researchers from Europe, North America, Israel, and Japan, and informatics experts from Ensembl UK (Ewan Birney, Fiona Cunningham, Daniel Sobral), DDBJ, NIG Japan (Kazuho Ikeo), NIAIST Japan (Tadashi Imanishi), and Chado USA (Joshua Orvis) (see participant list on Meeting website). In the weeks before the meeting, eight working groups carried out email surveys of tunicate researchers to probe their hopes and expectations for a community database. At the meeting, each working group presented its survey results, which were then discussed in plenary session. In addition, overview presentations of current capabilities and future prospects of the major existing Ciona intestinalis databases were given by Satou (Ghost [17]), Lemaire (ANISEED [11]), Endo and Inaba (CiPro [18]), Hotta (FABA [19]), and Takehiro Kusakabe (DBTGR [20]). Additional presentations by Birney on the relationships of “community” databases to the general databases (e.g. Ensembl), by Cunningham on Ensembl informatics pipelines, by Imanishi on automatic maintenance of hypertext cross-links between databases, and by Orvis on the Chado system of data architecture provided perspective and insight into the organization and methods of large-scale bioinformatics efforts. Participants broke into roundtable discussion groups to define informatics objectives in the areas of Gene Expression/Transcription, Phenotypes/Anatomy, and Proteins/Cell Biology, and discussion outcomes were reported back for further comment in plenary session.
From these proceedings there emerged consensus on standards and priorities on a wide range of issues, and an overall plan for the rationalization and improvement of existing databases in a setting that would foster the emergence of a comprehensive “Tunicate Community Database” into which existing and future data could be functionally integrated (additional details in Meeting summary conclusions). Single-species and themed interest databases currently maintained by individual laboratories were deemed extremely useful, and it was agreed that they should be further developed. As a first step in their coordinated development, overlap and duplication among the various existing Ciona databases will be reduced. Ghost will retain its focus on the gene and genome and will maintain the principal community genome browser, with annotation support from CiPro; CiPro will focus on the level of the cell and its constituents; and ANISEED and FABA will concentrate on multicellularity/development/morphology. Looking forward, it was thought vital that there be uniform standards of data vocabulary and architecture to permit integration with the Tunicate Community Database, which will serve as a central access point. The Chado data architecture and a wide range of specific standards were adopted. An additional important aspect of the plan is the development of a single reference data store, a “Tunicate Data Repository”, with which the various individual “client” databases would be synchronized to ensure a common set of basic data.
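The idea of synchronizing “client” databases against a single reference store can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the meeting report does not specify a synchronization mechanism, and the `Store`, `Record`, and `synchronize` names, the version counters, and the record identifiers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    version: int  # hypothetical revision counter for a basic data item
    data: dict


@dataclass
class Store:
    name: str
    records: dict = field(default_factory=dict)  # record ID -> Record


def synchronize(repository: Store, client: Store) -> list:
    """Copy into `client` every repository record that is missing or outdated there."""
    updated = []
    for record_id, ref in repository.records.items():
        local = client.records.get(record_id)
        if local is None or local.version < ref.version:
            # Copy the data so later client-side edits do not touch the repository.
            client.records[record_id] = Record(ref.version, dict(ref.data))
            updated.append(record_id)
    return sorted(updated)


# Hypothetical identifiers and contents, for illustration only.
repository = Store("Tunicate Data Repository")
repository.records["gene:0001"] = Record(2, {"name": "hypothetical gene 1"})
repository.records["gene:0002"] = Record(1, {"name": "hypothetical gene 2"})

client = Store("client database")
client.records["gene:0001"] = Record(1, {"name": "hypothetical gene 1 (old)"})

changed = synchronize(repository, client)
print(changed)  # ['gene:0001', 'gene:0002']
```

In this sketch the repository is the only source of truth for the shared records, so a client never pushes changes back; whether the planned system would work that way is not stated in the report.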
Additional objectives include the comprehensive incorporation of existing gene expression data and of experimental information regarding cell biology, including experimental protocols, videos, and collections of data on useful tools/reagents. These objectives extend to the published literature, including the difficult-to-access classical literature of past centuries, and also to unpublished information, e.g., reports of negative results. Assembly of some of these data, and further discussion of database issues, are to proceed through wiki-type, community-based input.
In accordance with the general plan for successful community databases outlined by Birney, the Tunicate Community Database would occupy an intermediate level between the individual laboratory databases and the general global databases such as DDBJ, Ensembl, and NCBI. The desire was clearly expressed that the Tunicate Community Database be a real integrator, and not merely a collator, of individual databases. A common data architecture and common controlled vocabularies introduced early can form the basis for joint, rather than parallel, future growth, and can permit true value-added integrative analysis. Such a database system will be ready to serve the expanding need for informatics, and especially integrated informatics, as the number of characterized species increases and the full impact of high-throughput sequence data is felt and exploited.
A philosophical principle governing the future development of the tunicate database system is that it should strive to go beyond the gene-centred approach of current model organism databases and provide an integrated view of development. Because of the stereotyped lineage-based development of tunicates [5,6,10,21], the anatomical aspects of development are readily formalized. Such formalization, expressed in terms of controlled-vocabulary ontologies, permits computers to manipulate and understand data on the basis of a semantic web of defined relationships. Ontologies for gene-based molecular data are well-developed and widely used (e.g. Gene Ontology [22]) but this is not yet the case for developmental biology. The stereotyped development of ascidians should permit the creation of highly precise cellular and anatomical ontologies. The simplicity of tunicate development offers the opportunity to create a novel type of integrated database that may foreshadow future developments in more complex model organisms.
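As a rough illustration of what a “semantic web of defined relationships” makes possible, the following Python sketch stores terms linked by a named relationship and follows that relationship transitively. It is a toy under stated assumptions, not a description of any existing tunicate ontology: the `Ontology` class is hypothetical, the `develops_from` relation name is borrowed from common ontology usage, and the Conklin-style blastomere labels are used only as example terms.

```python
from collections import defaultdict


class Ontology:
    """Tiny controlled vocabulary: terms linked by named, directed relationships."""

    def __init__(self):
        # relation name -> subject term -> set of object terms
        self._edges = defaultdict(lambda: defaultdict(set))

    def add(self, subject, relation, obj):
        """Record the statement `subject relation obj`."""
        self._edges[relation][subject].add(obj)

    def ancestors(self, term, relation):
        """All terms reachable from `term` by following `relation` transitively."""
        seen = set()
        stack = [term]
        while stack:
            current = stack.pop()
            for parent in self._edges[relation].get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Example terms only; not taken from a published ontology.
onto = Ontology()
onto.add("A5.1", "develops_from", "A4.1")
onto.add("A6.2", "develops_from", "A5.1")
onto.add("A7.4", "develops_from", "A6.2")

lineage = onto.ancestors("A7.4", "develops_from")
print(sorted(lineage))  # ['A4.1', 'A5.1', 'A6.2']
```

Because the relationship is explicit, a query such as "all precursors of this cell" needs no knowledge of the naming scheme; data annotated to any term could be gathered along the same path.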
To guide the development of the tunicate database system, two committees are being organized. One, a Scientific Steering Committee (SSC), will be composed of representatives of the major existing Ciona databases and additional members representing the major tunicate groups – solitary ascidians, colonial ascidians, larvaceans, and possibly thaliaceans – selected by researchers working in each of those communities. This committee will provide overall leadership and will coordinate grant applications to fund the development of the system. In addition, a Scientific Advisory Committee, made up of world-class scientists from outside the tunicate community who are experienced in developing/managing large database systems, will advise the SSC on strategic issues. With such leadership, and the enthusiastic participation of the tunicate research community, this enterprise promises to create a most useful research tool whose development and implementation may provide a model for multi-scale, multi-level integrative informatics in other model organism research communities.
References
1. Delsuc F, Brinkmann H, Chourrout D, Philippe H (2006) Tunicates and not cephalochordates are the closest living relatives of vertebrates. Nature 439: 965-968.
2. Katz MJ (1983) Comparative anatomy of the tunicate tadpole, Ciona intestinalis. Biological Bulletin 164: 1-27.
3. Satoh N, Satou Y, Davidson B, Levine M (2003) Ciona intestinalis: an emerging model for whole-genome analyses. Trends Genet 19: 376-381.
4. Passamaneck YJ, Di Gregorio A (2005) Ciona intestinalis: chordate development made simple. Dev Dyn 233: 1-19.
5. Satoh N (1994) Developmental biology of ascidians. Cambridge UK, New York USA: Cambridge University Press.
6. Lemaire P (2009) Unfolding a chordate developmental program, one cell at a time: invariant cell lineages, short-range inductions and evolutionary plasticity in ascidians. Dev Biol 332: 48-60.
7. Dehal P, Satou Y, Campbell RK, Chapman J, Degnan B, et al. (2002) The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins. Science 298: 2157-2167.
8. Denoeud F, Henriet S, Mungpakdee S, Aury JM, Da Silva C, et al. (2010) Plasticity of animal genome architecture unmasked by rapid evolution of a pelagic tunicate. Science 330: 1381-1385.
9. Imai KS, Levine M, Satoh N, Satou Y (2006) Regulatory blueprint for a chordate embryo. Science 312: 1183-1187.
10. Nishida H (2008) Development of the appendicularian Oikopleura dioica: culture, genome, and cell lineages. Dev Growth Differ 50 Suppl 1: S239-256.
11. Tassy O, Dauga D, Daian F, Sobral D, Robin F, et al. (2010) The ANISEED database: digital representation, formalization, and elucidation of a chordate developmental program. Genome Res 20: 1459-1468.
12. Johnson DS, Davidson B, Brown CD, Smith WC, Sidow A (2004) Noncoding regulatory sequences of Ciona exhibit strong correspondence between evolutionary constraint and functional importance. Genome Res 14: 2448-2456.
13. Voskoboynik A, Simon-Blecher N, Soen Y, Rinkevich B, De Tomaso AW, et al. (2007) Striving for normality: whole body regeneration through a series of abnormal generations. FASEB J 21: 1335-1344.
14. Auger H, Sasakura Y, Joly JS, Jeffery WR (2010) Regeneration of oral siphon pigment organs in the ascidian Ciona intestinalis. Dev Biol 339: 374-389.
15. Ballarin L, Menin A, Tallandini L, Matozzo V, Burighel P, et al. (2008) Haemocytes and blastogenetic cycle in the colonial ascidian Botryllus schlosseri: a matter of life and death. Cell Tissue Res 331: 555-564.
16. Brown FD, Keeling EL, Le AD, Swalla BJ (2009) Whole body regeneration in a colonial ascidian, Botrylloides violaceus. J Exp Zool B Mol Dev Evol 312: 885-900.
17. Satou Y, Takatori N, Fujiwara S, Nishikata T, Saiga H, et al. (2002) Ciona intestinalis cDNA projects: expressed sequence tag analyses and gene expression profiles during embryogenesis. Gene 287: 83-96.
18. Endo T, Ueno K, Yonezawa K, Mineta K, Hotta K, et al. (2011) CIPRO 2.5: Ciona intestinalis protein database, a unique integrated repository of large-scale omics data, bioinformatic analyses and curated annotation, with user rating and reviewing functionality. Nucleic Acids Res 39: D807-D814.
19. Hotta K, Mitsuhara K, Takahashi H, Inaba K, Oka K, et al. (2007) A web-based interactive developmental table for the ascidian Ciona intestinalis, including 3D real-image embryo reconstructions: I. From fertilized egg to hatching larva. Dev Dyn 236: 1790-1805.
20. Sierro N, Kusakabe T, Park KJ, Yamashita R, Kinoshita K, et al. (2006) DBTGR: a database of tunicate promoters and their regulatory elements. Nucleic Acids Res 34: D552-555.
21. Sardet C, Paix A, Prodon F, Dru P, Chenevert J (2007) From oocyte to 16-cell stage: cytoplasmic and cortical reorganizations that pattern the ascidian embryo. Dev Dyn 236: 1716-1731.
22. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, et al. (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25: 25-29.
Soon Zfin and Xenbase will have a sister informatics community. Good news for the ascidian researchers! I hope these organism-specific communities will cross-pollinate ideas and approaches.
Yes, Jon, the more-developed informatics communities provide a wealth of ideas and approaches to guide our efforts, and hopefully the particular biological features of the tunicates will propel the development of new database functionalities that may become useful across the board.