PI: Xing Wang Deng
Plants to be studied:
Rice (Oryza sativa L. ssp. japonica cv. Nipponbare), which was used for genome sequence analysis by the International Rice Genome Sequencing Project.
Project objectives:
1: Development of a workflow that uses Maskless Array Synthesizer (MAS)-produced high-density oligonucleotide tiling microarrays covering the completely sequenced japonica rice chromosome 10. This objective was achieved by the end of the first funding year.
2: Use of MAS-produced high-density oligonucleotide tiling arrays for whole-genome transcription analysis. Through close coordination with the NSF-funded TIGR rice genome annotation effort (PI: R. Buell), we will develop an integrated whole-genome transcription unit map by combining our tiling microarray analysis with all available SAGE, MPSS, EST and full-length cDNA data. The data will be made public via web-accessible databases and a public repository as soon as they become available and have passed quality control. During the last funding year, the chromosome 10 data were incorporated into the TIGR rice genome annotation database and made available to the public. The whole-genome analysis is well under way.
3: Training of students and postdocs in microarray design, array hybridization and associated computational analysis, genome annotation and transcription profiling. Training of undergraduate students recruited from under-represented groups will be especially emphasized.
4: Educational outreach to primary and secondary students and the general public in New Haven and other areas of Connecticut through exhibits and presentations at the Peabody Museum at Yale University. We welcomed two high school students, Erica Che and Jamila Searchwell, in August 2005.
Experimental approaches:
High-density MAS oligonucleotide arrays tiling the rice genome will be used. Statistical and bioinformatic tools suitable for analyzing the rice tiling array data will be developed. The workflow was first optimized on rice chromosome 10 using tiling arrays at a resolution of about 46 bp. Pooled cDNA targets derived from representative RNA samples were used to maximize transcription unit representation. In the second funding year, we applied the 46-bp-resolution tiling microarray analysis to all remaining chromosomes in the rice genome. The data will be integrated into the current and ongoing genome annotation to test predicted gene models and to define the structures of novel transcription units. In addition, we developed a new array, with accompanying analytic tools, that carries five semi-optimized probes representing each annotated gene model and each new transcription unit identified in our genome tiling analysis.
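As a rough illustration of what tiling at about 46 bp resolution means, the following minimal Python sketch lays probes along a sequence at a fixed step and merges runs of consecutive above-threshold probes into candidate transcribed regions. It is not the project's actual pipeline. The 36-nt probe length, the signal threshold, the minimum run length, and the function names `tile_probes` and `call_regions` are all assumptions made for illustration; only the 46 bp step comes from the text above.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # 0-based, half-open [start, end)


def tile_probes(seq_len: int, probe_len: int = 36, step: int = 46) -> List[Interval]:
    """Return probe intervals whose start positions are `step` bp apart.

    probe_len=36 is an assumed value; step=46 follows the stated resolution.
    """
    if probe_len <= 0 or step <= 0:
        raise ValueError("probe_len and step must be positive")
    return [(s, s + probe_len) for s in range(0, seq_len - probe_len + 1, step)]


def call_regions(
    probes: Sequence[Interval],
    signals: Sequence[float],
    threshold: float,
    min_probes: int = 2,
) -> List[Interval]:
    """Merge runs of >= min_probes consecutive probes with signal >= threshold.

    The threshold and min_probes are hypothetical parameters, not values
    taken from the project.
    """
    if len(probes) != len(signals):
        raise ValueError("probes and signals must have the same length")
    regions: List[Interval] = []
    run_start = None  # index of the first probe in the current run
    for i, sig in enumerate(signals):
        if sig >= threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_probes:
                regions.append((probes[run_start][0], probes[i - 1][1]))
            run_start = None
    # Close a run that extends to the last probe.
    if run_start is not None and len(signals) - run_start >= min_probes:
        regions.append((probes[run_start][0], probes[-1][1]))
    return regions


if __name__ == "__main__":
    probes = tile_probes(200)
    print(probes)
    print(call_regions(probes, [0.1, 0.9, 0.8, 0.2], threshold=0.5))
```

Requiring at least two adjacent positive probes is one simple way to reduce calls driven by a single noisy probe; a real analysis would choose thresholds from the array's own signal distribution.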
Information/Materials to be generated:
The initial analysis of chromosome 10 (Objective 1) detected expression of approximately three quarters of the gene models that lacked previous experimental evidence. Cloning and sequence analysis of the previously unsupported models suggest that the predicted gene structure of nearly half of those models needs improvement. Coupled with comparative gene model mapping, the tiling microarray analysis identified 549 new models for the chromosome, representing an 18% increase in the annotated protein-coding capacity. Furthermore, an asymmetric distribution of genome elements along the chromosome was found that coincides with the cytological definition of the heterochromatin and euchromatin domains. The heterochromatin domain appears to be associated with distinct chromosome-level transcriptional activities under normal and stress conditions. This part of the study has been published, and the relevant data have been deposited in a public database (GSE2500) and on the TIGR web site.
During the second funding year, we also generated a whole genome transcription map of the indica rice subspecies through collaborations with several research groups in
In the second funding year, we expanded our analysis to the whole japonica genome. All the original data, including raw hybridization data as well as gene models, will be made available to the public through a project web site (http://plantgenomics.biology.yale.edu) once they have been quality controlled and verified. The web site went online at the end of the first funding year (August 30, 2005). The full data set will be made available to TIGR and the public by the end of this project (August 2007).