What’s the input dataset for testing NMPP?
The input dataset for testing NMPP (30 arrays, with 390,000 probes on each array) was produced from a custom-designed NimbleGen microarray that was hybridized to Cy3-labeled cDNA samples derived from 10 different major rice tissues and developmental stages, as part of the rice tiling array project. This microarray data provides the first comprehensive developmental map and expression pattern in rice, and reveals the relationship between annotated protein-coding genes and the intergenic transcriptionally active regions (TARs) detected by the rice tiling experiments.
Probe design procedure for Rice gene & TAR microarray.
The gene model dataset came from the TIGRv3 rice genome annotation. After removing transposable-element-related genes and multiple alternative splicing isoforms, a total of 44,385 protein-coding genes and an additional set of 25,313 novel TARs were represented on the array. Probes were designed with the software OligoArray 2.1 using standard selection criteria, as listed below:
- Oligo length: 36
- Maximum distance accepted between the 5' end of the oligo and the 3' end of the input sequence: 2000
- Minimum oligonucleotide Tm: 80
- Maximum oligonucleotide Tm: 95
- Temperature to use during secondary structure prediction (an oligo will be rejected if it can fold into a stable secondary structure at this temperature): 65
- Threshold to report putative cross-hybridizations (all targets hybridizing with a given oligo with a Tm exceeding this threshold are reported): 70
- Minimum oligonucleotide GC content: 40
- Maximum oligonucleotide GC content: 73 for gene models; 77 for TARs
- List of prohibited sequences to mask in the input sequence: Runs of five or more consecutive single nucleotides
- Minimum distance between the 5' end of two adjacent oligos: 1.
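The sequence-level criteria in the list above (oligo length, GC content range, and prohibited homopolymer runs) can be illustrated with a minimal Python sketch. This is not OligoArray 2.1 itself and does not reproduce its Tm, secondary structure, or cross-hybridization calculations; the function names `gc_percent` and `passes_sequence_filters` are hypothetical, and treating a prohibited run as an outright rejection of the candidate oligo is a simplifying assumption.

```python
import re

# Thresholds copied from the parameter list above.
OLIGO_LENGTH = 36
MIN_GC = 40.0
MAX_GC = {"gene": 73.0, "tar": 77.0}  # the upper GC limit differs by target type

# Prohibited sequences: runs of five or more identical nucleotides.
HOMOPOLYMER = re.compile(r"A{5,}|C{5,}|G{5,}|T{5,}")


def gc_percent(seq):
    """Return the GC content of a sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def passes_sequence_filters(seq, target_type="gene"):
    """Check length, homopolymer runs, and GC content only.

    Tm, folding, and cross-hybridization checks are deliberately left out
    (hypothetical helper, not part of OligoArray).
    """
    seq = seq.upper()
    if len(seq) != OLIGO_LENGTH:
        return False
    if HOMOPOLYMER.search(seq):
        return False
    return MIN_GC <= gc_percent(seq) <= MAX_GC[target_type]
```

For example, a 36-mer with 75% GC content would fail the gene-model limit of 73 but pass the TAR limit of 77.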
We also selected 10,000 random probes from the genome sequence and 9,000 probes within the introns of PASA-validated full-length cDNA gene models as a negative control set, used to construct a global background noise distribution.
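One way a negative control set could be turned into a background cutoff is sketched below in Python. This is only an illustration under stated assumptions: the text does not say how the background distribution is used, so the empirical nearest-rank quantile, the 0.95 default, and the names `background_cutoff` and `is_above_background` are all hypothetical and are not taken from NMPP.

```python
import math


def background_cutoff(control_intensities, quantile=0.95):
    """Empirical quantile of negative-control intensities (nearest-rank method).

    The quantile value is an assumed example, not a documented NMPP setting.
    """
    values = sorted(control_intensities)
    if not values:
        raise ValueError("at least one control intensity is required")
    rank = max(1, math.ceil(quantile * len(values)))
    return values[rank - 1]


def is_above_background(intensity, cutoff):
    """Flag a probe whose intensity exceeds the background cutoff."""
    return intensity > cutoff
```

A probe would then be compared against the cutoff derived from the control probes hybridized on the same array.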
Hybridization experiment design
All probes were synthesized on a single array and hybridized in triplicate to cDNA targets derived from 10 rice tissue types: light-grown seedling, dark-grown seedling, seedling root, leaf, Xoo-infected leaf, flag leaf, whole flower (floret), carpel, developing seed, and suspension-cultured cell.
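The design of 10 tissue types hybridized in triplicate accounts for the 30 arrays in the input dataset. A small Python sketch enumerates the hybridizations; the tuple layout and the name `hybridizations` are hypothetical and do not correspond to any NMPP file format.

```python
from itertools import product

# Tissue types as listed in the hybridization design.
TISSUES = [
    "light-grown seedling",
    "dark-grown seedling",
    "seedling root",
    "leaf",
    "Xoo-infected leaf",
    "flag leaf",
    "whole flower (floret)",
    "carpel",
    "developing seed",
    "suspension-cultured cell",
]
REPLICATES = (1, 2, 3)  # each tissue was hybridized in triplicate

# One (tissue, replicate) pair per array.
hybridizations = list(product(TISSUES, REPLICATES))

print(len(hybridizations))
```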
