PlantProm: Plant Promoter Database
A Database of Plant Promoter Sequences
PlantProm DB was initially developed as an annotated, non-redundant collection of proximal promoter sequences for RNA polymerase II with experimentally determined transcription start site(s) (TSS) from various plant species. The first release of the DB (2002.01), developed by the Department of Computer Science at Royal Holloway, University of London, in collaboration with Softberry Inc. (USA), is available at http://mendel.cs.rhul.ac.uk/mendel.php and http://www.softberry.com. It contained 305 entries from monocot, dicot and other plants.
The new release of PlantProm DB contains 576 unrelated entries, including 150, 403 and 23 promoters with experimentally verified TSS from monocot, dicot and other plants, respectively. Other promoter sets rely on TSSs identified by full-length cDNA/5'-EST, CAGE and SAGE approaches, which remain to be confirmed by direct experimental evidence. In contrast, this DB and the Eukaryotic Promoter Database (134 unrelated plant promoters; see: http://www.epd.isb-sib.ch/) present published promoter sequences with TSS(s) determined by direct experimental approaches, and therefore serve as the most accurate sources for the development of computational promoter prediction tools (for example, see: TSSP-TCM, TSSP, FPROM, CONPRO).
The following criteria were applied when collecting experimentally verified plant gene promoters.
- There is experimental evidence of the TSS position(s) of the gene, published in the literature. For genes with multiple TSSs, the one nearest to the CDS start position is taken if no additional information on the predominance of one of them is available (positions of the other TSSs are given in the name line of the sequence, written in the FASTA format).
- The known promoter sequence upstream of the chosen TSS is 200 bp or longer. All stored promoter sequences have the same length, 251 bp, where position 201 corresponds to the TSS; i.e. the collected sequences occupy the region [-200 : +51], with the TSS at position +1, and thus represent the proximal promoters mentioned above.
- An entry corresponds to the gene mapped on the genomic sequences.
- Various alleles of a gene are presented in the database by a single entry.
- Genes with more than one non-allelic copy in the genome as well as paralogous genes are taken as different entries.
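The coordinate convention and the TSS selection rule in the criteria above can be sketched in a few lines of Python. This is only an illustration, not PlantProm's own code: the function names are hypothetical, the genomic positions passed to `pick_tss` are invented, and the sketch assumes the usual convention that there is no position 0 between -1 and +1 (which is what makes [-200 : +51] exactly 251 bp long).

```python
TSS_INDEX = 201   # 1-based position of the TSS in every stored sequence
SEQ_LENGTH = 251  # length of the stored region [-200 : +51]


def index_to_relative(index: int) -> int:
    """Map a 1-based sequence index (1..251) to a TSS-relative coordinate.

    Assumes there is no coordinate 0: index 200 is -1 and index 201 is +1.
    """
    if not 1 <= index <= SEQ_LENGTH:
        raise ValueError(f"index must be in 1..{SEQ_LENGTH}, got {index}")
    offset = index - TSS_INDEX
    return offset + 1 if offset >= 0 else offset


def relative_to_index(rel: int) -> int:
    """Map a TSS-relative coordinate (-200..-1, +1..+51) to a 1-based index."""
    if rel == 0 or not -200 <= rel <= 51:
        raise ValueError(f"coordinate must be in -200..-1 or 1..51, got {rel}")
    return TSS_INDEX + rel - 1 if rel > 0 else TSS_INDEX + rel


def pick_tss(tss_positions: list[int], cds_start: int) -> int:
    """Pick the TSS nearest to the CDS start (hypothetical genomic positions).

    Mirrors the rule used when nothing is known about which TSS predominates.
    """
    return min(tss_positions, key=lambda pos: abs(cds_start - pos))
```

For example, `relative_to_index(-30)` gives the 1-based index of the base 30 bp upstream of the TSS, a region where TATA-box motifs are commonly searched for.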
PlantProm DB provides the following information.
|1||DNA sequences of 576 experimentally verified (annotated) promoter regions [-200:+51], with the annotated or mapped TSS at the fixed position 201 (+1), from various plant species, in the FASTA format, including:|
|1.1||150 annotated promoters of monocots,|
|1.2||403 annotated promoters of dicots,|
|1.3||23 annotated promoters from other plants,|
|1.4||345 annotated TATA promoters, consisting of 84 monocot, 256 dicot and 5 other plant species sequences, respectively (with the location of TATA-box core motifs given in capital letters).|
|1.5||231 annotated TATA-less promoters, consisting of 66 monocot, 147 dicot and 18 other plant species sequences, respectively.|
|1.6||Dicot TATA promoters with putative INR-motif, DPE-motif and YP-patch.|
|1.7||Dicot TATA-less promoters with putative INR-motif, DPE-motif and YP-patch.|
|1.8||Monocot TATA promoters with putative INR-motif, DPE-motif and YP-patch.|
|1.9||Monocot TATA-less promoters with putative INR-motif, DPE-motif and YP-patch.|
|2||Taxonomic and promoter type classification of promoters, including:|
|2.1||List of species represented in the PlantProm DB,|
|2.2||List of genes/gene products and promoter types represented in the PlantProm DB.|
|3||Nucleotide Frequency Matrices for canonical promoter elements (TATA-box, INR-motif, DPE and Y-patch), including:|
|3.1||TATA-matrices for various promoter collections,|
|3.2||INR-motif matrices for various promoter collections,|
|3.3||DPE-matrices for various promoter collections,|
|3.4||Y-patch matrices for various promoter collections.|
|4||Short description of the computation of nucleotide frequency matrices for various promoter elements.|
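As a rough illustration of what a nucleotide frequency matrix is, the sketch below counts the bases in each column of a set of aligned, equal-length motif instances and converts the counts to frequencies. It is a generic, minimal computation under those assumptions, not necessarily the procedure PlantProm uses; the function name and the example sites are hypothetical.

```python
from collections import Counter


def frequency_matrix(sites: list[str]) -> list[dict[str, float]]:
    """Return per-position nucleotide frequencies for aligned motif instances.

    Each element of the result describes one motif position as a dict
    mapping A, C, G and T to the fraction of sites carrying that base.
    """
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")

    matrix = []
    for column in zip(*sites):  # iterate over alignment columns
        counts = Counter(base.upper() for base in column)
        matrix.append({base: counts[base] / len(sites) for base in "ACGT"})
    return matrix


# Invented TATA-box-like instances, for illustration only.
example_sites = ["TATAAA", "TATATA", "TATAAA", "CATAAA"]
example_matrix = frequency_matrix(example_sites)
```

Here `example_matrix[0]` holds the frequencies at the first motif position (three T and one C out of four sites). Real matrices are usually built from many more sites, and pseudocounts are often added before converting them to log-odds scores.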
For the accession numbers available in the database, refer to the List.
Paste a FASTA sequence or upload a FASTA file for BLAST2.
Ilham A. Shahmuradov, Alex J. Gammerman, John M. Hancock, Peter M. Bramley and Victor V. Solovyev (2003) PlantProm: a database of plant promoter sequences. Nucleic Acids Res., 31, D114-D117.
|Acknowledgements: PlantProm Database is partially funded by Pakistan HEC Startup Grant entitled Setting up of Bioinformatics Research at the Department of Biosciences, COMSATS Institute of Information Technology and is designed and maintained at COMSATS Institute of Information Technology (Islamabad, Pakistan), in collaboration with Softberry Inc. www.softberry.com (mirror site) (USA).|
|Send questions/comments to: Ilham Shahmuradov email@example.com and/or Victor Solovyev firstname.lastname@example.org|