PlantProm: Plant Promoter Database

 

PlantProm DB

A Database of Plant Promoter Sequences

Release 2010.02

PlantProm DB was initially developed as an annotated, non-redundant collection of proximal promoter sequences for RNA polymerase II with experimentally determined transcription start site(s) (TSS) from various plant species. The first release of the DB, 2002.01, developed by the Department of Computer Science at Royal Holloway, University of London, in collaboration with Softberry Inc. (USA), is available at http://mendel.cs.rhul.ac.uk/mendel.php and http://www.softberry.com. It contained 305 entries from monocot, dicot and other plants.


The new release of PlantProm DB contains 576 unrelated entries, including 150, 403 and 23 promoters with experimentally verified TSS from monocot, dicot and other plants, respectively. In contrast to promoter sets in which the TSSs, identified by full-length cDNA/5'-EST, CAGE and SAGE approaches, remain to be confirmed by direct experimental evidence, this DB and the Eukaryotic Promoter Database (134 unrelated plant promoters; see: http://www.epd.isb-sib.ch/) present published promoter sequences with TSS(s) determined by direct experimental approaches and therefore serve as the most accurate sources for development of computational promoter prediction tools (for example, see: TSSP-TCM, TSSP, FPROM, CONPRO).


In collecting experimentally verified plant gene promoters, the following criteria were applied.

  • There is experimental evidence of the TSS position(s) of the gene, published in the literature. For genes with multiple TSSs, the one nearest to the CDS start is taken if no additional information on the predominance of one of them is available (positions of the other TSSs are given in the name line of the sequence in FASTA format).
  • The known promoter sequence upstream of the chosen TSS is 200 bp or longer. All stored promoter sequences have the same length, 251 bp, with the TSS at position 201; that is, each sequence covers the region [-200 : +51] with the TSS at +1 (there is no position 0), and thus represents the proximal promoter mentioned above.
  • An entry corresponds to the gene mapped on the genomic sequences.
  • Various alleles of a gene are presented in the database by a single entry.
  • Genes with more than one non-allelic copy in the genome as well as paralogous genes are taken as different entries.
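
The coordinate convention and the multiple-TSS rule in the criteria above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical and not part of PlantProm, and the TSS selection assumes all positions are given on the same strand in the same genomic coordinate system.

```python
UPSTREAM = 200    # bases stored upstream of the TSS (-200 .. -1)
DOWNSTREAM = 51   # bases stored from the TSS onward (+1 .. +51)
LENGTH = UPSTREAM + DOWNSTREAM  # 251 bp per stored sequence


def index_to_relative(index: int) -> int:
    """Map a 1-based index in a stored sequence to a TSS-relative position.

    There is no position 0: index 200 maps to -1 and index 201 maps to +1.
    """
    if not 1 <= index <= LENGTH:
        raise ValueError(f"index must be in 1..{LENGTH}")
    offset = index - UPSTREAM          # 1 at the TSS, 0 just upstream of it
    return offset if offset >= 1 else offset - 1


def relative_to_index(relative: int) -> int:
    """Map a TSS-relative position (-200..-1, +1..+51) to a 1-based index."""
    if relative == 0 or not -UPSTREAM <= relative <= DOWNSTREAM:
        raise ValueError("relative position must be in -200..-1 or +1..+51")
    return relative + UPSTREAM if relative > 0 else relative + UPSTREAM + 1


def choose_tss(tss_positions: list[int], cds_start: int) -> int:
    """Pick the TSS nearest to the CDS start (hypothetical helper).

    Mirrors the rule used when nothing is known about which TSS predominates.
    """
    return min(tss_positions, key=lambda pos: abs(cds_start - pos))


if __name__ == "__main__":
    print(index_to_relative(201))            # the TSS itself
    print(relative_to_index(-200))           # first stored base
    print(choose_tss([100, 150, 180], 200))  # toy genomic positions
```

The skipped zero is the reason a region written as [-200 : +51] holds 251 bases rather than 252.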


PlantProm DB provides the following information.

1. DNA sequences of 576 experimentally verified (annotated) promoter regions [-200:+51], with annotated or mapped TSS at the fixed position 201, from various plant species, in FASTA format, including:
1.1 150 annotated promoters of monocots,
1.2 403 annotated promoters of dicots,
1.3 23 annotated promoters from other plants,
1.4 345 annotated TATA promoters, consisting of 84 monocot, 256 dicot and 5 other plant species sequences, respectively (with location of TATA-box core-motifs given in capital letters),
1.5 231 annotated TATA-less promoters, consisting of 66 monocot, 147 dicot and 18 other plant species sequences, respectively,
1.6 Dicot TATA promoters with putative INR-motif, DPE-motif and YP-patch,
1.7 Dicot TATA-less promoters with putative INR-motif, DPE-motif and YP-patch,
1.8 Monocot TATA promoters with putative INR-motif, DPE-motif and YP-patch,
1.9 Monocot TATA-less promoters with putative INR-motif, DPE-motif and YP-patch.
2. Taxonomic and promoter type classification of promoters, including:
2.1 List of species represented in the PlantProm DB,
2.2 List of genes/gene products and promoter types represented in the PlantProm DB.
3. Nucleotide Frequency Matrices for canonical promoter elements (TATA-box, INR-motif, DPE and Y-patch), including:
3.1 TATA-matrices for various promoter collections,
3.2 INR-motif matrices for various promoter collections,
3.3 DPE-matrices for various promoter collections,
3.4 Y-patch matrices for various promoter collections.
4. Short description of the computation of nucleotide frequency matrices for various promoter elements.
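
Because the TATA promoter files mark TATA-box core motifs in capital letters, the marked positions can be recovered directly from the sequence text. The following is a minimal sketch, assuming the rest of each sequence is in lower case and that each sequence is 251 bp long with the TSS at position 201; the function name and the toy sequence are hypothetical and not taken from the database.

```python
import re


def tata_core_locations(sequence: str) -> list[tuple[int, int, str]]:
    """Return (start, end, motif) for every run of capital letters.

    Coordinates are TSS-relative and inclusive, assuming the TSS (+1) is at
    1-based index 201 and that there is no position 0.
    """
    def to_relative(index: int) -> int:
        # index 201 -> +1, index 200 -> -1
        return index - 200 if index >= 201 else index - 201

    locations = []
    for match in re.finditer(r"[A-Z]+", sequence):
        first = match.start() + 1   # 1-based index of first capital letter
        last = match.end()          # 1-based index of last capital letter
        locations.append((to_relative(first), to_relative(last), match.group()))
    return locations


if __name__ == "__main__":
    # Toy 251-bp sequence (not a real PlantProm entry).
    toy = "a" * 165 + "TATAAATA" + "c" * 78
    for start, end, motif in tata_core_locations(toy):
        print(f"{motif}: {start:+d}..{end:+d}")
```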


The following files are available for download:
1. Total of 576 experimentally verified promoter sequences, including:
1.1 150 annotated promoters of monocots,
1.2 403 annotated promoters of dicots,
1.3 23 annotated promoters from other plants,
1.4 345 annotated TATA promoters (with location of TATA-box core-motifs given in capital letters), consisting of:
1.4.1 84 monocot TATA promoters,
1.4.2 256 dicot TATA promoters,
1.4.3 5 TATA promoters from other species,
1.5 231 annotated TATA-less promoters, consisting of:
1.5.1 66 monocot TATA-less promoters,
1.5.2 147 dicot TATA-less promoters,
1.5.3 18 TATA-less promoters from other species,
1.6 Dicot TATA promoters with putative INR-motif, DPE-motif and YP-patch,
1.7 Dicot TATA-less promoters with putative INR-motif, DPE-motif and YP-patch,
1.8 Monocot TATA promoters with putative INR-motif, DPE-motif and YP-patch,
1.9 Monocot TATA-less promoters with putative INR-motif, DPE-motif and YP-patch.
2. Taxonomic and promoter type classification of promoters, including:
2.1 List of species represented in the PlantProm DB,
2.2 List of genes/gene products and promoter types represented in the PlantProm DB.
3. Nucleotide Frequency Matrices for canonical promoter elements (TATA-box, INR-motif, DPE and Y-patch), including:
3.1 TATA-matrices for various promoter collections,
3.2 INR-motif matrices for various promoter collections,
3.3 DPE-matrices for various promoter collections,
3.4 Y-patch matrices for various promoter collections.
4. Short description of the computation of nucleotide frequency matrices for various promoter elements.
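
For readers unfamiliar with nucleotide frequency matrices, the general idea can be sketched as follows. This is a generic illustration that assumes a set of aligned motif instances of equal length and simple per-column relative frequencies; it is not the PlantProm procedure, which is documented in the description file listed in item 4. The function name and the toy motif instances are hypothetical.

```python
from collections import Counter

BASES = "ACGT"


def frequency_matrix(sites: list[str]) -> dict[str, list[float]]:
    """Compute per-column relative base frequencies for aligned sites.

    Returns a dict mapping each base to a list with one frequency per column.
    """
    if not sites:
        raise ValueError("at least one site is required")
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")

    matrix: dict[str, list[float]] = {base: [] for base in BASES}
    # zip(*sites) walks the alignment column by column.
    for column in zip(*(site.upper() for site in sites)):
        counts = Counter(column)
        for base in BASES:
            matrix[base].append(counts[base] / len(sites))
    return matrix


if __name__ == "__main__":
    # Toy aligned motif instances (not taken from the database).
    toy_sites = ["TATAAA", "TATATA", "TATAAA", "CATAAA"]
    for base, row in frequency_matrix(toy_sites).items():
        print(base, " ".join(f"{value:.2f}" for value in row))
```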

Reference:

Ilham A. Shahmuradov, Alex J. Gammerman, John M. Hancock, Peter M. Bramley and Victor V. Solovyev (2003) PlantProm: a database of plant promoter sequences. Nucleic Acids Res., 31, D114-D117.


Acknowledgements: PlantProm Database is partially funded by the Pakistan HEC Startup Grant entitled Setting up of Bioinformatics Research at the Department of Biosciences, COMSATS Institute of Information Technology, and is designed and maintained at COMSATS Institute of Information Technology (Islamabad, Pakistan), in collaboration with Softberry Inc. (USA; mirror site: www.softberry.com).
Send questions/comments to: Ilham Shahmuradov (ilham@comsats.edu.pk) and/or Victor Solovyev (victor@softberry.com).