PlantProm: Plant Promoter Database
PlantProm DB
A Database of Plant Promoter Sequences
(Release 2009.02)
The new release of PlantProm DB contains 578 unrelated entries, including 151, 396 and 31 promoters with experimentally verified transcription start sites (TSSs) from monocot, dicot and other plants, respectively. In other promoter sets, TSSs identified by full-length cDNA/5'-EST mapping, CAGE and SAGE approaches remain to be confirmed by direct experimental evidence. By contrast, this DB and The Eukaryotic Promoter Database (134 unrelated plant promoters; see: http://www.epd.isb-sib.ch/) present published promoter sequences with TSS(s) determined by direct experimental approaches, and therefore serve as the most accurate sources for the development of computational promoter prediction tools (for example, see: TSSP-TCM, TSSP, FPROM, CONPRO). The following criteria were applied when collecting experimentally verified plant gene promoters.
| • | There is experimental evidence of the TSS position(s) of the gene, published in the literature. For genes with multiple TSSs, the one nearest to the CDS start position is taken if no additional information on the predominance of one of them is available (positions of the other TSSs are given in the name line of the sequence written in the FASTA format). |
| • | The length of the known promoter sequence upstream of the chosen TSS is 200 bp or more. All stored promoter sequences have the same length, 251 bp, where position 201 corresponds to the TSS; i.e. the collected sequences occupy the region [-200 : +51], with the TSS at position +1, and thus represent proximal promoters. |
| • | An entry corresponds to the gene mapped on the genomic sequences. |
| • | Various alleles of a gene are presented in the database by a single entry. |
| • | Genes with more than one non-allelic copy in the genome as well as paralogous genes are taken as different entries. |
Moreover, 3503 and 4220 promoters with TSSs predicted by mapping full-length cDNAs onto genomic sequences from Arabidopsis and rice, respectively, were added to the new release of the DB.
In total, 8301 plant promoter entries are available in the current release of PlantProm DB.
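The coordinate convention in the criteria above (251 bp per entry, TSS at position 201, region [-200 : +51] with no position 0) can be illustrated with a minimal Python sketch. The function names, and the assumption that TSS and CDS positions are plain integer coordinates on the same strand, are hypothetical and are not part of PlantProm DB itself.

```python
PROMOTER_LENGTH = 251  # length of every stored sequence, in bp
TSS_INDEX = 201        # 1-based position of the TSS (+1) within a stored sequence


def index_to_relative(pos: int) -> int:
    """Convert a 1-based position in a stored sequence to a TSS-relative
    coordinate in [-200, -1] or [+1, +51] (there is no position 0)."""
    if not 1 <= pos <= PROMOTER_LENGTH:
        raise ValueError(f"position {pos} is outside 1..{PROMOTER_LENGTH}")
    offset = pos - TSS_INDEX
    # Positions at or after the TSS are shifted by one because +1 follows -1.
    return offset + 1 if offset >= 0 else offset


def relative_to_index(rel: int) -> int:
    """Inverse of index_to_relative."""
    if rel == 0 or not -200 <= rel <= 51:
        raise ValueError(f"relative coordinate {rel} is outside [-200 : +51]")
    return rel + TSS_INDEX - 1 if rel > 0 else rel + TSS_INDEX


def pick_tss(tss_positions: list[int], cds_start: int) -> int:
    """Pick the TSS nearest to the CDS start, as the criteria describe for
    genes with several TSSs and no information on which one predominates.
    Assumes (hypothetically) integer coordinates on the same strand."""
    return min(tss_positions, key=lambda p: abs(cds_start - p))
```

For example, `index_to_relative(200)` gives -1 and `index_to_relative(201)` gives +1, which reflects the absence of a zero coordinate.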
PlantProm DB provides the following information.
1. DNA sequences of 576 experimentally verified (annotated) and 7723 mapped promoter regions [-200:+51], with the annotated or mapped TSS at the fixed position 201, from various plant species, in the FASTA format, including:
| 1.1. | 150 annotated promoters of monocots |
| 1.2. | 403 annotated promoters of dicots |
| 1.3. | 23 annotated promoters from other plants |
| 1.4. | 3503 mapped promoters from Arabidopsis |
| 1.5. | 4220 mapped promoters from rice |
| 1.6. | 345 annotated TATA promoters, consisting of 84 monocot, 256 dicot and 5 other plant species sequences, respectively (with the location of TATA-box core motifs given in capital letters). |
| 1.7. | 873 and 374 mapped TATA promoters from Arabidopsis and rice, respectively (with the location of TATA-box core motifs given in capital letters). |
| 1.8. | 231 annotated TATA-less promoters, consisting of 66 monocot, 147 dicot and 18 other plant species sequences, respectively. |
| 1.9. | 2669 and 3846 mapped TATA-less promoters from Arabidopsis and rice, respectively. |
2. Taxonomic and promoter type classification of promoters, including:
| 2.1. | Summary of Species and Promoter Classification, |
| 2.2. | Individual Characteristics of Genes/Promoters and Original Data Sources |
3. Nucleotide Frequency Matrices for canonical promoter elements (TATA-box, CCAAT-box, and TSS-motif or Initiator element, Inr), including:
| 3.1. | TATA-matrices for various promoter collections, |
| 3.2. |