To facilitate the investigation of gene promoters, this work presents GPMiner, an integrated system for identifying promoter regions and annotating regulatory features in user-input sequences.

  • The proposed promoter identification method, whose predictive sensitivity and specificity are each ~80%, combines a support vector machine (SVM) with nucleotide composition, over-represented hexamer nucleotides and DNA stability.
  • The input sequence can also be analyzed for homogeneity with experimental mammalian promoter sequences.
  • Once the promoter regions are identified, regulatory features such as transcription factor binding sites, CpG islands, tandem repeats, the TATA box, the CCAAT box, the GC box, over-represented oligonucleotides, DNA stability and GC-content are visualized graphically to facilitate the observation of gene promoters.
  • Users can input a group of genes to mine the co-occurrence of transcription factor binding sites.
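
Several of the sequence features listed above (nucleotide composition, hexamer counts, GC-content and a CpG signal) can be computed directly from a DNA string. The following is a minimal sketch under stated assumptions, not GPMiner's implementation: the function names are hypothetical, the CpG measure is the commonly used observed/expected ratio, and the SVM training and DNA stability calculation are omitted.

```python
from collections import Counter


def nucleotide_composition(seq: str) -> dict:
    """Fraction of each base (A, C, G, T) in the sequence."""
    seq = seq.upper()
    n = len(seq)
    return {base: (seq.count(base) / n if n else 0.0) for base in "ACGT"}


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def kmer_counts(seq: str, k: int = 6) -> Counter:
    """Count overlapping k-mers; k=6 gives the hexamers mentioned above."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: count(CG) * length / (count(C) * count(G)).

    Assumption: this common ratio stands in for whatever CpG island
    criterion the actual system applies.
    """
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


if __name__ == "__main__":
    example = "ACGTACGTAC"  # toy sequence, not real promoter data
    print(nucleotide_composition(example))
    print(gc_content(example))
    print(kmer_counts(example).most_common(1))
    print(cpg_obs_exp(example))
```

In a pipeline of the kind described, values like these would be assembled into a fixed-length feature vector for each candidate region before being passed to a classifier.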

