To facilitate the investigation of gene promoters, this work presents GPMiner, an integrated system for identifying promoter regions and annotating regulatory features in user-supplied sequences.

  • The proposed promoter identification method, whose predictive sensitivity and specificity are both approximately 80%, combines a support vector machine (SVM) with three kinds of features: nucleotide composition, over-represented hexamer nucleotides and DNA stability.
  • The input sequence can also be analyzed for similarity to experimentally identified mammalian promoter sequences.
  • Once the promoter regions are identified, regulatory features such as transcription factor binding sites, CpG islands, tandem repeats, the TATA box, the CCAAT box, the GC box, over-represented oligonucleotides, DNA stability and GC content are displayed graphically to facilitate the inspection of gene promoters.
  • Users can submit a group of genes to mine the co-occurrence of transcription factor binding sites.
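
The sequence-derived features named above (nucleotide composition and hexamer occurrence) can be illustrated with a short Python sketch. This is not GPMiner's actual encoding, which is not described on this page; the function names `nucleotide_composition`, `kmer_counts` and `feature_vector`, and the choice to normalize hexamer counts by the number of windows, are assumptions made purely for illustration. The resulting list of numbers is the kind of input a classifier such as an SVM could be trained on.

```python
from collections import Counter


def nucleotide_composition(seq):
    """Return the fraction of A, C, G and T in seq (other symbols are ignored)."""
    seq = seq.upper()
    counts = Counter(base for base in seq if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return {base: 0.0 for base in "ACGT"}
    return {base: counts[base] / total for base in "ACGT"}


def kmer_counts(seq, k=6):
    """Count overlapping k-mers that consist only of A, C, G and T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def feature_vector(seq, hexamers):
    """Hypothetical encoding: base fractions followed by hexamer frequencies."""
    comp = nucleotide_composition(seq)
    counts = kmer_counts(seq, k=6)
    windows = max(len(seq) - 6 + 1, 1)  # number of hexamer positions
    return [comp[b] for b in "ACGT"] + [counts[h] / windows for h in hexamers]
```

For example, `feature_vector(sequence, ["TATAAA"])` yields five numbers: the four base fractions and the relative frequency of the hexamer TATAAA. The list of hexamers to track would, in practice, come from a separate over-representation analysis.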

Step 1: Species

Step 2: Query type

Step 3: Please paste a query sequence in FASTA format:

Step 4: Regulatory features for mining
       Annotated Transcriptional Start Site
            - Eponine score threshold >=
       CpG Island (by CpGProD)
       G+C Content
            - Sliding window size = nt
       Transcription Factor Binding Site (by TRANSFAC MATCH)
            - Core score >=
            - Matrix score >=

       TATA-box, CCAAT-box, and GC-box
       Over-Represented (OR) Oligonucleotide
            - Z-score >
            - Minimum occurrence
            - Report top OR oligonucleotide

       Tandem Repeat (by Tandem Repeats Finder)
       DNA Stability
            - Sliding window size = nt
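
Two of the options above, the sliding window used for G+C content and the Z-score cutoff for over-represented oligonucleotides, can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, and the Z-score uses a simple binomial background built from single-base frequencies of the query itself, which is not necessarily the background model GPMiner uses.

```python
import math
from collections import Counter


def gc_content_windows(seq, window, step=1):
    """Return (start, GC fraction) for each full window along seq."""
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = sum(1 for base in chunk if base in "GC")
        result.append((start, gc / window))
    return result


def oligo_z_score(seq, oligo):
    """Z-score of an oligonucleotide count against a binomial background.

    Assumption: each window matches independently with probability p, the
    product of the single-base frequencies in seq. Overlapping windows are
    not truly independent, so this is only an approximation.
    """
    seq, oligo = seq.upper(), oligo.upper()
    k = len(oligo)
    n = len(seq) - k + 1  # number of windows of length k
    if n <= 0:
        return 0.0
    freqs = Counter(seq)
    p = 1.0
    for base in oligo:
        p *= freqs[base] / len(seq)
    observed = sum(1 for i in range(n) if seq[i:i + k] == oligo)
    variance = n * p * (1 - p)
    if variance == 0:
        return 0.0
    return (observed - n * p) / math.sqrt(variance)
```

A larger window smooths the G+C profile at the cost of positional resolution, while a higher Z-score cutoff reports fewer, more strongly over-represented oligonucleotides.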

      
Bid Lab, Institute of Bioinformatics, National Chiao Tung University, Taiwan.

Contact us: Tzong-Yi Lee, with questions or comments.