Identifying Sigma70 Promoters Using Multiple Windowing and Minimal Features
In bacterial DNA, promoters are specific nucleotide sequences to which RNA polymerase binds. Sigma70 (σ70) promoters are among the most important promoter sequences because of their presence in most DNA regulatory functions. In this paper, we identify the most effective sequence-based features for predicting σ70 promoter sequences in a bacterial genome. Our proposed method uses both short-range and long-range DNA sequence information. From a large set of features extracted using multiple windows of different sizes within the DNA sequences, we select a very small number of effective features. We call our prediction method iPro70-FMWin and have made it freely accessible online as a web application at http://ipro70.pythonanywhere.com/server for the convenience of researchers. We tested our method on a standard benchmark dataset. In the experiments, iPro70-FMWin achieved an area under the receiver operating characteristic curve of 0.959 and an accuracy of 90.57%, significantly outperforming the state-of-the-art predictors.
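To make the multi-window idea concrete, the following is a minimal illustrative sketch, not the iPro70-FMWin implementation. It assumes that features are k-mer frequencies computed inside non-overlapping windows of several sizes (short-range) plus the whole sequence (long-range). The function names `kmer_frequencies` and `multi_window_features`, the window sizes, and the choice of k are all hypothetical and are not taken from the abstract.

```python
from itertools import product


def kmer_frequencies(segment, k=2):
    """Return normalized k-mer frequencies of a DNA segment in a fixed order.

    Hypothetical helper: k-mer frequency is an assumed feature type,
    chosen only to illustrate sequence-based features.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {km: 0 for km in kmers}
    total = 0
    for i in range(len(segment) - k + 1):
        km = segment[i:i + k]
        if km in counts:  # skip k-mers containing ambiguous bases such as N
            counts[km] += 1
            total += 1
    return [counts[km] / total if total else 0.0 for km in kmers]


def multi_window_features(sequence, window_sizes=(20, 40), k=2):
    """Concatenate k-mer frequencies from windows of several sizes.

    For each window size, the sequence is cut into non-overlapping windows
    (a trailing partial window is dropped). The whole sequence is appended
    as one extra long-range window. Window sizes are illustrative assumptions.
    """
    sequence = sequence.upper()
    features = []
    for size in window_sizes:
        for start in range(0, len(sequence) - size + 1, size):
            features.extend(kmer_frequencies(sequence[start:start + size], k))
    features.extend(kmer_frequencies(sequence, k))  # long-range window
    return features


if __name__ == "__main__":
    toy_sequence = "ACGT" * 20  # toy 80-nt input, not real promoter data
    vector = multi_window_features(toy_sequence)
    print(len(vector))
```

A feature vector built this way grows quickly with the number of windows, which is why a selection step that keeps only a small subset of effective features, as the abstract describes, would follow before training a classifier.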