1) What is RiboD?
=> RiboD is a database of prokaryotic riboswitches developed by the RNA, Origins and Complexity Group at the Department of Physical Sciences, Indian Institute of Science Education and Research Kolkata. It is a web-based, searchable database that provides comprehensive information on prokaryotic riboswitches and riboswitch-regulated genes/operons on a single platform. The riboswitches in this database are predicted mainly with the Rfam riboswitch-class-specific covariance models (CMs), and the predictions are further validated with the profile hidden Markov model (pHMM) based prediction algorithm used in Riboswitch Scanner.
2) What information can I obtain from the database?
=> The current version of the database compiles information from 1777 prokaryotic genomes and covers 31 metabolite and ion sensing riboswitch classes. The database includes information on riboswitches and corresponding riboswitch regulated genes/operons. The database allows users to submit a query based on genome name, riboswitch class, riboswitch regulated gene or protein name and taxonomy. In addition, you can get detailed information about the predicted biological process (gene ontology) the riboswitch regulated genes are involved in. Furthermore, all of the computationally identified tandem riboswitches are provided in the "Tandem Riboswitches" option under the search tab. The entries are also linked to the NCBI for additional information.
3) How will users benefit from the RiboD database?
=> RiboD will be helpful to a wide variety of researchers. It provides the information about prokaryotic riboswitches that is needed before designing an experiment, such as the location of the riboswitch and the gene(s)/operon it regulates, together with the genomic location of those genes. In addition, we believe that the data provided in RiboD will be useful for analyses of riboswitch distribution, patterns of gene regulation, evolution, horizontal riboswitch transfer, etc.
4) Are the RiboD riboswitches verified by experiments or are they based on computational prediction?
=> Most of the riboswitches in RiboD are not verified by experiments: our computational analysis predicts a large number of riboswitches, which makes experimental verification of each prediction impractical. All of the riboswitches and riboswitch-regulated genes/operons are predicted computationally using the covariance models provided by Rfam and are additionally verified with the pHMM-based model used in the Riboswitch Scanner web-server developed by us. The reliability of our computational predictions (of riboswitches) is supported by several statistical tests outlined in our publications.
5) How is RiboD different from the Rfam database?
=> In Rfam, the user can access only the CM-predicted genomic locations of riboswitches and cannot obtain information about riboswitch-regulated genes/operons. RiboD considers the CM-predicted genomic locations of riboswitches as well as the associated genes/operons from the complete genome sequences available in the RefSeq database. Moreover, the stringent search filters available in RiboD allow users to extract detailed information about the corresponding riboswitches according to their needs.
6) How do I submit a query in the RiboD database?
=> The RiboD server provides multiple options to search for riboswitches.
Searching based on riboswitch classes: You can search for riboswitches by class, from the 31 metabolite and ion-sensing riboswitches provided in the "Riboswitch Class" option under the search tab.
Searching based on taxonomy: You can search for riboswitches and riboswitch-regulated genes/operons from the bacterial and archaeal taxonomy order provided in the "Taxonomy" option under the Search tab.
Searching based on riboswitch-regulated genes: When a specific gene name is used as a search query, the server provides detailed information about all riboswitches found upstream of the specified gene in all organisms curated in the RiboD database.
Searching based on annotated biological processes (including the pathways) of riboswitch-regulated genes: You can search by annotated biological processes of riboswitch-regulated genes, from the 78 biological processes provided in the "pathways/biological process" option under the search tab.
Searching tandem riboswitches: All of the computationally identified tandem riboswitches are provided in the "Tandem Riboswitches" option under the search tab.
Searching based on genomes: You can search the riboswitches present in a specific genome of interest from the "Genomes" tab.
Advanced search: In the advanced search tab, the user can specify multiple fields to refine the search query and extract the specific information.
7) How can I find the tandem riboswitches?
=> You can find the list of genome-wide computationally identified tandem riboswitches by selecting the "Tandem Riboswitches" option under the search tab.
8) What kind of information needs to be provided by the user for prediction of riboswitches from the “Predict” tab?
=> If the bacterial/archaeal genome that the user is interested in is not available in the RiboD database, the user can upload the nucleotide sequence in NCBI fna file format and the genomic features in NCBI gff file format in the "Predict" tab. The presence or absence of the relevant riboswitch(es) is determined using the Riboswitch Scanner web-server. If the corresponding riboswitch(es) are present in the uploaded genome sequence, information about the riboswitch and the corresponding riboswitch-regulated gene will be extracted and provided on the output page.
The user can also use the Riboswitch Scanner web-server to directly search for riboswitches based on a user-provided partial/complete genome sequence in FASTA format.
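Before uploading, it may help to confirm that the sequence identifiers in the fna file match the first column of the gff file. The following is a minimal Python sketch of such a check, assuming standard FASTA headers and tab-separated nine-column GFF feature lines. The function names and the example data are hypothetical and are not part of RiboD.

```python
def fasta_ids(lines):
    """Return the set of sequence IDs (first token after '>') in FASTA lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            header = line[1:].strip()
            if header:
                ids.add(header.split()[0])
    return ids


def gff_seqids(lines):
    """Return the set of seqid values (column 1) from GFF feature lines."""
    ids = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9:  # only well-formed feature lines are counted
            ids.add(cols[0])
    return ids


def unmatched_seqids(fasta_lines, gff_lines):
    """Seqids referenced in the GFF that have no sequence in the FASTA."""
    return gff_seqids(gff_lines) - fasta_ids(fasta_lines)


# Hypothetical example data (identifiers and coordinates are made up).
example_fna = [">NC_000001.1 Example organism chromosome", "ATGCATGC"]
example_gff = [
    "##gff-version 3",
    "NC_000001.1\tRefSeq\tgene\t1\t8\t.\t+\t.\tID=gene0;locus_tag=EX_0001",
    "NC_000002.1\tRefSeq\tgene\t1\t8\t.\t+\t.\tID=gene1;locus_tag=EX_0002",
]
print(unmatched_seqids(example_fna, example_gff))
```

An empty result suggests that every annotated feature refers to a sequence present in the uploaded fna file.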
9) How does the Riboswitch prediction feature in RiboD differ from the one in Riboswitch Scanner?
=> The prediction feature in RiboD provides more information on the predicted riboswitches: it includes not only the genomic location of the riboswitches but also the identifier (name/locus_tag) and genomic location of the gene/operon each one regulates. However, it also needs more input, since the user has to provide the genome sequence file as well as the genomic feature file in gff format. To search for riboswitches across complete genomes (for which both the sequence and gff files are available), it is advisable to use the predict feature of RiboD. To quickly search for riboswitches in partial genome sequences, or in complete genome sequences for which the corresponding gff file is not available, it is advisable to use the Riboswitch Scanner.
10) Why do I get no results when I search by riboswitch-regulated gene name in the database, even though I am sure the gene exists?
=> This can happen if the gene is not annotated by that name in the gff file. In RiboD, for each genome, riboswitch-regulated gene information is extracted from the gff file provided in RefSeq. Wherever the gff file does not specify a gene name, the RefSeq locus_tag is given as the gene name. Hence, in many cases, it may not be possible to find the riboswitch information based on the user-provided gene-name query. We will consider this issue in the next version of the database. If you do not find a result when searching by riboswitch-regulated gene, try searching by genome name instead. That will list all the riboswitches and riboswitch-regulated genes/operons within that genome. In the output, all of the riboswitch-regulated gene/operon information is linked to the NCBI, where you can find more detailed information about the corresponding gene/operon.
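The fallback from gene name to locus_tag described above can be illustrated with a short Python sketch. It assumes NCBI-style GFF3 attributes named `gene` and `locus_tag` in the ninth column; the function names and example lines are hypothetical, and this is not RiboD's actual extraction code.

```python
def parse_attributes(column9):
    """Parse a GFF3 attributes column ('key=value;key=value') into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def gene_label(gff_line):
    """Return the gene name if annotated, else the locus_tag, else None."""
    cols = gff_line.rstrip("\n").split("\t")
    if len(cols) != 9 or cols[2] != "gene":
        return None  # not a gene feature line
    attrs = parse_attributes(cols[8])
    return attrs.get("gene") or attrs.get("locus_tag")


# Hypothetical gene features: one annotated with a name, one without.
annotated = "seq1\tRefSeq\tgene\t100\t900\t.\t+\t.\tID=gene1;gene=thiC;locus_tag=EX_0001"
unannotated = "seq1\tRefSeq\tgene\t950\t1500\t.\t+\t.\tID=gene2;locus_tag=EX_0002"

print(gene_label(annotated))
print(gene_label(unannotated))
```

In this sketch, a query for a gene name can only match the first feature; the second can be found only through its locus_tag, which is why searching by genome name is the more reliable route.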
11) Why do some of the entries in the column titled “Downstream gene name/gene product description” contain the gene name while others contain just the locus_tag?
=> Whenever a gene annotation is available together with the gene-product description, the corresponding row in the above column lists the annotated gene name followed by the gene product description. However, in many cases, the gene is not annotated and is known only by its locus_tag. In such cases, the locus_tag is used as the gene name. In cases where the gene product description is not available, the corresponding row in the above column lists just the locus_tag.
12) Why is the downstream gene regulated by the riboswitch sometimes identified by its locus_tag and sometimes by its name?
=> In all cases where gene annotations exist, the downstream gene is identified by its name, which is highlighted in purple in the Downstream gene name/gene product description column on the output page. This is also true for riboswitch-regulated genes that are part of an operon. The gene composition of the operon is given in terms of both the locus tag and the gene name, highlighted in purple (whenever gene annotations are available), but only in terms of the locus tag when gene annotations are not available. For example, the entry in the Operon Details column (AM1_3001, AM1_3002 (thiC)) implies that the riboswitch regulates an operon made up of two genes: AM1_3001 (whose gene annotation is not available) and AM1_3002, which is annotated as the thiC gene.
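For users who post-process downloaded results, an Operon Details entry of the form shown above can be split into locus tags and optional gene names. This is a minimal Python sketch that assumes entries are comma-separated and that a gene name, when present, follows the locus tag in parentheses; the function name and the regular expression are illustrative assumptions, not part of RiboD.

```python
import re

# One entry: a locus tag, optionally followed by a gene name in parentheses.
ENTRY = re.compile(r"^\s*(?P<locus_tag>[^\s()]+)\s*(?:\((?P<name>[^()]+)\))?\s*$")


def parse_operon_details(text):
    """Split an Operon Details string into (locus_tag, gene_name_or_None) pairs."""
    genes = []
    for item in text.split(","):
        match = ENTRY.match(item)
        if match is None:
            raise ValueError(f"Unrecognised operon entry: {item!r}")
        genes.append((match.group("locus_tag"), match.group("name")))
    return genes


for locus_tag, name in parse_operon_details("AM1_3001, AM1_3002 (thiC)"):
    print(locus_tag, name if name is not None else "(no gene annotation)")
```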
13) Can I search for riboswitches by specifying the biological process/pathway only?
=> Yes. Under the “Search” tab, there is an option to search for riboswitches on the basis of specific annotated biological processes (Gene Ontology), including pathways. Choosing that option will take you to the list of all annotated biological processes, from which you can select the ones you are interested in.
14) Why do I get a page saying “No data exists” when I search with keywords for specific search fields under the “Advanced Search” option?
=> It is possible that one or more of the keywords you used were misspelt or incomplete. This can sometimes (but not always) lead to a page with no output data. Please ensure that the keywords (especially those associated with annotated biological processes) match those given in the appropriate links listing the keywords for Riboswitch Class, Biological Process and Phylum.
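When assembling many queries, a keyword can be checked against a copied keyword list before it is submitted. The sketch below uses Python's standard `difflib` module to suggest the closest listed keyword. The `KNOWN_CLASSES` list is only an illustrative subset with assumed spellings; the actual keywords should be taken from the keyword lists linked in RiboD.

```python
import difflib

# Illustrative subset with assumed spellings; replace with the RiboD keyword list.
KNOWN_CLASSES = ["TPP", "SAM", "FMN", "Cobalamin", "Glycine", "Lysine", "Purine"]


def suggest_keyword(query, known=KNOWN_CLASSES, cutoff=0.6):
    """Return the closest known keyword to `query`, or None if nothing is close."""
    lowered = {keyword.lower(): keyword for keyword in known}
    matches = difflib.get_close_matches(
        query.strip().lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None


for typed in ["Cobalamine", "glycin", "xyz"]:
    print(typed, "->", suggest_keyword(typed))
```

The comparison is case-insensitive, and the `cutoff` value is an arbitrary choice that can be raised to make suggestions stricter.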
15) How are the riboswitch structures provided in the database predicted?
=> Structures were generated using a combination of covariance model alignment and energy minimization methods. First, the sequences were aligned to a covariance model taken from Rfam using cmsearch from the Infernal package. The aligned consensus structure was then passed to RNAfold as enforced structure constraints to attain a local minimum energy structure. VARNA was used to generate the image. We highlight base pairs from the covariance model alignment using thick red lines while the additional minimum energy base pairs are denoted by blue lines. The user can click the view structure option in the predicted structure column to view the structure.
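The step of turning an aligned consensus structure into folding constraints can be sketched as follows. This is an assumption about one way to do the conversion, not the authors' actual code: it takes one aligned sequence and the consensus structure in WUSS bracket notation, removes gap columns, and keeps only the base pairs whose two partners are both present. A string of this kind could then be supplied to RNAfold through its constraint options. The function name and example alignment are hypothetical.

```python
OPEN = "<([{"   # WUSS symbols for the 5' partner of a base pair
CLOSE = ">)]}"  # WUSS symbols for the 3' partner of a base pair


def consensus_to_constraint(aligned_seq, consensus_ss):
    """Return (ungapped sequence, dot-bracket constraint) for one aligned hit."""
    if len(aligned_seq) != len(consensus_ss):
        raise ValueError("sequence and structure must have the same length")

    # Pair up alignment columns with a stack (nested pairs only;
    # pseudoknot letters and all other symbols are treated as unpaired).
    stack, partner = [], {}
    for col, symbol in enumerate(consensus_ss):
        if symbol in OPEN:
            stack.append(col)
        elif symbol in CLOSE:
            if not stack:
                raise ValueError("unbalanced consensus structure")
            left = stack.pop()
            partner[left], partner[col] = col, left
    if stack:
        raise ValueError("unbalanced consensus structure")

    is_gap = [base in "-." for base in aligned_seq]
    seq, constraint = [], []
    for col, base in enumerate(aligned_seq):
        if is_gap[col]:
            continue  # gap columns do not exist in the real sequence
        seq.append(base)
        mate = partner.get(col)
        if mate is None or is_gap[mate]:
            constraint.append(".")  # unpaired, or partner lost to a gap
        else:
            constraint.append("(" if col < mate else ")")
    return "".join(seq), "".join(constraint)


# Hypothetical aligned hit with two gap columns.
sequence, constraint = consensus_to_constraint("GGC-AAAG-CC", "<<<<...>>>>")
print(sequence)
print(constraint)
```

Dropping pairs with a missing partner keeps the constraint string balanced, so the remaining positions are left free for energy minimization.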
16) Can I download the database?
=> Yes, you can download the results of each of your search queries in Excel file format.
References
1. Mukherjee, S., Mandal, S.D., Gupta, N., Retwitzer, M.D., Barash, D., and Sengupta, S. (2019) "RiboD: A comprehensive database for prokaryotic riboswitches", Bioinformatics, btz093.
2. Mukherjee, S. and Sengupta, S. (2016) "Riboswitch Scanner: An efficient pHMM-based web-server to detect riboswitches in genomic sequences", Bioinformatics, 32 (5): 776-778.
3. Singh, P., Bandyopadhyay, P., Bhattacharya, S., Krishnamachari, A. and Sengupta, S. (2009) "Riboswitch detection using profile hidden Markov models", BMC Bioinformatics 10 (1): 325.