# Domain2GO / pages / User_Guide.py
# Last commit 29f5209 by Erva Ulusoy: "domain_locations field name changed: sequence_region"
import streamlit as st
st.sidebar.markdown('''
# Sections
- [How to use](#how-to-use)
- [Troubleshooting](#troubleshooting)
''', unsafe_allow_html=True)
st.markdown('''
# Domain2GO User Guide
''')
# st.markdown('<p style="font-size:25px; font-weight:bold">How to use</p>', unsafe_allow_html=True)
st.header('How to use', anchor='how-to-use')
st.markdown('<p style="font-size:20px; font-weight:bold">1. Submit your protein sequence</p>', unsafe_allow_html=True)
st.markdown(
'''
You can submit your protein sequence by pasting it into the text box or by uploading a FASTA file.
Domain2GO accepts only one protein sequence at a time because InterProScan takes a long time to run. If you need predictions for multiple UniProtKB/Swiss-Prot proteins, we recommend using the comprehensive protein function prediction dataset available in our [GitHub repository](https://github.com/HUBioDataLab/Domain2GO).
To try an example query, click the "Use example sequence" button below the input text box.
The example sequence is also given below:
```
>sp|O18783|PLMN_NOTEU
MEYGKVIFLFLLFLKSGQGESLENYIKTEGASLSNSQKKQFVASSTEECEALCEKETEFVCRSFEHYNKEQKCVIMSENSKTSSVERKRDVVLFEKRIYLSDCKSGNGRNYRGTLSKTKSGITCQKWSDLSPHVPNYAPSKYPDAGLEKNYCRNPDDDVKGPWCYTTNPDIRYEYCDVPECEDECMHCSGENYRGTISKTESGIECQPWDSQEPHSHEYIPSKFPSKDLKENYCRNPDGEPRPWCFTSNPEKRWEFCNIPRCSSPPPPPGPMLQCLKGRGENYRGKIAVTKSGHTCQRWNKQTPHKHNRTPENFPCRGLDENYCRNPDGELEPWCYTTNPDVRQEYCAIPSCGTSSPHTDRVEQSPVIQECYEGKGENYRGTTSTTISGKKCQAWSSMTPHQHKKTPDNFPNADLIRNYCRNPDGDKSPWCYTMDPTVRWEFCNLEKCSGTGSTVLNAQTTRVPSVDTTSHPESDCMYGSGKDYRGKRSTTVTGTLCQAWTAQEPHRHTIFTPDTYPRAGLEENYCRNPDGDPNGPWCYTTNPKKLFDYCDIPQCVSPSSFDCGKPRVEPQKCPGRIVGGCYAQPHSWPWQISLRTRFGEHFCGGTLIAPQWVLTAAHCLERSQWPGAYKVILGLHREVNPESYSQEIGVSRLFKGPLAADIALLKLNRPAAINDKVIPACLPSQDFMVPDRTLCHVTGWGDTQGTSPRGLLKQASLPVIDNRVCNRHEYLNGRVKSTELCAGHLVGRGDSCQGDSGGPLICFEDDKYVLQGVTSWGLGCARPNKPGVYVRVSRYISWIEDVMKNN
```
If you choose to upload a FASTA file, please make sure that the contents of the file also follow the format shown above.
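
If you want to check a file before uploading it, a minimal sketch along the following lines may help. The `parse_single_fasta` helper is hypothetical and not part of Domain2GO; it only assumes the format shown above (one `>` header line followed by the sequence letters) and may be stricter or looser than the checks the app itself performs.

```python
def parse_single_fasta(text):
    """Return (header, sequence) for FASTA text holding exactly one record."""
    # Drop blank lines and surrounding whitespace.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA input must start with a '>' header line")
    # Domain2GO accepts a single sequence, so more than one header is an error.
    headers = [line for line in lines if line.startswith(">")]
    if len(headers) != 1:
        raise ValueError(f"expected exactly one sequence, found {len(headers)}")
    # The sequence may be wrapped over several lines; join them.
    sequence = "".join(lines[1:]).upper()
    if not sequence:
        raise ValueError("no sequence found after the header")
    # Assumption: only letters are allowed (no gaps, digits or stop symbols).
    if not sequence.isalpha():
        raise ValueError("sequence contains non-letter characters")
    return lines[0][1:], sequence
```

For example, passing the contents of your file to `parse_single_fasta` returns the header and the joined sequence, or raises a `ValueError` that describes what is wrong.
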
Please enter your email address in the text box below the sequence input box. InterProScan requests your email to notify you when your job is done. Your email will not be used for any other purpose.
''')
st.markdown('<p style="font-size:20px; font-weight:bold">2. Wait for your results</p>', unsafe_allow_html=True)
st.markdown(
'''
After submitting your protein sequence, Domain2GO will run InterProScan to find domains in your protein. This step may take a few minutes to complete. After domains are found, Domain2GO will predict functions of your protein by assigning functions that are associated with these domains in the Domain2GO mapping set.
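
Conceptually, the second step is a lookup of each detected domain in the mapping set. The sketch below only illustrates that idea and is not the actual Domain2GO implementation; `TOY_MAPPING`, `predict_functions` and every identifier and score in it are made-up placeholders.

```python
# Toy mapping set: domain accession -> list of (GO ID, probability).
# All identifiers and scores here are placeholders, not real data.
TOY_MAPPING = {
    "DOMAIN_A": [("GO:TERM_1", 0.92), ("GO:TERM_2", 0.40)],
    "DOMAIN_B": [("GO:TERM_3", 0.75)],
}


def predict_functions(domains, mapping):
    """Collect the GO terms mapped to each domain found in the protein."""
    predictions = []
    for domain in domains:
        # Domains that are absent from the mapping contribute no predictions.
        for go_id, probability in mapping.get(domain, []):
            predictions.append(
                {"domain_accession": domain, "GO_ID": go_id, "probability": probability}
            )
    return predictions
```

In this sketch, a protein whose domains are all missing from the mapping yields an empty list, which corresponds to the case where no predictions can be made.
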
''')
st.markdown('<p style="font-size:20px; font-weight:bold">3. View your results</p>', unsafe_allow_html=True)
st.markdown(
'''
You can view the predicted functions by clicking the "Show function predictions" button. The results will be displayed in a table with the following columns:

| Column name | Description |
| ------------- | ------------- |
| protein_name | Protein name you provided in the input FASTA. |
| GO_ID | Gene Ontology term ID. |
| GO_term | Gene Ontology term name. |
| GO_category | Gene Ontology term aspect (molecular_function, biological_process or cellular_component). |
| sequence_region | List of locations of the domain in the protein sequence. |
| probability | Probability of the domain being associated with the GO term. For details on how this score is calculated, see Section 2.2 of our [preprint](https://www.biorxiv.org/content/10.1101/2022.11.03.514980v1). |
| domain_accession | InterPro domain accession. |
| domain_name | InterPro domain name. |
''')
st.markdown(
'''
''')
st.markdown(
'''
You can download the results as a CSV file by clicking the "Download function predictions as CSV" button.
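
Once downloaded, the file can be post-processed with a few lines of Python. The sketch below is only an illustration: it assumes the CSV uses the same column names as the results table and that `probability` is stored as a plain number. The `load_predictions` helper and the default cutoff of 0.5 are arbitrary choices for the example, not recommendations.

```python
import csv


def load_predictions(csv_file, min_probability=0.5, category=None):
    """Read prediction rows, keeping those at or above a probability cutoff."""
    kept = []
    for row in csv.DictReader(csv_file):
        if float(row["probability"]) < min_probability:
            continue
        # Optionally keep a single GO aspect, e.g. "molecular_function".
        if category is not None and row["GO_category"] != category:
            continue
        kept.append(row)
    # Highest-probability predictions first.
    kept.sort(key=lambda row: float(row["probability"]), reverse=True)
    return kept
```

To use it, open your downloaded file (for example a file you saved as `predictions.csv`) with `open(..., newline="")` and pass the file object to `load_predictions`.
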
''')
# st.markdown('<p style="font-size:20px; font-weight:bold">4. Troubleshooting</p>', unsafe_allow_html=True)
st.header('Troubleshooting', anchor='troubleshooting')
st.markdown(
'''
The table below lists the warning and error messages that may appear on the main page, together with their descriptions.

| Warning message | Description |
| ------------- | ------------- |
| 'No domains found.' | InterProScan did not find any domains in your protein sequence. If you are sure that your protein has domains, please check that your protein sequence is in a valid FASTA format. |
| Errors about InterProScan | The InterProScan job failed. Your InterProScan job ID and the error message returned by InterProScan are displayed on the main page. Check this message, or query the status of your job by replacing {job_id} with your job ID in the following URL: https://www.ebi.ac.uk/Tools/services/rest/iprscan5/status/{job_id} |
| 'No predictions made for domains found in sequence.' | Domains in your protein sequence are not associated with any GO terms in our mapping set. |
''')
st.markdown(
'''
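If you prefer to check the status of a failed InterProScan job from a script instead of a browser, a minimal sketch could look like the one below. It only fills your job ID into the status URL given in the table above; the helper names `status_url` and `fetch_status` are hypothetical, and the response is assumed to be a short plain-text status word (such as RUNNING or FINISHED).

```python
import urllib.request

# Status URL as given in the troubleshooting table.
STATUS_URL = "https://www.ebi.ac.uk/Tools/services/rest/iprscan5/status/{job_id}"


def status_url(job_id):
    """Fill the job ID into the status URL."""
    return STATUS_URL.format(job_id=job_id.strip())


def fetch_status(job_id, timeout=30):
    """Return the text reported for the job (requires network access)."""
    with urllib.request.urlopen(status_url(job_id), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

Call `fetch_status` with the job ID displayed on the main page; `status_url` alone is enough if you only need the link.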
''')
st.markdown(
'''
If you have any questions or encounter any problems, please create an issue in our [GitHub repository](https://github.com/HUBioDataLab/Domain2GO/issues) or open a discussion in our [Hugging Face space](https://huggingface.co/spaces/HUBioDataLab/Domain2GO/discussions).
''')