Himanshu Mohanty committed: Create README.md
README.md ADDED (@@ -0,0 +1,101 @@)

# TOSRoberta: Terms of Service Analyzer

[License](https://github.com/HimanshuMohanty-Git24/TOSRoberta/blob/main/LICENSE)
[Stars](https://github.com/HimanshuMohanty-Git24/TOSRoberta/stargazers)
[Forks](https://github.com/HimanshuMohanty-Git24/TOSRoberta/network)
[Issues](https://github.com/HimanshuMohanty-Git24/TOSRoberta/issues)

TOSRoberta is a Terms of Service (ToS) analyzer powered by a fine-tuned RoBERTa-large model. It classifies each clause in a ToS document by fairness level, helping users quickly identify potentially unfair terms.

## Features

- Analyzes ToS documents and classifies clauses into three categories:
  - ✅ Clearly Fair
  - ⚠️ Potentially Unfair
  - ❌ Clearly Unfair
- Supports both PDF and text file uploads
- User-friendly web interface built with Streamlit
- Powered by a fine-tuned RoBERTa-large model (CodeHima/Tos-Roberta)

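To make the three-way classification concrete, here is a minimal sketch of turning a model's raw scores for one clause into a category. The label order, the `FAIRNESS_LABELS` list, and the `label_clause` helper are assumptions for illustration and are not taken from the project's code; check the model's own label mapping before relying on this order.

```python
import math

# Assumed label order; verify against the model's configuration.
FAIRNESS_LABELS = ["Clearly Fair", "Potentially Unfair", "Clearly Unfair"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_clause(logits):
    """Return (label, probability) for the highest-scoring category."""
    if len(logits) != len(FAIRNESS_LABELS):
        raise ValueError("expected one score per fairness category")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return FAIRNESS_LABELS[best], probs[best]


if __name__ == "__main__":
    # Hypothetical scores for a single clause.
    label, prob = label_clause([0.2, 2.5, 0.1])
    print(f"{label} ({prob:.2f})")
```

Returning the probability alongside the label would let an interface show how confident the prediction is, though whether the app does so is not stated here.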
## Model Performance

The Tos-Roberta model reaches the following accuracy on ToS clause classification:

- **Validation Accuracy**: 89.64%
- **Test Accuracy**: 85.84%

Detailed performance metrics per epoch:

| Epoch | Training Loss | Validation Loss | Accuracy | F1 Score | Precision | Recall   |
|-------|---------------|-----------------|----------|----------|-----------|----------|
| 1     | 0.443500      | 0.398950        | 0.874699 | 0.858838 | 0.862516  | 0.874699 |
| 2     | 0.416400      | 0.438409        | 0.853012 | 0.847317 | 0.849916  | 0.853012 |
| 3     | 0.227700      | 0.505879        | 0.896386 | 0.893325 | 0.891521  | 0.896386 |
| 4     | 0.052600      | 0.667532        | 0.891566 | 0.893167 | 0.895115  | 0.891566 |
| 5     | 0.124200      | 0.747090        | 0.884337 | 0.887412 | 0.891807  | 0.884337 |

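A note on reading the table: the highest accuracy (0.896386, which matches the 89.64% validation accuracy quoted above) occurs at epoch 3, while validation loss rises every epoch even as training loss mostly falls, a pattern commonly associated with overfitting. The sketch below simply recomputes those observations from the table's numbers. The helper names are illustrative, and which checkpoint was actually kept is not stated in this README.

```python
# Per-epoch metrics copied from the table above.
EPOCHS = [
    {"epoch": 1, "train_loss": 0.443500, "val_loss": 0.398950, "accuracy": 0.874699},
    {"epoch": 2, "train_loss": 0.416400, "val_loss": 0.438409, "accuracy": 0.853012},
    {"epoch": 3, "train_loss": 0.227700, "val_loss": 0.505879, "accuracy": 0.896386},
    {"epoch": 4, "train_loss": 0.052600, "val_loss": 0.667532, "accuracy": 0.891566},
    {"epoch": 5, "train_loss": 0.124200, "val_loss": 0.747090, "accuracy": 0.884337},
]


def best_epoch(rows, metric, higher_is_better=True):
    """Return the epoch number with the best value for the given metric."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[metric])["epoch"]


def val_loss_rises_every_epoch(rows):
    """True if validation loss is strictly higher in each successive epoch."""
    losses = [row["val_loss"] for row in rows]
    return all(earlier < later for earlier, later in zip(losses, losses[1:]))


if __name__ == "__main__":
    print("Best accuracy at epoch:", best_epoch(EPOCHS, "accuracy"))
    print("Lowest validation loss at epoch:",
          best_epoch(EPOCHS, "val_loss", higher_is_better=False))
    print("Validation loss rises every epoch:", val_loss_rises_every_epoch(EPOCHS))
```

Selecting by accuracy and selecting by validation loss point to different epochs here (3 and 1 respectively), so the choice of selection metric matters for this run.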
## Project Structure

```
tos-analyzer/
│
├── app.py
├── requirements.txt
├── utils/
│   ├── __init__.py
│   ├── text_processing.py
│   └── model_utils.py
└── README.md
```

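The contents of `utils/text_processing.py` are not shown in this README. As a rough idea of the kind of helper such a module might hold, here is a sketch that splits raw ToS text into clause-sized pieces before classification. The function name, the splitting regex, and the `min_chars` threshold are all assumptions, not the project's actual code.

```python
import re

# Split after '.', '!' or '?' followed by whitespace, or on blank lines.
_SPLIT = re.compile(r"(?<=[.!?])\s+|\n\s*\n")


def split_into_clauses(text, min_chars=20):
    """Split raw ToS text into trimmed clauses, dropping very short fragments."""
    clauses = []
    for piece in _SPLIT.split(text):
        cleaned = " ".join(piece.split())  # collapse internal whitespace
        if len(cleaned) >= min_chars:
            clauses.append(cleaned)
    return clauses


if __name__ == "__main__":
    sample = (
        "We may terminate your account at any time. "
        "You agree to binding arbitration.\n\nOK."
    )
    for clause in split_into_clauses(sample):
        print("-", clause)
```

A punctuation-based split like this will break on abbreviations such as "e.g."; a real implementation would likely need a more careful sentence segmenter.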
## Installation

1. Clone the repository:
   ```
   git clone https://github.com/HimanshuMohanty-Git24/TOSRoberta.git
   cd TOSRoberta
   ```

2. Install the required dependencies:
   ```
   pip install -r requirements.txt
   ```

3. Run the Streamlit app:
   ```
   streamlit run app.py
   ```

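If the app fails to start after installation, a quick check that the main packages are importable can help narrow things down. This is an optional sketch, not part of the project: `EXPECTED_MODULES` lists only the packages named in this README (Streamlit and Transformers), and `requirements.txt` may include others.

```python
import importlib.util

# Packages named in this README; requirements.txt may list more.
EXPECTED_MODULES = ["streamlit", "transformers"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED_MODULES)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected modules are importable.")
```

Run it with the same Python interpreter that `pip` installed into, since a mismatch between the two is a common cause of missing-module errors.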
## Training Visualization

We used Weights & Biases to monitor the training process. Here's a glimpse of our training metrics:

[Image: training metrics chart]

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Acknowledgements

- [Hugging Face](https://huggingface.co/) for the Transformers library
- [Streamlit](https://streamlit.io/) for the easy-to-use web app framework
- [Weights & Biases](https://wandb.ai/) for experiment tracking

## Contact

Himanshu Mohanty - [CodingHima](https://x.com/CodingHima) - codehimanshu24@gmail.com

Project Link: [https://github.com/HimanshuMohanty-Git24/TOSRoberta](https://github.com/HimanshuMohanty-Git24/TOSRoberta)

---

⭐️ If you find this project useful, please consider giving it a star!
