---
license: apache-2.0
language:
- en
base_model:
- distilbert/distilgpt2
pipeline_tag: text-generation
library_name: transformers
---
## Model Details

### Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

- **Developed by:** Dharshika J V
- **Model type:** Causal language model (fine-tuned DistilGPT-2)
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** distilbert/distilgpt2
## Uses

### Direct Use

This model answers company-related questions; it was fine-tuned on a dataset of structured Q&A pairs.
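
A minimal usage sketch with the 🤗 transformers pipeline; the repo ID and the `Question:`/`Answer:` prompt format below are assumptions, not details confirmed by this card:

```python
from transformers import pipeline

# Hypothetical repo ID; replace with the model's actual ID on the Hub.
qa = pipeline("text-generation", model="your-username/company-qa-distilgpt2")

# The prompt format is an assumption based on the structured Q&A training data.
prompt = "Question: What services does the company provide? Answer:"
out = qa(prompt, max_new_tokens=50, pad_token_id=qa.tokenizer.eos_token_id)
print(out[0]["generated_text"])
```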
## Bias, Risks, and Limitations

- The model only knows what was in its fine-tuning dataset; it may give inaccurate or incomplete answers to questions outside that scope.
- It may repeat format-specific patterns or overfit if trained for too long on a small dataset.
- It has no awareness of real-time data or company knowledge beyond the training set.
## Training Details

### Training Data

- **Source:** `company_details.xlsx`, an Excel sheet with `Question`, `Answer`, and `Split` columns (a loading sketch follows this list).
- **Number of samples:** `len(data_sample)` (replace with the actual number).
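
A minimal loading sketch, assuming pandas and a local copy of the file; the `Split` labels shown are assumptions:

```python
import pandas as pd

# Load the Q&A sheet described above (assumes the file sits in the working directory).
data = pd.read_excel("company_details.xlsx")

# "train" and "val" are assumed values of the Split column; adjust to the actual labels.
train_df = data[data["Split"] == "train"]
val_df = data[data["Split"] == "val"]
print(f"{len(train_df)} training / {len(val_df)} validation samples")
```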
#### Training Hyperparameters

- **Epochs:** 200
- **Batch size:** 32
- **Optimizer:** Adam
- **Learning rate:** 5e-4
- **Loss:** cross-entropy, ignoring pad tokens (see the sketch after this list)
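
A minimal sketch of this setup with PyTorch and 🤗 transformers; the prompt format and the single training step are illustrative assumptions, not the original training code:

```python
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert/distilgpt2")
tokenizer.pad_token = tokenizer.eos_token  # distilgpt2 ships without a pad token
model = AutoModelForCausalLM.from_pretrained("distilbert/distilgpt2")

optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
# Cross-entropy that skips pad positions, matching "ignoring pad tokens" above.
# Note: pad and eos share an ID here, so eos positions are ignored too.
loss_fn = nn.CrossEntropyLoss(ignore_index=tokenizer.pad_token_id)

# One illustrative step; the "Question: ... Answer: ..." format is an assumption.
batch = tokenizer(["Question: example question Answer: example answer"],
                  return_tensors="pt", padding=True)
logits = model(**batch).logits
# Shift so each position predicts the next token.
loss = loss_fn(logits[:, :-1, :].reshape(-1, logits.size(-1)),
               batch["input_ids"][:, 1:].reshape(-1))
loss.backward()
optimizer.step()
optimizer.zero_grad()
```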
## Evaluation

### Validation

- **Dataset split:** 80% training / 20% validation
- **Metric:** validation loss (a computation sketch follows this list)
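
A minimal validation-loss sketch; when `labels` are passed, 🤗 transformers shifts them internally and ignores positions set to -100, so masking the pads reproduces the "ignoring pad tokens" loss above. The example text is a stand-in for the actual 20% split:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert/distilgpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("distilbert/distilgpt2")
model.eval()

# Stand-in for the 20% validation split described above.
texts = ["Question: example question Answer: example answer"]
batch = tokenizer(texts, return_tensors="pt", padding=True)
# Mask pad positions so the built-in cross-entropy ignores them.
labels = batch["input_ids"].masked_fill(batch["attention_mask"] == 0, -100)

with torch.no_grad():
    loss = model(**batch, labels=labels).loss
print(f"Validation loss: {loss.item():.4f}")
```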
## Environmental Impact

- **Hardware Type:** GPU (e.g., Tesla T4 on Google Colab)
- **Hours used:** ~1 hour (estimate)
- **Cloud Provider:** Google Colab
- **Compute Region:** India or global (unspecified)