|
---
license: mit
base_model: microsoft/deberta-v3-small
tags:
- regression
model-index:
- name: deberta-v3-small-sp500-edgar-10k-markdown-1024-vN
  results: []
datasets:
- BEE-spoke-data/sp500-edgar-10k-markdown
language:
- en
---
|
|
|
|
|
|
# pszemraj/deberta-v3-small-sp500-edgar-10k |
|
|
|
|
|
This model predicts the `ret` column of the [training dataset](https://huggingface.co/datasets/BEE-spoke-data/sp500-edgar-10k-markdown), given its `text` column.
|
|
|
|
|
|
|
<details> |
|
|
|
<summary>Click to expand code example</summary> |
|
|
|
```py
import json

import torch
from huggingface_hub import hf_hub_download
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Model repository on the Hugging Face Hub
model_repo_name = "pszemraj/deberta-v3-small-sp500-edgar-10k"

# Download and load the regression config (stores the target's min/max for rescaling)
regression_config_path = hf_hub_download(
    repo_id=model_repo_name, filename="regression_config.json"
)
with open(regression_config_path, "r") as f:
    regression_config = json.load(f)

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_repo_name)
model = AutoModelForSequenceClassification.from_pretrained(model_repo_name)
model.eval()


def inverse_scale(prediction, config):
    """Map a scaled model output back to the original `ret` range."""
    min_value, max_value = config["min_value"], config["max_value"]
    return prediction * (max_value - min_value) + min_value


def predict(text, tokenizer, model, config, ndigits=4):
    """Run inference on a single text and return the rescaled prediction."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits
    predictions = logits.numpy()
    # Single-output regression head: undo the target scaling for each prediction
    scaled_predictions = [inverse_scale(pred[0], config) for pred in predictions]
    return round(scaled_predictions[0], ndigits)


# Example text
text = "This is an example text for regression prediction."

# Get the prediction
prediction = predict(text, tokenizer, model, regression_config)
print("Predicted Value:", prediction)
```
|
|
|
</details> |
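
The `regression_config.json` file provides the `min_value` and `max_value` used by `inverse_scale` above, which suggests the `ret` target was min-max scaled for training. Below is a minimal sketch of the assumed forward transform; it is inferred from the config keys and is not code shipped with the model.

```py
def scale(ret, config):
    # Assumed min-max scaling of the raw `ret` value into [0, 1],
    # using the same min_value/max_value stored in regression_config.json
    min_value, max_value = config["min_value"], config["max_value"]
    return (ret - min_value) / (max_value - min_value)
```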
|
|
|
## Model description |
|
|
|
This model is a fine-tuned version of [microsoft/deberta-v3-small](https://huggingface.co/microsoft/deberta-v3-small) on the [BEE-spoke-data/sp500-edgar-10k-markdown](https://huggingface.co/datasets/BEE-spoke-data/sp500-edgar-10k-markdown) dataset.
|
|
|
It achieves the following results on the evaluation set:
- Loss: 0.0005
- MSE: 0.0005
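
These values are most likely reported in the scaled target space (assuming the min-max scaling sketched above), not directly in `ret` units. A rough sketch of converting the reported MSE back to the original scale, with placeholder bounds that should be replaced by the real values from `regression_config.json`:

```py
import math

eval_mse_scaled = 0.0005          # reported eval MSE (assumed to be on scaled targets)
min_value, max_value = -0.5, 0.5  # placeholders; read the actual bounds from regression_config.json
# Under min-max scaling, RMSE scales linearly with the target range
rmse_original = math.sqrt(eval_mse_scaled) * (max_value - min_value)
print(f"approx. RMSE in `ret` units: {rmse_original:.4f}")
```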
|
|
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
|
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 30826
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 3.0
- mixed_precision_training: Native AMP
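
A minimal sketch of how these settings map onto `transformers.TrainingArguments` (a reconstruction for illustration only; `output_dir` is a placeholder, and the Adam betas/epsilon are the library defaults):

```py
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="deberta-v3-small-sp500-edgar-10k",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=16,  # 4 x 16 = 64 total train batch size
    num_train_epochs=3.0,
    lr_scheduler_type="linear",
    warmup_ratio=0.05,
    seed=30826,
    fp16=True,  # native AMP mixed precision
)
```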
|
|
|
### Training results |
|
|
|
| Training Loss | Epoch | Step | Validation Loss | MSE    |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 0.0064        | 0.54  | 50   | 0.0006          | 0.0006 |
| 0.0043        | 1.08  | 100  | 0.0005          | 0.0005 |
| 0.0028        | 1.61  | 150  | 0.0006          | 0.0006 |
| 0.0025        | 2.15  | 200  | 0.0005          | 0.0005 |
| 0.0025        | 2.69  | 250  | 0.0005          | 0.0005 |
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.38.0.dev0
- Pytorch 2.2.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.2