---
license: apache-2.0
base_model: Qwen/Qwen1.5-0.5B-Chat
tags:
- trl
- orpo
model-index:
- name: Qwen-Orpo-v1
  results: []
---

## FINGU-AI/Qwen-Orpo-v1

### Overview

The FINGU-AI/Qwen-Orpo-v1 model offers a specialized curriculum tailored to English speakers interested in finance, investment, and legal frameworks. It aims to enhance language proficiency while providing insights into global financial markets and regulatory landscapes.

### Key Features

- **Global Perspective**: Explores diverse financial markets and regulations across English, Korean, and Japanese contexts.
- **Language Proficiency**: Enhances language skills in English, Korean, and Japanese for effective communication in finance and legal domains.
- **Career Advancement**: Equips learners with knowledge and skills for roles in investment banking, corporate finance, asset management, and regulatory compliance.

### Model Information

- **Model Name**: FINGU-AI/Qwen-Orpo-v1
- **Description**: A model trained on multiple languages, including English.
- **Checkpoint**: FINGU-AI/Qwen-Orpo-v1
- **Author**: Grinda AI Inc.
- **License**: Apache-2.0

### Training Details

- **Fine-Tuning**: The model was fine-tuned from the base model Qwen/Qwen1.5-0.5B-Chat with ORPO (Odds Ratio Preference Optimization), using the TRL and Transformers libraries.
- **Dataset**: The fine-tuning dataset consisted of 28k training samples.
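
To make the ORPO objective mentioned above more concrete, here is a minimal pure-Python sketch of the loss for a single preference pair. It is an illustration of the general ORPO formulation (a supervised term on the chosen response plus a weighted odds-ratio penalty), not the exact training code used for this model. The function names, the use of length-averaged log-probabilities as inputs, and the weight `lam=0.1` are assumptions made for the example.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp), where avg_logp < 0."""
    # log(1 - p) is computed as log1p(-p) for better numerical behavior.
    return avg_logp - math.log1p(-math.exp(avg_logp))


def odds_ratio_penalty(chosen_avg_logp: float, rejected_avg_logp: float) -> float:
    """Return -log(sigmoid(log odds ratio)) between chosen and rejected responses."""
    log_or = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # -log(sigmoid(x)) equals log(1 + exp(-x)).
    return math.log1p(math.exp(-log_or))


def orpo_loss(chosen_avg_logp: float, rejected_avg_logp: float, lam: float = 0.1) -> float:
    """Supervised loss on the chosen response plus the weighted odds-ratio penalty.

    lam is a hypothetical weight chosen for illustration only.
    """
    sft_loss = -chosen_avg_logp  # negative log-likelihood of the chosen response
    return sft_loss + lam * odds_ratio_penalty(chosen_avg_logp, rejected_avg_logp)


# Hypothetical average token log-probabilities for one preference pair.
print(orpo_loss(chosen_avg_logp=-0.5, rejected_avg_logp=-2.0))
```

The penalty shrinks as the model assigns higher odds to the chosen response than to the rejected one, so a single fine-tuning pass can both fit the preferred answers and push probability away from the rejected ones.
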

### How to Use

To use the FINGU-AI/Qwen-Orpo-v1 model, load it with the Hugging Face Transformers library.
The following Python snippet loads the model and generates a response:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_id = 'FINGU-AI/Qwen-Orpo-v1'

# flash_attention_2 requires the flash-attn package and a supported GPU.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
streamer = TextStreamer(tokenizer)  # prints tokens to stdout as they are generated
model.to('cuda')

messages = [
    {"role": "system", "content": "You are a finance specialist. Help the user and provide accurate information."},
    {"role": "user", "content": "What is the best approach to prevent losses?"},
]
# Build the prompt with the model's chat template and move it to the GPU.
tokenized_chat = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to("cuda")

generation_params = {
    'max_new_tokens': 1000,
    'use_cache': True,
    'do_sample': True,
    'temperature': 0.7,
    'top_p': 0.9,
    'top_k': 50,
}

outputs = model.generate(tokenized_chat, **generation_params, streamer=streamer)
# `outputs` holds the prompt tokens followed by the generated tokens.
decoded_outputs = tokenizer.batch_decode(outputs)
```
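
For readers wondering what `apply_chat_template` does with the `messages` list, the sketch below renders a conversation in the ChatML-style layout. It assumes this checkpoint inherits a ChatML-style template from its Qwen1.5 chat base model; the authoritative template is the one stored with the tokenizer. The helper `render_chatml` is hypothetical and exists only for illustration.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render chat messages as a ChatML-style prompt string (illustrative only)."""
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # An open assistant turn tells the model to start its reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


example_messages = [
    {"role": "system", "content": "You are a finance specialist."},
    {"role": "user", "content": "What is the best approach to prevent losses?"},
]
print(render_chatml(example_messages))
```

In practice, prefer `tokenizer.apply_chat_template`, which always uses the template shipped with the checkpoint.
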