CatMemo: Fine-Tuning Large Language Models for Financial Applications

Model Overview

This model, CatMemo, is fine-tuned using Data Fusion techniques for financial applications. It was developed for the FinLLM Challenge Task and enhances the performance of large language models on finance-specific tasks such as question answering, document summarization, and sentiment analysis.

Key Features

  • Fine-tuned on financial datasets using Supervised Fine-Tuning (SFT).
  • Trained with the Hugging Face TRL (Transformer Reinforcement Learning) library; a minimal SFT sketch follows this list.
  • Specialized for tasks that require domain-specific financial context.
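
For reference, SFT with the TRL library follows the pattern sketched below. This is a minimal sketch assuming a recent TRL release, not the actual training script: the base model ID and dataset name are placeholders, and a dataset with a "text" column is assumed.

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: SFTTrainer's defaults expect a "text" column.
dataset = load_dataset("your-org/financial-sft-corpus", split="train")

trainer = SFTTrainer(
    model="base-model-id",  # placeholder for the base checkpoint being fine-tuned
    train_dataset=dataset,
    args=SFTConfig(output_dir="catmemo-sft"),
)
trainer.train()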

Usage

You can use this model with the Hugging Face Transformers library to perform financial text analysis. Below is a quick example:

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the model and tokenizer
model_name = "zeeshanali01/cryptotunned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tokenize input
inputs = tokenizer("What are the key takeaways from the latest earnings report?", return_tensors="pt")

# Generate output (cap the response length; the library default is very short)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
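
Generation can be tuned further with standard Transformers generate() keyword arguments such as max_new_tokens, do_sample, and temperature.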

Training Details

This model was fine-tuned using Data Fusion methods on domain-specific financial datasets; an illustrative sketch of dataset fusion follows the list below. The training pipeline includes:

  • Preprocessing financial documents and datasets into model-ready training examples.
  • Applying Supervised Fine-Tuning (SFT) to adapt the model to financial NLP tasks.
  • Testing and evaluation on the FinLLM benchmark tasks.
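
The exact fusion procedure is described in the paper rather than on this card. As a rough illustration only, combining several financial datasets with the Hugging Face datasets library could look like the sketch below; every dataset name and mixing ratio here is a hypothetical placeholder.

from datasets import interleave_datasets, load_dataset

# Hypothetical source datasets; interleave_datasets requires matching columns,
# so in practice each source is first mapped to a shared schema (e.g. "text").
qa = load_dataset("your-org/fin-qa-sft", split="train")
news = load_dataset("your-org/fin-news-sft", split="train")

# Sample from both sources in fixed proportions to form the fused mixture.
fused = interleave_datasets([qa, news], probabilities=[0.6, 0.4], seed=42)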

Citation

If you use this model, please cite our work:

@inproceedings{cao2024catmemo,
  title={CatMemo at the FinLLM Challenge Task: Fine-Tuning Large Language Models using Data Fusion in Financial Applications},
  author={Cao, Yupeng and Yao, Zhiyuan and Chen, Zhi and Deng, Zhiyang},
  booktitle={Joint Workshop of the 8th Financial Technology and Natural Language Processing (FinNLP) and the 1st Agent AI for Scenario Planning (AgentScen) in conjunction with IJCAI 2024},
  pages={174},
  year={2024}
}

License

This model is licensed under the Apache 2.0 License. See the LICENSE file for details.

Acknowledgments

We thank the organizers of the FinLLM Challenge Task for providing the benchmark datasets and tasks used to develop this model.


Model Card Metadata

  • License: Apache 2.0
  • Tags: TRL, SFT
  • Library Used: Transformers
  • Format: Safetensors
  • Model size: 3.89B parameters
  • Tensor types: F32, U8