---
license: mit
language:
- en
datasets:
- onurkeles/econ_paper_abstracts
---
# LLaMA-2-Econ: Abstract Classification Model
## Model Description
This abstract classification model, a derivative of LLaMA-2-7B, is fine-tuned to classify abstracts of economics research papers. Using Quantized Low-Rank Adaptation (QLoRA) and Parameter-Efficient Fine-Tuning (PEFT), it categorizes papers into subfields such as econometrics, general economics, and theoretical economics.
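For convenience, a minimal inference sketch is shown below. The repository id, prompt template, and label wording are assumptions, not taken from this card; adjust them to match the actual fine-tuning setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repository id; replace with the actual model repo.
model_id = "onurkeles/llama-2-econ-abstract-classification"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Prompt format is an assumption; adjust to the template used during fine-tuning.
abstract = "We propose a new estimator for dynamic panel data models ..."
prompt = (
    "Classify the following economics abstract into a subfield "
    "(econometrics, general economics, or theoretical economics).\n"
    f"Abstract: {abstract}\nSubfield:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=10, do_sample=False)

# Decode only the newly generated tokens (the predicted subfield).
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```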
## Intended Uses & Limitations
The primary use of this model is to support academic researchers, librarians, and database managers in organizing and categorizing economic research papers more effectively. It can automatically assign relevant categories to papers based on their abstracts, reducing the manual workload and improving the accessibility of economic literature.
**Limitations:**
- The accuracy of classification may decrease for abstracts that cover interdisciplinary topics or do not clearly align with defined categories.
- The model's performance is dependent on the diversity and representativeness of the training data. Abstracts outside the scope of the training dataset may not be classified accurately.
## Training and Evaluation Data
The model was trained on a diverse dataset of economic paper abstracts sourced through the arXiv API. This dataset includes papers from a broad range of economic subfields, ensuring comprehensive coverage and fostering the model's ability to generalize across various topics within economics. Each paper in the dataset is manually categorized into one of several predefined categories, such as econometrics, general economics, or theoretical economics, to provide a ground truth for training and evaluation.
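As an illustration of how such a dataset could be collected, the sketch below queries the arXiv API for the three economics categories (`econ.EM`, `econ.GN`, `econ.TH`) and uses the arXiv category as a stand-in label. The card's dataset was categorized manually, so this is only an approximation of the collection step.

```python
import feedparser  # pip install feedparser

# arXiv economics categories mapped to the subfield names used in this card.
CATEGORIES = {
    "econ.EM": "econometrics",
    "econ.GN": "general economics",
    "econ.TH": "theoretical economics",
}
BASE_URL = (
    "http://export.arxiv.org/api/query"
    "?search_query=cat:{cat}&start=0&max_results=100"
)

records = []
for cat, label in CATEGORIES.items():
    feed = feedparser.parse(BASE_URL.format(cat=cat))
    for entry in feed.entries:
        records.append({
            "abstract": entry.summary.replace("\n", " "),
            "label": label,  # stand-in label; the card's dataset was labeled manually
        })

print(f"Collected {len(records)} abstracts")
```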
## Training Procedure
The model was fine-tuned with a combination of QLoRA and PEFT to optimize performance while minimizing computational requirements. Training focused on sharpening the model's ability to discern subtle distinctions between economic subfields based on the content of the abstracts; a configuration sketch follows the hyperparameter list below.
**Training Hyperparameters:**
- QLoRA Settings:
  - `lora_r` (LoRA rank): 64
  - `lora_dropout`: 0.1
- Precision & Quantization:
  - Precision: 4-bit
  - Computation dtype: float16
  - Quantization type: "nf4", with nested quantization
- Training Schedule:
  - Epochs: 8, with early stopping (patience of 2 epochs) for efficiency
  - bf16 training enabled
- Optimizer & Learning Rate:
  - Optimizer: paged AdamW with 32-bit precision
  - Learning rate: 2e-4, with a cosine learning-rate scheduler
  - Warmup ratio: 0.03
- Additional Settings:
  - Gradient checkpointing and a maximum gradient norm of 0.3
  - Sequences grouped by length for training efficiency
  - PEFT adapters merged into the base model after training
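The settings above map onto `transformers`/`peft` configuration objects roughly as follows. This is a sketch reconstructed from the listed hyperparameters, not the authors' training script; values not stated in the card (e.g. `lora_alpha`, `task_type`, the output path) are assumptions.

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit "nf4" quantization with nested (double) quantization, float16 compute.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,  # nested quantization
)

# LoRA settings from the card; lora_alpha and task_type are assumptions.
peft_config = LoraConfig(
    r=64,
    lora_dropout=0.1,
    lora_alpha=16,          # not stated in the card
    task_type="CAUSAL_LM",
)

# Schedule and optimizer settings from the card; output_dir is hypothetical.
training_args = TrainingArguments(
    output_dir="./llama-2-econ-cls",
    num_train_epochs=8,
    bf16=True,
    optim="paged_adamw_32bit",
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    group_by_length=True,
)
# Early stopping (patience of 2 epochs) would additionally require an evaluation
# strategy plus transformers.EarlyStoppingCallback; after training, the PEFT
# adapters can be merged into the base model via PeftModel.merge_and_unload().
```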
## Evaluation Results
- Accuracy: 0.88
- Precision: 0.88
- Recall: 0.88
- F1 Score: 0.88
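A sketch of how such aggregate scores could be computed with scikit-learn is shown below; the averaging scheme (`weighted`) and the toy labels are assumptions, not taken from the card.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Toy labels purely for illustration.
y_true = ["econometrics", "general economics", "theoretical economics", "econometrics"]
y_pred = ["econometrics", "general economics", "econometrics", "econometrics"]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0
)
print(f"Accuracy {accuracy:.2f}  Precision {precision:.2f}  "
      f"Recall {recall:.2f}  F1 {f1:.2f}")
```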
## Citation
- Keleş, O., & Bayraklı, Ö. T. (Forthcoming 2024, May). LLaMA-2-Econ: Enhancing Title Generation, Classification, and Academic Q&A in Economic Research. To be presented at the 4th Workshop on ECONLP, LREC-COLING 2024, Turin, Italy.