	---
	base_model: meta-llama/Llama-2-7b-hf
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	- f1
	model-index:
	- name: Llama-2-7b-hf-IDMGSP
	  results: []
	license: mit
	datasets:
	- tum-nlp/IDMGSP
	language:
	- en
	library_name: transformers
	---


	# Llama-2-7b-hf-IDMGSP

	This model is a LoRA adapter for [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf), fine-tuned on the [tum-nlp/IDMGSP](https://huggingface.co/datasets/tum-nlp/IDMGSP) dataset.
	It achieves the following results on the evaluation split:
	- Loss: 0.1450
	- Accuracy: 0.9759036144578314
	- F1: 0.9758125472411187
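
For reference, accuracy and F1 for the two labels can be computed from predictions as in the minimal sketch below. It assumes binary F1 with label `1` (AI generated) as the positive class; the card does not state which averaging was used, and the toy labels are illustrative only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def binary_f1(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (illustrative labels, not taken from IDMGSP)
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
print(accuracy(y_true, y_pred))
print(binary_f1(y_true, y_pred))
```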

	## Model description

	The base model was loaded in 4-bit quantization mode and fine-tuned using LoRA.
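
To give a sense of how small the adapter is relative to the base model, the sketch below counts LoRA parameters for rank `r=8` (the value in the inference code on this card). The hidden size of 4096, the 32 decoder layers, and the choice of `q_proj` and `v_proj` as target modules are assumptions based on the published Llama-2-7b architecture and common PEFT defaults; the card itself does not list the target modules.

```python
def lora_param_count(d_in, d_out, r):
    """A LoRA update W + B @ A adds A (r x d_in) and B (d_out x r)."""
    return r * d_in + d_out * r


HIDDEN_SIZE = 4096      # assumption: Llama-2-7b hidden size
NUM_LAYERS = 32         # assumption: Llama-2-7b decoder layers
TARGETS_PER_LAYER = 2   # assumption: q_proj and v_proj are adapted
RANK = 8                # from the LoraConfig shown on this card

per_module = lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
total = per_module * TARGETS_PER_LAYER * NUM_LAYERS
print(per_module, total)  # 65536 4194304
```

Under these assumptions the adapter adds roughly 4.2 million trainable weights, excluding the classification head.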

	## Intended uses & limitations

	The model classifies text as human-written or AI-generated. Labels: `0` is non-AI generated, `1` is AI generated.

	Code to run inference:

	```python
	import torch
	import transformers
	from peft import LoraConfig, TaskType

	# Label mapping described above: 0 = non-AI generated, 1 = AI generated
	id2label = {0: "non-AI generated", 1: "AI generated"}


	class Model:
	    def __init__(self, name) -> None:
	        # Model or adapter identifier on the Hugging Face Hub
	        self.name = name

	        # Tokenizer: Llama has no pad token, so reuse the EOS token
	        self.tokenizer = transformers.LlamaTokenizer.from_pretrained(self.name)
	        self.tokenizer.pad_token = self.tokenizer.eos_token
	        print(f"Tokenizer: {self.tokenizer.eos_token}; Pad {self.tokenizer.pad_token}")

	        # Model: load in 4-bit NF4 quantization with double quantization
	        bnb_config = transformers.BitsAndBytesConfig(
	            load_in_4bit=True,
	            bnb_4bit_use_double_quant=True,
	            bnb_4bit_quant_type="nf4",
	            bnb_4bit_compute_dtype="bfloat16",
	        )
	        # LoRA configuration (not applied in this inference snippet)
	        self.peft_config = LoraConfig(
	            task_type=TaskType.SEQ_CLS, r=8, lora_alpha=16, lora_dropout=0.05, bias="none"
	        )
	        self.model = transformers.LlamaForSequenceClassification.from_pretrained(
	            self.name,
	            num_labels=2,
	            quantization_config=bnb_config,
	            device_map="auto",
	        )
	        self.model.config.pad_token_id = self.model.config.eos_token_id

	    def tokenize(self, text):
	        # Assumed helper: encode one string as PyTorch tensors on the model's device
	        inputs = self.tokenizer(text, return_tensors="pt")
	        return inputs.to(self.model.device)

	    def predict(self, text):
	        inputs = self.tokenize(text)
	        # Inference only, so gradients are not needed
	        with torch.no_grad():
	            outputs = self.model(**inputs)
	        logits = outputs.logits
	        predictions = torch.argmax(logits, dim=-1)
	        return id2label[predictions.item()]
	```
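
`predict` returns only the arg-max label. If a confidence score is also wanted, the two logits can be passed through a softmax, as in this dependency-free sketch. The logit values and the `id2label` strings are hypothetical; they mirror the label convention stated above.

```python
import math

# Label convention from this card: 0 = non-AI generated, 1 = AI generated
id2label = {0: "non-AI generated", 1: "AI generated"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_confidence(logits):
    """Return the arg-max label and its softmax probability."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return id2label[idx], probs[idx]


# Hypothetical logits for one input text
label, confidence = label_with_confidence([-1.2, 2.3])
print(label, round(confidence, 3))
```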


	## Training and evaluation data

	Trained and evaluated on the `classifier_input` subset of the [tum-nlp/IDMGSP](https://huggingface.co/datasets/tum-nlp/IDMGSP) dataset.

	## Training procedure

	### Training hyperparameters

	BitsAndBytes and LoRA config parameters:

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/638f0f9ab0525fa370479467/XI1imFyXmzFjCGCkBYClc.png)

	GPU VRAM consumption during fine-tuning: 30.6 GB

	The following hyperparameters were used during training:
	- learning_rate: 0.0001
	- train_batch_size: 32
	- eval_batch_size: 32
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_ratio: 0.1
	- lr_scheduler_warmup_steps: 500
	- num_epochs: 5
	- mixed_precision_training: Native AMP
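
The linear scheduler with warmup can be sketched as follows. The sketch assumes that the 500 warmup steps take effect (in the Hugging Face `Trainer`, a non-zero `warmup_steps` overrides `warmup_ratio`) and that training ran for 2490 optimizer steps, the final step count in the results table. It follows the usual linear warmup and linear decay shape and is not the exact training code.

```python
def linear_lr(step, peak_lr=1e-4, warmup_steps=500, total_steps=2490):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Sample the schedule at the start, mid-warmup, peak, mid-decay, and end
for step in (0, 250, 500, 1495, 2490):
    print(step, linear_lr(step))
```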

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
	|:-------------:|:-----:|:----:|:---------------:|:------------------:|:------------------:|
	| 0.0766 | 1.0 | 498 | 0.1165 | 0.9614708835341366 | 0.9612813721780804 |
	| 0.182 | 2.0 | 996 | 0.0934 | 0.9657379518072289 | 0.9648059816939539 |
	| 0.037 | 3.0 | 1494 | 0.1190 | 0.9716365461847389 | 0.9710182097973841 |
	| 0.0349 | 4.0 | 1992 | 0.1884 | 0.96875 | 0.9692326702088224 |
	| 0.0046 | 5.0 | 2490 | 0.1450 | 0.9759036144578314 | 0.9758125472411187 |
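
The table shows that validation loss and accuracy do not peak at the same epoch. The short sketch below, which simply re-enters the values from the table, makes that explicit. It illustrates how one might compare checkpoints and is not a statement about how the released checkpoint was chosen.

```python
# (epoch, validation_loss, accuracy, f1), copied from the table above
results = [
    (1, 0.1165, 0.9614708835341366, 0.9612813721780804),
    (2, 0.0934, 0.9657379518072289, 0.9648059816939539),
    (3, 0.1190, 0.9716365461847389, 0.9710182097973841),
    (4, 0.1884, 0.96875, 0.9692326702088224),
    (5, 0.1450, 0.9759036144578314, 0.9758125472411187),
]

best_by_loss = min(results, key=lambda row: row[1])
best_by_accuracy = max(results, key=lambda row: row[2])
best_by_f1 = max(results, key=lambda row: row[3])

print("lowest validation loss: epoch", best_by_loss[0])
print("highest accuracy: epoch", best_by_accuracy[0])
print("highest F1: epoch", best_by_f1[0])
```

The lowest validation loss is at epoch 2, while the highest accuracy and F1 are at epoch 5, whose values match the evaluation results reported at the top of this card.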


	### Framework versions

	- Transformers 4.35.0
	- PyTorch 2.0.1
	- Datasets 2.14.6
	- Tokenizers 0.14.1