---
library_name: peft
license: llama3.2
base_model: meta-llama/Llama-3.2-1B
tags:
- generated_from_trainer
datasets:
- scitldr
model-index:
- name: Llama-3.2-1B-Summarization-LoRa
  results: []
pipeline_tag: summarization
---

# Llama-3.2-1B-Summarization-LoRa

This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on the scitldr dataset.
It achieves the following results on the evaluation set:
- Loss: 2.5661

## Model description

A LoRA fine-tuned version of meta-llama/Llama-3.2-1B for summarization of scientific documents, trained on the scitldr dataset.

## Intended uses & limitations

Intended for summarization of scientific documents, as in the scitldr training data; performance on other domains or tasks is not reported in this card.
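
Because this repository contains a PEFT (LoRA) adapter rather than full model weights, inference requires loading the base model first and then attaching the adapter. The following is a minimal sketch, assuming a placeholder adapter repo id and an ad-hoc prompt format (the prompt template used during training is not documented in this card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-1B"
adapter_id = "Llama-3.2-1B-Summarization-LoRa"  # placeholder: local path or Hub repo id of this adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the LoRA weights
model.eval()

abstract = "..."  # the scientific abstract to summarize
prompt = f"Summarize the following abstract:\n{abstract}\nSummary:"  # assumed prompt format
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```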

## Training and evaluation data

The model was fine-tuned and evaluated on the scitldr dataset of scientific paper summaries; the evaluation loss reported above was computed on its validation split.
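
For reference, SciTLDR is publicly available on the Hugging Face Hub. The sketch below assumes the `allenai/scitldr` Hub id and its `Abstract` configuration; the exact configuration, splits, and preprocessing used for this fine-tune are not documented here:

```python
from datasets import load_dataset

# Assumed dataset id/config; field names follow the public SciTLDR release.
dataset = load_dataset("allenai/scitldr", "Abstract")
example = dataset["train"][0]
source = " ".join(example["source"])  # abstract, stored as a list of sentences
target = example["target"][0]         # one of the reference TLDR summaries
print(source[:200], "->", target)
```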

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: paged_adamw_32bit (betas=(0.9, 0.999), epsilon=1e-08, no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- num_epochs: 2
- mixed_precision_training: Native AMP
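
A minimal sketch of a `transformers`/`peft` configuration matching the hyperparameters above. The LoRA rank, alpha, dropout, and target modules are assumptions, since they are not listed in this card; only the Trainer settings are taken from the list:

```python
from transformers import TrainingArguments
from peft import LoraConfig

# Illustrative LoRA settings (not documented in this card).
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Trainer hyperparameters taken from the list above; evaluation every 200 steps
# matches the results table below.
training_args = TrainingArguments(
    output_dir="Llama-3.2-1B-Summarization-LoRa",
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    optim="paged_adamw_32bit",
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=2,
    fp16=True,  # mixed precision (Native AMP)
    eval_strategy="steps",
    eval_steps=200,
    logging_steps=200,
)
```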

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.45          | 0.2008 | 200  | 2.5272          |
| 2.4331        | 0.4016 | 400  | 2.5327          |
| 2.4369        | 0.6024 | 600  | 2.5285          |
| 2.4315        | 0.8032 | 800  | 2.5238          |
| 2.4303        | 1.0040 | 1000 | 2.5181          |
| 2.1077        | 1.2048 | 1200 | 2.5525          |
| 2.0951        | 1.4056 | 1400 | 2.5611          |
| 2.0738        | 1.6064 | 1600 | 2.5591          |
| 2.0539        | 1.8072 | 1800 | 2.5661          |


### Framework versions

- PEFT 0.13.2
- Transformers 4.46.2
- Pytorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3