---
license: apache-2.0
language:
  - en
---

# **Introduction**
We introduce LUXIA-21.4B-Alignment, a large language model (LLM) with 21.4 billion parameters that performs strongly across a range of natural language processing (NLP) tasks.

It demonstrates state-of-the-art performance among models with fewer than 35B parameters, and it also outperforms the 72B model and the 34Bx2 MoE (Mixture of Experts) model. Please refer to the evaluation results table for details.

The luxia-21.4b-alignment model is derived from the luxia-21.4b-instruct model through DPO training, and the luxia-21.4b-instruct model is an SFT-trained version of the luxia-21.4b model. We plan to release both the pretrained model and the instruction-tuned model soon.


# **Instruction Fine-tuning Strategy**

### luxia-21.4b

We created the base model by expanding the layers of the internlm2-20b-llama model with a passthrough method. We then conducted continual pretraining to recover the performance of the expanded model.
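
As a rough illustration of passthrough layer expansion, the sketch below stacks slices of an existing layer list without averaging or modifying any weights. The 8-layer toy model and the slice boundaries are hypothetical, chosen only to show the mechanics. They are not the configuration used for luxia-21.4b.

```python
from typing import List, Tuple


def passthrough_expand(num_layers: int, slices: List[Tuple[int, int]]) -> List[int]:
    """Return the source-layer index for every layer of the expanded stack.

    Each slice is a half-open range (start, end) over the original layers.
    Layers are copied as-is; overlapping slices duplicate layers.
    """
    expanded: List[int] = []
    for start, end in slices:
        if not (0 <= start < end <= num_layers):
            raise ValueError(f"invalid slice ({start}, {end}) for {num_layers} layers")
        expanded.extend(range(start, end))
    return expanded


# Hypothetical toy setup: an 8-layer model and two overlapping slices.
toy_stack = passthrough_expand(8, [(0, 6), (4, 8)])
print(toy_stack)       # [0, 1, 2, 3, 4, 5, 4, 5, 6, 7]
print(len(toy_stack))  # 10 layers built from an 8-layer model
```

Because the duplicated layers were never trained to follow one another, an expanded model of this kind usually needs further training, which is the role continual pretraining plays here.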

### luxia-21.4b-instruct model
We use state-of-the-art instruction fine-tuning methods, including supervised fine-tuning (SFT).

We used a mixture of the following datasets:
- c-s-ale/alpaca-gpt4-data
- Open-Orca/SlimOrca
- in-house generated data utilizing Metamath


### luxia-21.4b-alignment model
We use state-of-the-art alignment tuning methods, including direct preference optimization (DPO).

We used a mixture of the following datasets:
- jondurbin/truthy-dpo-v0.1
- abacusai/ARC_DPO_FewShot
- abacusai/HellaSwag_DPO_FewShot
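
For readers unfamiliar with DPO, here is a minimal sketch of the standard per-example DPO loss, computed from the summed log-probabilities of a chosen and a rejected response. The `beta` value and the example numbers are assumptions for illustration. The hyperparameters actually used for this model are not stated in this card.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value, not taken from this model card
) -> float:
    """Per-example DPO loss: -log(sigmoid(beta * (chosen_ratio - rejected_ratio)))."""
    # Log-ratios of the policy against the frozen reference model.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch for numerical stability.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# A policy identical to the reference gives log(2).
neutral = dpo_loss(-12.0, -15.0, -12.0, -15.0)
# A policy that favors the chosen response more than the reference gives a lower loss.
improved = dpo_loss(-10.0, -18.0, -12.0, -15.0)
print(neutral, improved)
```

The loss falls as the policy raises the probability of the chosen response relative to the rejected one, measured against the reference model.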

# **Data Contamination Test Results**
We generate our contamination numbers using https://github.com/swj0419/detect-pretrain-code-contamination/tree/master, with internlm2-20b-llama as our reference model.

luxia-21.4b-alignment-v1.2 has the following results:

| Model                                |  ARC  | MMLU    | TruthfulQA | GSM8K  |
|--------------------------------------|-------|---------|------------|--------|
| **luxia-21.4b-alignment-v1.2**       | 0.00  | 0.07    | 0.13       | 0.34   |

### **Open LLM Leaderboard Evaluation Results**
| Model                                |  ARC  | HellaSwag | MMLU    | TruthfulQA | Winogrande | GSM8K  |
|--------------------------------------|-------|-----------|---------|------------|------------|--------|
| **luxia-21.4b-alignment-v1.2**       | 77.73 |   90.86   | 67.86   | 79.16      | 86.27      | 66.94  |
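
To summarize the six scores as a single number, the snippet below computes their unweighted mean. Treating the headline leaderboard score as a plain average of these six benchmarks is an assumption made here, not a claim from the table above.

```python
# Scores copied from the evaluation table above.
scores = {
    "ARC": 77.73,
    "HellaSwag": 90.86,
    "MMLU": 67.86,
    "TruthfulQA": 79.16,
    "Winogrande": 86.27,
    "GSM8K": 66.94,
}

# Unweighted mean across the six benchmarks (assumed aggregation).
average = sum(scores.values()) / len(scores)
print(f"Average: {average:.2f}")  # Average: 78.14
```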

# **Usage Instructions**

### **How to use**
```python
# pip install transformers==4.35.2 accelerate
# (accelerate is required for device_map="auto")
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("saltlux/luxia-21.4b-alignment-v1.2")
model = AutoModelForCausalLM.from_pretrained(
    "saltlux/luxia-21.4b-alignment-v1.2",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
```
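
When planning hardware for the snippet above, it can help to estimate how much memory the weights alone occupy. The sketch below is a back-of-the-envelope calculation based on the 21.4 billion parameter count. It ignores activations, the KV cache, and framework overhead, so actual requirements will be higher.

```python
# Bytes used per parameter for common dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}

NUM_PARAMS = 21.4e9  # parameter count stated in the introduction


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


bf16_gb = weight_memory_gb(NUM_PARAMS, "bfloat16")
fp32_gb = weight_memory_gb(NUM_PARAMS, "float32")
print(f"bfloat16: ~{bf16_gb:.1f} GB, float32: ~{fp32_gb:.1f} GB")
```

Loading in `torch.bfloat16` therefore halves the weight footprint compared with float32, and `device_map="auto"` can spread those weights across the available devices.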

### **License**
- [saltlux/luxia-21.4b-alignment-v1.2](https://huggingface.co/saltlux/luxia-21.4b-alignment-v1.2): apache-2.0


### **Contact Us**
Questions and suggestions are welcome in the discussion tab.