---
library_name: transformers
license: apache-2.0
datasets:
- liswei/Taiwan-Text-Excellence-2B
- liswei/PromptPair-TW
- yentinglin/TaiwanChat
base_model:
- liswei/Taiwan-ELM-1_1B
- apple/OpenELM-1_1B
language:
- zh
pipeline_tag: text-generation
---

<center>
    <img src="https://huggingface.co/liswei/Taiwan-ELM/resolve/main/Taiwan%20ELM%20Logo.jpeg" alt="Efficient LLM for Taiwan">
</center>

> Efficient LLM for Taiwan

# Taiwan ELM

Taiwan ELM is a family of Efficient LLMs for Taiwan based on [apple/OpenELM](https://huggingface.co/apple/OpenELM).
The project aims to provide an efficient model for researchers without access to large-scale computing resources.

The model is trained using a custom fork of [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) on 2B Traditional Chinese tokens and 500K instruction samples.
If there is sufficient demand, we will extend training to larger datasets and different base models.

## What is being released?

We release both pre-trained base models and instruction-tuned variants with 270M and 1.1B parameters.
Along with the models, the datasets used to train the base and instruction-tuned models are also released.

List of released models:
* [Taiwan-ELM-270M](https://huggingface.co/liswei/Taiwan-ELM-270M)
* [Taiwan-ELM-1_1B](https://huggingface.co/liswei/Taiwan-ELM-1_1B)
* [Taiwan-ELM-270M-Instruct](https://huggingface.co/liswei/Taiwan-ELM-270M-Instruct)
* [Taiwan-ELM-1_1B-Instruct](https://huggingface.co/liswei/Taiwan-ELM-1_1B-Instruct)

List of released datasets:
* [liswei/Taiwan-Text-Excellence-2B](https://huggingface.co/datasets/liswei/Taiwan-Text-Excellence-2B)
* [liswei/PromptPair-TW](https://huggingface.co/datasets/liswei/PromptPair-TW)

## Usage Examples

We adapt the LLaMA2 template:
```jinja2
<s>[INST] <<SYS>>
{{ system_prompt }}
<</SYS>>

{{ user_message }} [/INST]
```
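
As a minimal sketch, the template above can be filled in with plain Python string formatting. The helper name `build_prompt` and its arguments are hypothetical and not part of the released code. Note that the template already contains the `<s>` token, so if your tokenizer also prepends a BOS token you may want to tokenize with `add_special_tokens=False` (an assumption about your setup, not something this card specifies).

```python
def build_prompt(user_message: str, system_prompt: str = "") -> str:
    """Fill the LLaMA2-style template shown above (hypothetical helper)."""
    return (
        "<s>[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )


# Example: a Traditional Chinese system prompt and user message.
prompt = build_prompt(
    user_message="請介紹台灣的夜市文化。",
    system_prompt="你是一個樂於助人的助理。",
)
print(prompt)
```

The resulting string can be passed to the tokenizer and then to the model's `generate` method.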

The model can be loaded via `AutoModelForCausalLM` with `trust_remote_code=True`:
```python
from transformers import AutoModelForCausalLM
taiwanelm_270m = AutoModelForCausalLM.from_pretrained("liswei/Taiwan-ELM-270M", trust_remote_code=True)
```

We also support additional generation methods and speculative generation; see [OpenELM#usage](https://huggingface.co/apple/OpenELM#usage) for reference.