Tags: Text Generation · Transformers · PyTorch · Chinese · llama · text-generation-inference · Inference Endpoints
Commit 112e9f0 committed by yentinglin
1 parent: c04334a

Update README.md

Files changed (1):
  1. README.md +11 -11
README.md CHANGED
@@ -30,16 +30,16 @@ pipeline_tag: text-generation
 
 
 ## Overview
-Taiwan-LLaMa is a full parameter fine-tuned model based on LLaMa 2 for traditional chinese applications.
+Taiwan-LLaMa is a full parameter fine-tuned model based on LLaMa 2 for Traditional Mandarin applications.
 
-**Taiwan-LLaMa v1.0** pretrained on over 5 billion tokens and instruction-tuned on over 490k conversations both in traditional chinese.
+**Taiwan-LLaMa v1.0** pretrained on over 5 billion tokens and instruction-tuned on over 490k conversations both in traditional mandarin.
 
 ## Demo
 A live demonstration of the model can be accessed at [Hugging Face Spaces](https://huggingface.co/spaces/yentinglin/Taiwan-LLaMa2).
 
 ## Key Features
 
-1. **Traditional Chinese Support**: The model is fine-tuned to understand and generate text in Traditional Chinese, making it suitable for Taiwanese culture and related applications.
+1. **Traditional Mandarin Support**: The model is fine-tuned to understand and generate text in Traditional Mandarin, making it suitable for Taiwanese culture and related applications.
 
 2. **Instruction-Tuned**: Further fine-tuned on conversational data to offer context-aware and instruction-following responses.
 
@@ -49,8 +49,8 @@ A live demonstration of the model can be accessed at [Hugging Face Spaces](https
 
 
 ## Work in progress
-- [ ] **Improved Pretraining**: A refined version of the existing pretraining approach is under development, aiming to enhance model performance.
-- [ ] **Extended Model Length**: Utilizing the Rope mechanism, the model's length will be extended from 4k to 8k.
+- [ ] **Improved pretraining**: A refined pretraining process (e.g. more data from Taiwan, training strategies) is under development, aiming to enhance model performance for better Taiwanese culture.
+- [ ] **Extend max length**: Utilizing the Rope mechanism as described in [the paper](https://arxiv.org/abs/2104.09864), the model's length will be extended from 4k to 8k.
 
 
 ## Taiwanese Culture Examples
@@ -72,7 +72,7 @@ We provide a number of model checkpoints that we trained. Please find them on Hu
 |--------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------|
 | **Taiwan-LLaMa v1.0** (_better for Taiwanese Culture_) | 🤗 <a href="https://huggingface.co/yentinglin/Taiwan-LLaMa-v1.0" target="_blank">yentinglin/Taiwan-LLaMa-v1.0</a> |
 | Taiwan-LLaMa v0.9 (partial instruction set) | 🤗 <a href="https://huggingface.co/yentinglin/Taiwan-LLaMa-v0.9" target="_blank">yentinglin/Taiwan-LLaMa-v0.9</a> |
-| Taiwan-LLaMa v0.0 (no Traditional Chinese pretraining) | 🤗 <a href="https://huggingface.co/yentinglin/Taiwan-LLaMa-v0.0" target="_blank">yentinglin/Taiwan-LLaMa-v0.0</a> |
+| Taiwan-LLaMa v0.0 (no Traditional Mandarin pretraining) | 🤗 <a href="https://huggingface.co/yentinglin/Taiwan-LLaMa-v0.0" target="_blank">yentinglin/Taiwan-LLaMa-v0.0</a> |
 
 ## Data
 
@@ -80,8 +80,8 @@ Here are some quick links to the datasets that we used to train the models:
 
 | **Dataset** | **Link** |
 |---------------------------------|-------------------------------------------------------------------------------------------------------------------------------|
-| **Instruction-tuning** | 🤗 <a href="https://huggingface.co/datasets/yentinglin/traditional_chinese_instructions" target="_blank">yentinglin/traditional_chinese_instructions</a> |
-| Traditional Chinese Pretraining | 🤗 <a href="https://huggingface.co/datasets/yentinglin/zh_TW_c4" target="_blank">yentinglin/zh_TW_c4</a> |
+| **Instruction-tuning** | 🤗 <a href="https://huggingface.co/datasets/yentinglin/traditional_mandarin_instructions" target="_blank">yentinglin/traditional_mandarin_instructions</a> |
+| Traditional Mandarin Pretraining | 🤗 <a href="https://huggingface.co/datasets/yentinglin/zh_TW_c4" target="_blank">yentinglin/zh_TW_c4</a> |
 
 
 ## Architecture
@@ -89,12 +89,12 @@ Taiwan-LLaMa is based on LLaMa 2, leveraging transformer architecture, <a href="
 
 It includes:
 
-* Pretraining Phase: Pretrained on a vast corpus of over 5 billion tokens, extracted from common crawl in Traditional Chinese.
+* Pretraining Phase: Pretrained on a vast corpus of over 5 billion tokens, extracted from common crawl in Traditional Mandarin.
 * Fine-tuning Phase: Further instruction-tuned on over 490k multi-turn conversational data to enable more instruction-following and context-aware responses.
 
 ## Generic Capabilities on Vicuna Benchmark
 
-The data is translated into traditional Chinese for evaluating the general capability.
+The data is translated into traditional mandarin for evaluating the general capability.
 
 
 <img src="./images/zhtw_vicuna_bench_chatgptbaseline.png" width="700">
@@ -157,7 +157,7 @@ If you use our code, data, or models in your research, please cite this reposito
 ```
 
 ## Collaborate With Us
-If you are interested in contributing to the development of Traditional Chinese language models, exploring new applications, or leveraging Taiwan-LLaMa for your specific needs, please don't hesitate to contact us. We welcome collaborations from academia, industry, and individual contributors.
+If you are interested in contributing to the development of Traditional Mandarin language models, exploring new applications, or leveraging Taiwan-LLaMa for your specific needs, please don't hesitate to contact us. We welcome collaborations from academia, industry, and individual contributors.
 
 ## License
 The code in this project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.
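For convenience, below is a minimal usage sketch for the Taiwan-LLaMa v1.0 checkpoint referenced in this README, using the Hugging Face Transformers text-generation pipeline. The prompt string, dtype, and generation settings are illustrative assumptions rather than the authors' recommended template; see the Hugging Face Spaces demo linked in the README for the intended prompting format.

```python
# Minimal sketch (illustrative, not part of this commit): load the
# Taiwan-LLaMa v1.0 checkpoint and generate a short completion.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "yentinglin/Taiwan-LLaMa-v1.0"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce GPU memory use
    device_map="auto",          # requires `accelerate`; places weights on available devices
)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Example Traditional Chinese prompt: "Please briefly introduce Taiwan's night market culture."
prompt = "請簡單介紹台灣的夜市文化。"
print(generator(prompt, max_new_tokens=128, do_sample=True, temperature=0.7)[0]["generated_text"])
```

Loading in half precision with `device_map="auto"` is a common default for LLaMA-class checkpoints; adjust the dtype and device placement to your hardware.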