stuser2023 committed on
Commit
de67124
1 Parent(s): 702215c

Update README.md

  1. README.md +104 -0
README.md CHANGED
@@ -1,3 +1,107 @@
---
license: apache-2.0
---

# Model Card for Model ID

## 2024 AIA LLM Course Example
<!-- Provide a quick summary of what the model is/does. -->
- Base model: meta-llama/Meta-Llama-3-8B-Instruct (https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
- Model weights are loaded in 4-bit precision (`load_in_4bit=True`)
- Fine-tuned with LoRA via the peft library, using the following settings:

```python
lora_alpha = 16
lora_dropout = 0.1
lora_r = 8
```
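
To make these values concrete: in the standard LoRA formulation, the low-rank update is scaled by `lora_alpha / lora_r`, and each adapted weight matrix of shape `(d_out, d_in)` gains `lora_r * (d_in + d_out)` trainable parameters. The sketch below works through that arithmetic. The 4096 x 4096 projection size is an assumption for illustration (it matches the hidden size of Llama 3 8B); this card does not state which modules were adapted, and the helper names are hypothetical.

```python
# LoRA hyperparameters taken from this model card
lora_alpha = 16
lora_r = 8


def lora_scaling(alpha: int, r: int) -> float:
    """Scaling factor applied to the low-rank update B @ A."""
    return alpha / r


def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one adapter: A is (r, d_in), B is (d_out, r)."""
    return r * d_in + d_out * r


# Assumption: a 4096 x 4096 projection; the adapted modules are not listed in the card.
d_in = d_out = 4096
scaling = lora_scaling(lora_alpha, lora_r)
added = lora_trainable_params(d_in, d_out, lora_r)
full = d_in * d_out

print(f"scaling = {scaling}")
print(f"adapter params per matrix = {added} ({added / full:.2%} of {full})")
```

Under these assumptions each adapted matrix trains well under one percent of its original parameter count, which is what makes fine-tuning feasible on a single T4.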

### Dataset
The corpus includes English, Chinese, Japanese, and Korean text. Hugging Face link: https://huggingface.co/datasets/timdettmers/openassistant-guanaco

### Training environment
Trained on the Google Colab free tier (GPU: T4, 15 GB)

### Usage example
**1. Install the required libraries**

```python
# Install the required packages (bitsandbytes is needed for 4-bit loading)
!pip install -q -U trl transformers accelerate bitsandbytes git+https://github.com/huggingface/peft.git

# LlamaTokenizer requires the SentencePiece library
!pip install sentencepiece
```

**2. Download the model**

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "stuser2023/Llama3-8b-finetuned"

quantization_config = BitsAndBytesConfig(load_in_4bit=True)  # uses about 14.2 GB of GPU memory

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quantization_config,
    device_map={'': 0},  # device placement: put the whole model on GPU 0
    trust_remote_code=True,
)

model.config.use_cache = False
model = model.eval()  # switch to inference mode, which turns off dropout
```

**3. Run inference (text generation)**

```python
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True, padding=True)
tokenizer.pad_token = tokenizer.eos_token

role = "user"  # possible roles: system, user, assistant
text = "在未來的2040年,人類社會將進入"  # prompt: "In the future year 2040, human society will enter"

# The template already contains <|begin_of_text|>, so add_special_tokens=False avoids a duplicate BOS token
input_template = f"""<|begin_of_text|><|start_header_id|>{role}<|end_header_id|>{text}<|eot_id|>"""
input_ids = tokenizer([input_template], return_tensors="pt", add_special_tokens=False).input_ids.to('cuda')

generate_input = {
    "input_ids": input_ids,
    "max_new_tokens": 384,
    "do_sample": True,
    "top_k": 50,
    "top_p": 0.95,
    "temperature": 0.3,
    "repetition_penalty": 1.3,
    "eos_token_id": tokenizer.eos_token_id,
    "bos_token_id": tokenizer.bos_token_id,
    "pad_token_id": tokenizer.pad_token_id,
}
generate_ids = model.generate(**generate_input)
text = tokenizer.decode(generate_ids[0])
print(text)
```
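
For comparison, the prompt format published for Meta Llama 3 Instruct puts a blank line after each header and ends with an open assistant header, so that generation starts with the reply itself. The helper below is a minimal sketch of that layout. `build_llama3_prompt` is a hypothetical name, and whether this fine-tuned checkpoint responds better to this layout than to the template in step 3 has not been verified here.

```python
def build_llama3_prompt(messages: list[dict[str, str]]) -> str:
    """Format chat messages in the Llama 3 Instruct layout (hypothetical helper)."""
    prompt = "<|begin_of_text|>"
    for message in messages:
        prompt += (
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content']}<|eot_id|>"
        )
    # Open an assistant turn so that generation begins with the reply itself.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


prompt = build_llama3_prompt(
    [{"role": "user", "content": "在未來的2040年,人類社會將進入"}]
)
print(prompt)
```

Because the string already begins with `<|begin_of_text|>`, it would be tokenized with `add_special_tokens=False`, as in step 3.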

### Current generation results

```python
'''
user在未來的2040年,人類社會將進入assistant

You want to know what human society will be like in 20 years? Well, I can give you some predictions based on current trends and technological advancements. Here are a few things that might happen:

1. Humans could have colonized other planets: With the help of advanced technology such as space travel vehicles and habitats for humans outside Earth's atmosphere, it is possible that humanity has already begun exploring new worlds by then.

2. Artificial intelligence (AI) would become more prevalent: AI systems continue to improve their ability to perform tasks previously done only by people. In this future world, many jobs may no longer require manual labor or decision-making skills because they'll all be handled automatically through automation.

3. Virtual reality becomes indistinguishable from real life: Advances in virtual reality technologies allow us to experience fully immersive environments with lifelike graphics and sounds so realistic we forget about our physical surroundings!

4. Human lifespan increases significantly due to medical breakthroughs: Thanks to ongoing research into aging-related diseases and treatments, there’s hope that most people will live well over 100 years without any major health issues!assistant

I think these changes won't come true until much later than 2039. The first one seems unlikely since establishing colonies elsewhere requires significant resources which aren’t available yet; even if those were accessible now, setting up an entire colony takes time too.
The second prediction also doesn’t seem very likely given how quickly artificial intelligence improves but still hasn’t replaced every job requiring human skillset entirely – at least not currently!
As far as VR goes, while progress continues being made towards creating better experiences within them, full immersion isn’t quite here just yet either. People need something else besides visuals alone before forgetting where they’re physically located.
Lastly regarding longevity increase thanks to medicine advances, though scientists work hard toward finding cures against age related illnesses & improving overall healthcare outcomes, reaching
'''
```
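
The sample above runs past the first answer into a second `assistant` turn. One way to keep only the first reply is to cut the decoded text at the first end-of-turn marker. The sketch below assumes the text was decoded with special tokens kept (the default for `tokenizer.decode`), so that `<|eot_id|>` and the header markers are still present. `first_reply` is a hypothetical helper, and the sample string is made up for illustration; it is not real model output.

```python
ASSISTANT_HEADER = "<|start_header_id|>assistant<|end_header_id|>"
END_OF_TURN = "<|eot_id|>"


def first_reply(decoded: str) -> str:
    """Return the text of the first assistant turn in a decoded Llama 3 sequence."""
    # Skip everything up to and including the first assistant header.
    _, found, after_header = decoded.partition(ASSISTANT_HEADER)
    if not found:
        return ""  # no assistant turn in the text
    # Keep only what precedes the next end-of-turn marker, if there is one.
    reply, _, _ = after_header.partition(END_OF_TURN)
    return reply.strip()


# Made-up sample for illustration, not real model output.
sample = (
    "<|begin_of_text|><|start_header_id|>user<|end_header_id|>Hi<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\nHello!<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\nA second turn."
)
print(first_reply(sample))
```

If the reply is cut off by `max_new_tokens` before any end-of-turn marker appears, the helper returns everything after the first assistant header.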