wwe180 committed
Commit 5b6fbce
1 Parent(s): f41631c

Update README.md

Files changed (1):
  1. README.md +34 -44
README.md CHANGED
@@ -1,56 +1,46 @@
  ---
  base_model:
- - NousResearch/Hermes-2-Theta-Llama-3-8B
- - camillop/Meta-Llama-3-8B-ORPO-ITA-llama-adapters
- - gradientai/Llama-3-8B-Instruct-Gradient-1048k
- - migtissera/Llama-3-8B-Synthia-v3.5
- - unstoppable123/LLaMA3-8B_chinese_lora_sft_v0.2
- - openchat/openchat-3.6-8b-20240522
- - hfl/llama-3-chinese-8b-instruct-v2-lora
- - Sao10K/L3-8B-Stheno-v3.1
- - Jiar/Llama-3-8B-Chinese
  library_name: transformers
  tags:
  - mergekit
  - merge
-
  ---
  # merge

  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

- ## Merge Details
- ### Merge Method
-
- This model was merged using the passthrough merge method using [gradientai/Llama-3-8B-Instruct-Gradient-1048k](https://huggingface.co/gradientai/Llama-3-8B-Instruct-Gradient-1048k) as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [NousResearch/Hermes-2-Theta-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B) + [camillop/Meta-Llama-3-8B-ORPO-ITA-llama-adapters](https://huggingface.co/camillop/Meta-Llama-3-8B-ORPO-ITA-llama-adapters)
- * [migtissera/Llama-3-8B-Synthia-v3.5](https://huggingface.co/migtissera/Llama-3-8B-Synthia-v3.5) + [unstoppable123/LLaMA3-8B_chinese_lora_sft_v0.2](https://huggingface.co/unstoppable123/LLaMA3-8B_chinese_lora_sft_v0.2)
- * [openchat/openchat-3.6-8b-20240522](https://huggingface.co/openchat/openchat-3.6-8b-20240522) + [hfl/llama-3-chinese-8b-instruct-v2-lora](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-v2-lora)
- * [Sao10K/L3-8B-Stheno-v3.1](https://huggingface.co/Sao10K/L3-8B-Stheno-v3.1) + [Jiar/Llama-3-8B-Chinese](https://huggingface.co/Jiar/Llama-3-8B-Chinese)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- slices:
-   - sources:
-       - model: "Sao10K/L3-8B-Stheno-v3.1+Jiar/Llama-3-8B-Chinese"
-         layer_range: [0, 22]
-   - sources:
-       - model: "NousResearch/Hermes-2-Theta-Llama-3-8B+camillop/Meta-Llama-3-8B-ORPO-ITA-llama-adapters"
-         layer_range: [10, 22]
-   - sources:
-       - model: "migtissera/Llama-3-8B-Synthia-v3.5+unstoppable123/LLaMA3-8B_chinese_lora_sft_v0.2"
-         layer_range: [0, 22]
-   - sources:
-       - model: "openchat/openchat-3.6-8b-20240522+hfl/llama-3-chinese-8b-instruct-v2-lora"
-         layer_range: [10, 32]
- merge_method: passthrough
- base_model: "gradientai/Llama-3-8B-Instruct-Gradient-1048k"
- dtype: bfloat16
  ```
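For reference, a configuration like the one above is applied with mergekit itself. The snippet below is a minimal sketch, assuming mergekit's Python entry points (`MergeConfiguration`, `run_merge`, `MergeOptions`) and a hypothetical local file `config.yaml` holding the YAML shown; the `mergekit-yaml` command-line tool is the usual alternative.

```python
# Minimal sketch (assumed API): apply the passthrough config above with mergekit.
# "config.yaml" is a hypothetical local copy of the YAML configuration shown in this card.
import torch
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./merged-model",            # output directory (illustrative path)
    options=MergeOptions(
        cuda=torch.cuda.is_available(),   # run tensor ops on GPU when available
        copy_tokenizer=True,              # copy the base model's tokenizer into the output
    ),
)
```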
 
 
 
 
  ---
  base_model:
+ - wwe180/Llama3-18B-lingyang-v1
  library_name: transformers
  tags:
  - mergekit
  - merge
+ - Llama3
+ license:
+ - other
  ---
+
+ # After simple testing, it performs well and is stronger than llama-3-8b!
+
  # merge

  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

+ ## 💻 Usage
+
+ ```python
+ !pip install -qU transformers accelerate
+
+ from transformers import AutoTokenizer
+ import transformers
+ import torch
+
+ model = "Llama3-18B-lingyang-v1"
+ messages = [{"role": "user", "content": "What is a large language model?"}]
+
+ tokenizer = AutoTokenizer.from_pretrained(model)
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model,
+     torch_dtype=torch.float16,
+     device_map="auto",
+ )
  ```
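Note that the snippet above builds the pipeline but never calls it. A self-contained variant that also generates a reply could look like the sketch below; the full repo id `wwe180/Llama3-18B-lingyang-v1` is taken from the card's `base_model` field, and the sampling parameters are illustrative assumptions, not the author's settings.

```python
# Minimal sketch: the card's pipeline plus an actual generation call.
import torch
import transformers
from transformers import AutoTokenizer

model = "wwe180/Llama3-18B-lingyang-v1"  # full repo id assumed from the card's metadata
messages = [{"role": "user", "content": "What is a large language model?"}]

tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Illustrative sampling settings; adjust as needed.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```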
+ ## Statement
+
+ Llama3-18B-lingyang-v1 does not represent the views or positions of the model developers. We will not be liable for any problems arising from the use of the open-source Llama3-18B-lingyang-v1 model, including but not limited to data security issues, public opinion risks, or any risks and problems arising from the model being misled, abused, disseminated, or otherwise misused.